Data Gravity and Its Effect on Cloud Migration and Repatriation

We experience the effects of gravity every day. It’s a force of nature produced by everything in existence, from the Earth and the sun to your car and your cat. Even you (yes, you) produce gravity. But it wasn’t until Sir Isaac Newton (who, as legend has it, was inspired by a falling apple) that gravity was described with real precision. Newton formulated his law of universal gravitation and published it in his Principia in 1687.


The Laws of Gravity

What is gravity? In simple terms, Newton’s law states that every object in the universe attracts every other object; the force of the attraction depends on the masses of both objects (you’re actually being “pulled” toward your computer or mobile device while you’re reading this, but because its mass is so small, you don’t feel a thing). The law also states that the greater the distance between two objects, the weaker the attraction between them. So, the farther an object is from the Earth, for example, the less it weighs. (Orbiting astronauts, incidentally, float not because Earth’s gravity has vanished at that altitude, but because they are in continuous free fall around the planet.)
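
In symbols, Newton’s law of universal gravitation reads:

$$F = G\,\frac{m_1 m_2}{r^2}$$

where F is the attractive force, m_1 and m_2 are the masses of the two objects, r is the distance between them, and G is the gravitational constant. The r^2 in the denominator is why the pull weakens so quickly with distance.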


Data Gravity as a Concept

Building on the laws of gravity (there was a reason for that brief science refresher!), software developer Dave McCrory coined the term “data gravity” in 2010. McCrory’s idea is to treat data as if it were an object with mass. As data accumulates, or builds mass, its gravitational pull increases, and additional services and applications are attracted to it, often making use of it. So is data gravity a good thing or a bad thing? It depends on who you ask. Ultimately, it’s probably a little bit of both.


The Pros of Data Gravity

To drive home the concept of data gravity and highlight its benefits, McCrory created an illustration similar to the one below:

[Illustration: data gravity, with latency and throughput as accelerators]

This illustration demonstrates how network characteristics like latency (the time required to perform an action or produce a result) and throughput (the number of actions executed, or the amount of data moved, per unit of time) act as accelerators, pulling services and applications toward data more strongly as that data grows. And when services and applications sit closer to their data (i.e., with the same cloud services provider), latency drops and throughput rises. It’s a win-win that makes for more useful and reliable services and applications.
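
To make those two terms concrete, here is a minimal back-of-the-envelope sketch in Python. The numbers are assumed and purely illustrative, not measurements from any particular provider; the model is simply “one request costs a round trip plus the payload divided by the link speed.”

# Rough model: time for one request = round-trip latency + payload size / throughput.
def request_time(payload_bytes, latency_s, throughput_bytes_per_s):
    return latency_s + payload_bytes / throughput_bytes_per_s

# Assumed, illustrative numbers: an app in the same cloud region as its data
# (~2 ms, ~10 Gbps) versus an app reaching that data across the public
# internet (~80 ms, ~100 Mbps).
same_region = request_time(10_000_000, latency_s=0.002, throughput_bytes_per_s=1_250_000_000)
cross_internet = request_time(10_000_000, latency_s=0.080, throughput_bytes_per_s=12_500_000)

print(f"10 MB fetch, same region:     {same_region:.3f} s")    # ~0.010 s
print(f"10 MB fetch, across internet: {cross_internet:.3f} s") # ~0.880 s

Under these assumptions, the same 10 MB fetch is nearly two orders of magnitude faster when the application lives next to its data, which is exactly the pull the illustration depicts.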


The Cons of Data Gravity

Back to the physical world: a mobile device produces very little gravity and is easy to move around; we carry ours with us every day. Now, how about moving a mountain? A much greater task! Data mobility follows the same principle. A gigabyte is easy to move; a terabyte is a bit more difficult, yet still manageable. Now think about hundreds of terabytes, or even a petabyte. Suddenly, the task is not so easy. As data accumulates, it becomes “heavier,” making it harder and harder to move from one storage location to another. In addition, the farther data sits from the services that use it, the more difficult it becomes to manage, because latency increases and throughput decreases. This means that at a certain capacity, moving data becomes extremely hard or simply too expensive; the quick calculation below shows why.
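
Here is a minimal sketch of that scaling in Python, assuming a dedicated 1 Gbps link running at full speed (which real-world transfers rarely sustain, so treat these as best-case figures):

# Idealized transfer time: size / link speed. Real transfers are slower due to
# protocol overhead, contention, retries, and verification.
LINK_BYTES_PER_S = 1_000_000_000 / 8  # assumed 1 Gbps link, in bytes per second

for label, size_bytes in [
    ("1 GB", 10**9),
    ("1 TB", 10**12),
    ("100 TB", 10**14),
    ("1 PB", 10**15),
]:
    seconds = size_bytes / LINK_BYTES_PER_S
    print(f"{label:>7}: {seconds / 86_400:8.2f} days ({seconds:,.0f} s)")

At this assumed rate, a gigabyte takes about eight seconds, a terabyte a couple of hours, and a petabyte roughly 93 days of uninterrupted transfer, which is why large migrations are often phased over time or fall back on physically shipping drives.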



How Data Gravity Affects Cloud Migration and Repatriation

Today, we are producing more data than ever before. Each day, over 2.5 quintillion bytes of data are created, and this number is only increasing with the growth of new technologies like the Internet of Things (IoT). In fact, 90% of the world’s data was generated in the last two years alone. This massive amount of data has been a major catalyst for the rapid adoption of the public cloud. Organizations that once comfortably housed their data on-site or in a private cloud simply could not maintain capacity or afford the costs of growing their infrastructure. It’s been a boon for data centers, of course, but also for organizations: in the cloud, they can scale at a moment’s notice without having to purchase new equipment, and they can take advantage of economies of scale to lower their costs.

Unfortunately, housing massive amounts of data with a public cloud provider can create challenges of its own. If an organization decides to move some or all workloads from a public cloud back to its own premises, or to another cloud provider (a process known as cloud repatriation), the sheer size of those workloads can make the move complex. To put it bluntly, data gravity can leave an organization feeling stuck: tied to its provider’s data center and its specific means of storage.
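
Cost is a big part of that stuck feeling, since public cloud providers typically charge for data egress. As a minimal sketch in Python, using an assumed, illustrative per-gigabyte rate rather than any specific provider’s published pricing, here is how repatriation cost scales with data size:

# Assumed, illustrative egress rate; actual per-GB pricing varies by provider,
# region, and volume tier.
EGRESS_RATE_PER_GB = 0.09  # USD, hypothetical

def egress_cost(size_gb, rate=EGRESS_RATE_PER_GB):
    """Estimated cost to move data out of a public cloud over the network."""
    return size_gb * rate

for label, size_gb in [("1 TB", 1_000), ("100 TB", 100_000), ("1 PB", 1_000_000)]:
    print(f"{label:>7}: ~${egress_cost(size_gb):,.0f}")

Under that assumed rate, a terabyte costs on the order of $90 to move out, while a petabyte runs to roughly $90,000 before any transfer time or engineering effort is counted.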


Cloud Migration and Repatriation Through DSM

While data gravity can make cloud migration and repatriation more complex, there are multiple strategies for moving data over time, or at least for getting mission-critical workloads moved as quickly as possible without compromising resiliency, data protection, compliance, or performance. We recently covered these techniques, and the processes for each, on our blog.

Want to speak with an IT expert about moving your data and the intricacies involved in migrating to the cloud or repatriating from a public cloud? Contact DSM, the predictable cloud provider, to discuss your unique situation.

