For some of us, stepping onto the scale is a morning ritual. Put on a few pounds, and the scale goes up; drop a few, and the scale goes down. Simple, right? Well, scalability in the cloud is a bit like that. It refers to a system in which every piece of infrastructure can be expanded to handle increased workloads, or it can be reduced if the workload shrinks. This flexibility enables organizations to quickly adapt as they grow or decline, improving efficiency and performance.
Scalability: A Real-World Example
Let’s say you’re a small startup company doing decent, steady business. Then, you get the opportunity to feature your product on the popular television show Shark Tank, which averages about 7 million viewers per episode. Suddenly, your website is getting thousands or millions of hits, with some of those users trying to place orders (this happens more often than you might think). While the exposure is great for business, can your infrastructure handle the influx of traffic? If you have a scalable web application in the cloud, it shouldn’t be a problem. If you don’t, your site could be slow to load (never a good thing; according to Kissmetrics, 40% of visitors will leave a website if the loading process takes more than 3 seconds) or it could crash completely. Now, rather than attracting new customers, you’ve completely turned them off.
Scalability In and Out of the Cloud
Scaling on-premises infrastructure is no easy feat, requiring peak capacity planning and proper hardware and software configuration. It also requires a large asset purchase, usually paid in a lump sum, which depreciates over time. While once considered a necessary evil, this costly capital expense (CAPEX) was often viewed unfavorably by those in the finance department. In addition, should the organization need to scale down, it would be left with an unused and very expensive piece of equipment.
Scalability becomes much easier in a cloud-based infrastructure; in fact, it’s one of the cloud’s greatest strengths and is often cited as one of the main reasons for rapid migration, along with factors such as security and cost benefits. With the cloud, there’s no need to purchase additional servers, or find the real estate to house them; instead, organizations can simply ask their provider to allocate the additional resources needed to expand the cloud environment. This type of transaction is also considered an operational expense (OPEX), similar to an increase in a monthly payment.
Of course, if an organization decides to downsize, closes a branch, or loses some clients, they can simply ask their provider to scale down so that they’re not paying for unused capacity.
Types of Scaling in the Cloud
There are three ways to scale that organizations should be aware of. Oftentimes, this happens behind the scenes with the cloud provider, but it’s a good idea to understand each.
Vertical Scaling (Scaling Up)
This method of scaling entails adding more power to an existing instance. This could take the form of adding more memory (RAM), more powerful processors (CPUs), or faster storage. No code is changed and no new infrastructure is added; however, vertical scaling is finite, and eventually the server limit will be reached. At that point, the application will need to be powered down in order to be resized, so the maintenance and the downtime that comes with it should be planned for in advance.
Horizontal Scaling (Scaling Out)
This method of scaling entails acquiring new infrastructure and connecting it to existing infrastructure so that they work together seamlessly. Complex architectural design is necessary to build an application capable of complete horizontal scaling, and it can be a time-consuming and labor-intensive process. However, when it is possible, organizations can keep adding instances, giving them far more room to grow than any single server allows.
Diagonal Scaling
Not surprisingly, diagonal scaling is a combination of vertical and horizontal scaling. With this method, organizations “get the best of both worlds.” They scale vertically until the existing server has hit capacity, then clone that server to add more resources, and repeat the process as demand grows.
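To make the idea concrete, here is a minimal sketch of diagonal scaling in Python. It is an illustration only, not how any particular provider does it: the `Server` class, the `scale_diagonally` function, and the 8-CPU per-server limit are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical per-server ceiling, chosen purely for illustration.
MAX_CPUS_PER_SERVER = 8


@dataclass
class Server:
    cpus: int


def scale_diagonally(fleet: List[Server], extra_cpus: int) -> List[Server]:
    """Add capacity by scaling up first, then scaling out once a server is full."""
    fleet = [Server(s.cpus) for s in fleet]  # work on a copy of the fleet
    if not fleet:
        fleet.append(Server(cpus=0))
    while extra_cpus > 0:
        newest = fleet[-1]
        headroom = MAX_CPUS_PER_SERVER - newest.cpus
        if headroom > 0:
            # Vertical step: grow the newest server toward its limit.
            added = min(headroom, extra_cpus)
            newest.cpus += added
            extra_cpus -= added
        else:
            # Horizontal step: the newest server is full, so add another.
            # It starts empty and is filled on the next pass of the loop.
            fleet.append(Server(cpus=0))
    return fleet
```

Starting from a single 2-CPU server and asking for 10 more CPUs, this sketch first grows that server to the 8-CPU limit and then places the remaining 4 CPUs on a second server.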
Auto Scaling and Load Balancing
Based on user-defined conditions, auto scaling increases or decreases capacity as needed. This service is offered by many cloud providers and ensures that the proper number of instances is always available to handle an application’s load. It works by having the cloud provider monitor specific thresholds, such as CPU utilization, that trigger the creation of a new instance or the expansion of an existing one.
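A rule of this kind can be sketched in a few lines of Python. This is a simplified illustration, not any provider’s actual API: the function name `desired_instances` and the default thresholds (scale out above 70% CPU, scale in below 30%, between 1 and 10 instances) are assumptions made for the example.

```python
def desired_instances(
    current: int,
    cpu_percent: float,
    scale_out_at: float = 70.0,  # hypothetical upper threshold
    scale_in_at: float = 30.0,   # hypothetical lower threshold
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Return how many instances should run, given average CPU utilization."""
    if cpu_percent > scale_out_at:
        target = current + 1  # load is high: add an instance
    elif cpu_percent < scale_in_at:
        target = current - 1  # load is low: remove an instance
    else:
        target = current      # load is within bounds: no change
    # Never go below the minimum or above the maximum fleet size.
    return max(min_instances, min(max_instances, target))
```

With these defaults, two instances running at 85% CPU would grow to three, while two instances at 10% CPU would shrink to one. The minimum and maximum act as guardrails so that a traffic spike cannot run up an unbounded bill and a quiet period cannot take the site offline.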
Load balancing complements auto scaling. While it doesn’t create or expand instances, a load balancer distributes incoming workloads across servers so that no single server is overloaded while others sit idle. This improves overall availability, helping organizations achieve higher performance levels and potentially lower costs.
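One of the simplest ways to spread requests across servers is round robin, where each new request goes to the next server in the list. The sketch below shows that strategy in Python; the `RoundRobinBalancer` class and the server names are hypothetical, and real load balancers typically add health checks and other strategies on top of this.

```python
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation, one per incoming request."""

    def __init__(self, servers: Iterable[str]) -> None:
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)  # endlessly repeats the server list

    def route(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._rotation)


# Hypothetical server names, used only for illustration.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.route() for _ in range(5)]
```

Here the five requests land on web-1, web-2, web-3, web-1, and web-2 in turn, so no single server takes all of the traffic.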
According to Forbes magazine, “60% of businesses say they’ve set aside funds for additional storage space to help manage the immense amount of data being created every day within their markets.” And it’s no surprise; there are 2.5 quintillion bytes of data created each day, and 90% of it was generated in the last two years alone. With greater amounts of data being generated, the need for organizations to scale up or out quickly will become imperative. Thankfully, cloud computing makes it possible.
Have concerns over your organization’s ability to scale? Speak with the IT experts at DSM, Florida’s predictable cloud provider, today. We can ensure that when it’s time to grow, you’re ready to go.