There are two fundamental methods for scaling server infrastructure. It can be scaled up, or it can be scaled out. In the former, sites are progressively moved from lower-spec servers to higher-spec ones. Typically a site or group of sites might move from shared hosting to a low-powered dedicated server and then to successively more powerful servers. Scaling up is also frequently referred to as vertical scaling.
The second method, scaling out or horizontal scaling, is different. Imagine a group of sites hosted on a relatively low-powered server. As the sites grow in popularity and their traffic increases, eventually the server will reach peak capacity. Load beyond this point will cause the server to falter: usually it’s the RAM that gives out first. The server’s worker processes will multiply to the point at which there is no memory left: the server will start queueing requests and swapping memory out to disk. This is a bad situation for a server to be in because it significantly degrades site performance.
If the site is set up for horizontal scaling, this situation need not arise. Instead of moving the whole shebang to another server, we can simply duplicate the server we have and put both the old and new server behind a load balancer. The load balancer receives requests and decides which of the two servers should respond, apportioning load as appropriate.
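The simplest way a load balancer can apportion requests is round-robin: hand each incoming request to the next server in the pool, cycling back to the first. A minimal sketch, assuming just our two servers (the names are illustrative, not any particular product’s API):

```python
from itertools import cycle

class RoundRobinBalancer:
    """A toy round-robin load balancer: requests go to each server in turn."""

    def __init__(self, servers):
        # cycle() yields the servers endlessly, in order.
        self._pool = cycle(servers)

    def pick(self):
        # Each call returns the next server in rotation,
        # spreading load evenly across the pool.
        return next(self._pool)

balancer = RoundRobinBalancer(["old-server", "new-server"])
targets = [balancer.pick() for _ in range(4)]
# Requests alternate: old-server, new-server, old-server, new-server
```

Real load balancers offer smarter strategies too, such as sending each request to the server with the fewest active connections, but round-robin captures the core idea.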
You can probably see the obvious benefit of horizontal scaling: as load increases, we can simply add another server, then another, and so on. Server clusters aren’t quite that simple to manage, but it’s a lot more economical and efficient to scale horizontally than it is to scale vertically.
If you’re familiar with the cloud, you’ll already have realized that cloud servers are the perfect components in a load-balanced cluster. The cloud is inherently scalable. If a cluster is under heavy load, it’s almost as simple as pressing a button to spin up another server behind the load balancer. Even better, it can be automated: server monitoring technology can keep an eye on load and performance, automatically launching another cloud server when the time is right. This is the basic model used by a company like Netflix, which relies heavily on rapidly scalable infrastructure.
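The decision logic behind that kind of automation can be surprisingly simple. A sketch of a threshold-based scaling rule, assuming a monitoring system reports average load per server as a fraction of capacity (the function and thresholds here are hypothetical, not a real cloud provider’s API):

```python
def desired_server_count(current, avg_load, scale_up_at=0.8, scale_down_at=0.3):
    """Decide cluster size from average per-server load (0.0 to 1.0).

    Above the upper threshold, add a server behind the balancer;
    below the lower threshold, retire one (never dropping below a
    single server). Otherwise leave the cluster alone.
    """
    if avg_load > scale_up_at:
        return current + 1  # traffic spike: spin up another cloud server
    if avg_load < scale_down_at and current > 1:
        return current - 1  # traffic has dropped: stop paying for spare capacity
    return current
```

Production autoscalers add refinements such as cooldown periods (so the cluster doesn’t thrash between sizes) and scaling by more than one server at a time, but the monitor-compare-act loop is the same.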
And of course, load-balanced cloud servers have another major advantage: users pay only for the resources they use, which means that their costs track their traffic and hence their revenues — if a site isn’t attracting traffic, the site owners aren’t paying for unneeded capacity.
Cloud-based load-balanced clusters provide hosting clients with a smooth scaling path so that their infrastructure can grow without interruption as audiences increase, mitigating the risk of becoming overloaded.