In this page

What is load balancing
How to choose
Application tuning

Quick links

Linksys NSLU2
Load Balancing
A cheap UPS for ALIX
RTC on Capacitor for ALIX
GuruPlug Server Plus
Mirabox vs GuruPlug
Line rate HTTP Server
Elastic Binary Trees


Making applications scalable with Load Balancing

Why this article

As the author of the HAProxy Load Balancer, I'm often questionned about Load Balancing architectures or choices between load balancers. Since there is no obvious response, it's worth enumerating the pros and cons of several solutions, as well as a few traps to avoid. I hope to complete this article soon with a deeper HTTP analysis and with architecture examples. You can contact me for further information. This document is now available as a PDF file :


Originally, the Web was mostly static contents, quickly delivered to a few users which spent most of their time reading between rare clicks. Now, we see real applications which hold users for tens of minutes or hours, with little content to read between clicks and a lot of work performed on the servers. The users often visit the same sites, which they know perfectly and don't spend much time reading. They expect immediate delivery while they inconsciously inflict huge loads on the servers at every single click. This new dynamics has developped a new need for high performance and permanent availability.

What is load balancing ?

Since the power of any server is finite, a web application must be able to run on multiple servers to accept an ever increasing number of users. This is called scaling. Scalability is not really a problem for intranet applications since the number of users has little chances to increase. However, on internet portals, the load continuously increases with the availability of broadband Internet accesses. The site's maintainer has to find ways to spread the load on several servers, either via internal mechanisms included in the application server, via external components, or via architectural redesign. Load balancing is the ability to make several servers participate in the same service and do the same work. As the number of servers grows, the risk of a failure anywhere increases and must be addressed. The ability to maintain unaffected service during any predefined number of simultaneous failures is called high availability. It is often mandatory with load balancing, which is the reason why people often confuse those two concepts. However, certain load balancing techniques do not provide high availability and are dangerous.

Load balancing techniques

1. DNS

The easiest way to perform load balancing is to dedicate servers to predefined groups of users. This is easy on Intranet servers, but not really on internet servers. On common approach relies on DNS roundrobin. If a DNS server has several entries for a given hostname, it will return all of them in a rotating order. This way, various users will see different addresses for the same name and will be able to reach different servers. This is very commonly used for multi-site load balancing, but it requires that the application is not impacted to the lack of server context. For this reason, this is generally used to by search engines, POP servers, or to deliver static contents. This method does not provide any means of availability. It requires additional measures to permanently check the servers status and switch a failed server's IP to another server. For this reason, it is generally used as a complementary solution, not as a primary one.

$ host -t a has address has address has address

$ host -t a has address has address has address

$ host -t a has address has address has address has address has address has address has address has address has address has address has address
Fig.1 : DNS roundrobin : multiple addresses assigned to a same name, returned in a rotating order

[ 1/10 ] >>>