Load balancing client connections to application servers is crucial for the application experience. Over the years that load balancers have been deployed, a number of algorithms have been invented and optimized to deliver improved performance.
Load Balancing Algorithms
Each load-balancing algorithm delivers the core requirement of managing incoming client requests and routing them to an application instance running on a pool of back-end servers. Applications are diverse, and the optimal method for load-balancing connections to them will depend on the infrastructure in place and the organization's requirements.
Here we outline ten algorithms used to load balance connections to application servers and highlight some use cases for each.
Round Robin load balancing
Round robin is the most straightforward load balancing algorithm. Client requests get distributed to application servers in simple rotation. For example, if you have three application servers: the first client request goes to the first application server in the list, the second client request to the second application server, the third client request to the third application server, the fourth to the first application server, and so on.
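The rotation described above can be sketched in a few lines of Python. This is a minimal illustration, not a load balancer implementation; the server names are hypothetical.

```python
from itertools import cycle

def round_robin(servers):
    """Return an endless iterator that walks the server list in order."""
    return cycle(servers)

# Hypothetical three-server pool.
pool = round_robin(["app1", "app2", "app3"])

# The fourth request wraps around to the first server again.
first_four = [next(pool) for _ in range(4)]
print(first_four)  # ['app1', 'app2', 'app3', 'app1']
```

Note that nothing in the selection looks at server load or health, which is exactly the limitation discussed for this algorithm.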
Round robin load balancing is most appropriate for predictable client requests that get spread across a server pool whose members have relatively equal processing capabilities and available resources (such as network bandwidth and storage).
The simplicity of round robin load balancing is its most significant advantage and disadvantage. As it's so simple, the setup and ongoing management are quick. However, this algorithm doesn't take anything about the state of the application servers or their load into account, which can become problematic very quickly in real-world scenarios.
A use case for round robin load balancing is to test that a newly deployed load balancer and application server pool can communicate, and that basic networking and load balancing capabilities are working.
Weighted Round Robin load balancing
Weighted round robin is similar to the round robin algorithm, but it adds the ability to spread the incoming client requests across the server pool according to the relative capacity of each server. It is most appropriate for spreading incoming client requests across servers with varying capabilities or available resources. The administrator assigns a weight to each application server based on criteria of their choosing that indicates the relative traffic-handling capability of each server in the pool.
The servers assigned a higher weight get allocated a higher percentage of the incoming requests. In a three-server pool, if Server 1 is twice as powerful as Servers 2 and 3, then the weighting could be set like this:
| Server | Weighting | Percentage of Requests Allocated |
| --- | --- | --- |
| 1 | 10 | 50% |
| 2 | 5 | 25% |
| 3 | 5 | 25% |
The scale used for the weighting doesn't matter; only the ratio between the servers' weights does. Weights of 2, 1, and 1 would produce the same 50/25/25 split.
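As a rough sketch of how weights translate into request shares, the Python below builds one rotation in which each server appears in proportion to its weight, using the weights from the table above. The function name and server names are hypothetical, and this naive approach sends back-to-back requests to the heaviest server; a production scheduler would likely interleave them more evenly.

```python
from collections import Counter
from functools import reduce
from itertools import cycle, islice
from math import gcd

def weighted_rotation(weights):
    """Build one rotation where each server appears in proportion to its weight."""
    # Dividing by the greatest common divisor keeps the rotation short.
    divisor = reduce(gcd, weights.values())
    rotation = []
    for server, weight in weights.items():
        rotation.extend([server] * (weight // divisor))
    return rotation

weights = {"server1": 10, "server2": 5, "server3": 5}
rotation = weighted_rotation(weights)

# Replay 100 requests through the rotation and measure each server's share.
counts = Counter(islice(cycle(rotation), 100))
shares = {server: counts[server] / 100 for server in weights}
print(shares)  # {'server1': 0.5, 'server2': 0.25, 'server3': 0.25}

# A different scale with the same proportions yields the same rotation.
same_rotation = weighted_rotation({"server1": 2, "server2": 1, "server3": 1})
```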
A use case for weighted round robin load balancing is to do A/B testing of application changes by deploying updates to a server with a low weighting first to see how the changes perform. Administrators can deploy the application updates to other servers if no issues arise. The weighting is adjusted to control the rollout until all servers are updated, at which point the production weighting can be restored.
Least Connections load balancing
Least connections is a dynamic load balancing algorithm: each client request goes to the application server with the fewest active connections at the moment the request is received. Even when application servers have similar specifications, a static algorithm can overload one server if it happens to receive the longer-lived connections. Because least connections takes the current connection load into account, it steers new requests away from servers that are already busy.
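A minimal sketch of the selection step, assuming the load balancer keeps a per-server count of active connections (the dictionary and server names here are hypothetical):

```python
def pick_least_connections(active):
    """Return the server with the fewest active connections.

    Ties go to whichever server was listed first.
    """
    return min(active, key=active.get)

# Hypothetical snapshot of active connections per server.
active = {"app1": 12, "app2": 3, "app3": 7}

chosen = pick_least_connections(active)
active[chosen] += 1  # the new connection now counts against that server
print(chosen)  # app2
```

A real load balancer would also decrement the count when a connection closes, which is what lets the algorithm react to long-lived connections.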
A use case for least connections load balancing is when incoming requests have varying connection times and the available servers are relatively similar in processing power and resources. If clients can hold connections open for an extended period, a single server can end up with all its capacity consumed by such long-lived connections. Using the least connections algorithm mitigates this risk.
Weighted Least Connections load balancing
Weighted least connections extends the least connections algorithm to account for differing application server characteristics. The administrator assigns a weight to each application server based on relative processing power and available resources. Load balancing decisions are based on both the active connections and the assigned server weights (e.g., if two servers are tied for the lowest number of connections, the server with the higher weight is preferred).
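One common way to combine the two inputs is to compare connections per unit of weight. The sketch below assumes that approach, which is not necessarily how any particular load balancer computes it, and uses hypothetical server names. It also honors the tie-break described above: with equal connection counts, the higher-weight server wins.

```python
def pick_weighted_least_connections(active, weights):
    """Pick the server with the lowest connections-to-weight ratio.

    If two ratios are equal, prefer the server with the higher weight.
    """
    return min(active, key=lambda s: (active[s] / weights[s], -weights[s]))

# Hypothetical mixed pool: "big" has twice the capacity of "small".
weights = {"big": 10, "small": 5}

# "big" has more connections but proportionally less load (0.6 vs 0.8).
busy_pick = pick_weighted_least_connections({"big": 6, "small": 4}, weights)

# Equal connection counts: the higher-weight server is preferred.
tie_pick = pick_weighted_least_connections({"big": 4, "small": 4}, weights)
print(busy_pick, tie_pick)  # big big
```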
Use cases for weighted least connections load balancing will be similar to basic least connections but applicable to server pools where the servers are not all the same. It will provide the same protection against persistent connections going to a single server or the most powerful server in a mixed pool.
Resource Based (Adaptive) load balancing
Resource-based (or adaptive) load balancing makes decisions based on status indicators retrieved by the load balancer from the application servers. The status gets determined by an agent running on each server. LoadMaster queries each server regularly for this status information and then appropriately sets a dynamic weight for each server. This load-balancing method is essentially performing a detailed health check on the application servers.
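To illustrate the idea of turning reported status into a dynamic weight, here is a sketch under stated assumptions: each agent reports a single load figure from 0 to 100, and the weight falls linearly as load rises. The function, the load scale, and the formula are all hypothetical; this is not the LoadMaster agent protocol or its actual weighting calculation.

```python
def dynamic_weights(reported_load, max_weight=100, min_weight=1):
    """Map each server's reported load (0-100, higher is busier) to a weight."""
    weights = {}
    for server, load in reported_load.items():
        clamped = min(max(load, 0), 100)  # guard against out-of-range reports
        weight = round(max_weight * (100 - clamped) / 100)
        # Keep a small floor so a busy server is throttled rather than dropped.
        weights[server] = max(min_weight, weight)
    return weights

# Hypothetical load figures returned by the agents on three servers.
reported = {"app1": 20, "app2": 85, "app3": 100}
weights = dynamic_weights(reported)
print(weights)  # {'app1': 80, 'app2': 15, 'app3': 1}
```

The resulting weights could then feed a weighted algorithm, with the values refreshed each time the servers are polled.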
This method is appropriate when detailed health check information from each server is required to make load-balancing decisions. For example, this method would be useful for any application where the workload varies, and detailed application performance and status are required to assess server health. This method can also provide application-aware health checking for Layer 4 (UDP) services.
Fixed Weighting load balancing
In the fixed weighting load balancing algorithm, a weight gets assigned to each application server based on criteria representing each server's relative traffic-handling capability. The application server with the highest weight will receive all of the traffic. If the application server with the highest weight isn't available to handle more connections, the load balancer will direct all traffic to the next highest-weight application server.
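The failover behavior can be sketched as follows, assuming the load balancer knows which servers are currently able to accept connections. The `available` map and the server names are hypothetical.

```python
def pick_fixed_weight(weights, available):
    """Return the highest-weight server that can currently take traffic."""
    candidates = [s for s in weights if available.get(s, False)]
    if not candidates:
        return None  # no server in the pool can take traffic
    return max(candidates, key=weights.get)

weights = {"primary": 100, "spare1": 50, "spare2": 10}

# Normal operation: all traffic goes to the highest-weight server.
normal = pick_fixed_weight(
    weights, {"primary": True, "spare1": True, "spare2": True}
)

# The primary is unavailable: traffic moves to the next highest weight.
failover = pick_fixed_weight(
    weights, {"primary": False, "spare1": True, "spare2": True}
)
print(normal, failover)  # primary spare1
```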
This method is appropriate for workloads where a single server can handle all expected incoming requests, with one or more "hot spare" servers available to pick up the load should the currently active server fail.
Weighted Response Time load balancing
The weighted response time algorithm uses an application server's response time to calculate a server weight. The application server that is responding the fastest receives the next request. A use case for weighted response time load balancing is where rapid application response time is the paramount concern.
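As an illustration, the sketch below derives a weight from the inverse of each server's mean recent response time and picks the server with the highest weight. The sampling window, the inverse-mean formula, and the server names are assumptions made for this example rather than a description of a specific product.

```python
from statistics import mean

def response_time_weights(samples_ms):
    """Weight each server by the inverse of its mean recent response time."""
    return {server: 1.0 / mean(times) for server, times in samples_ms.items()}

# Hypothetical recent response times, in milliseconds.
samples_ms = {
    "app1": [40, 60, 50],
    "app2": [20, 30, 25],
    "app3": [90, 110, 100],
}

weights = response_time_weights(samples_ms)
fastest = max(weights, key=weights.get)
print(fastest)  # app2
```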
Source IP Hash load balancing
The source IP hash load balancing algorithm uses the client's source and destination IP addresses to generate a hash key that ties the client to a particular server. Because the same key is regenerated when a client reconnects after a dropped session, reconnection requests are directed to the same server used previously. This is called server affinity. This load balancing method is most appropriate when a client must always return to the same server on each successive connection, such as in shopping cart scenarios where items placed in a cart on one server should be there when a user connects later.
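A minimal sketch of the hashing step is shown below. The choice of SHA-256, the key format, and the modulo mapping are assumptions for illustration; the addresses come from the reserved documentation ranges.

```python
import hashlib

def pick_by_ip_hash(source_ip, dest_ip, servers):
    """Map a source/destination address pair to one server in the pool."""
    key = f"{source_ip}|{dest_ip}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the pool.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

servers = ["app1", "app2", "app3"]

# The same address pair always produces the same key, so a reconnecting
# client lands on the same server.
first = pick_by_ip_hash("203.0.113.7", "198.51.100.10", servers)
again = pick_by_ip_hash("203.0.113.7", "198.51.100.10", servers)
print(first == again)  # True
```

With a plain modulo mapping like this one, adding or removing a server changes the mapping for many clients, so affinity only holds while the pool is stable.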
URL Hash load balancing
The URL hash load balancing algorithm is similar to the source IP hashing, except that the hash created is based on the URL in the client request. This ensures that any client requests to a particular URL always go to the same back-end server. A typical use case would be to direct traffic to an optimized media server that can play video or an optimized server for a particular task.
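The same idea applied to URLs might look like the sketch below. Hashing only the path (and ignoring the query string) is an assumption made here so that variants of the same resource stay together; the URLs and server names are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit

def pick_by_url_hash(url, servers):
    """Map a request URL to one server, based on a hash of its path."""
    path = urlsplit(url).path or "/"
    digest = hashlib.sha256(path.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

servers = ["media1", "media2", "web1"]

# Requests for the same path reach the same server, whichever client
# sends them and whatever query string is attached.
plain = pick_by_url_hash("https://example.com/video/intro.mp4", servers)
with_query = pick_by_url_hash("https://example.com/video/intro.mp4?t=10", servers)
print(plain == with_query)  # True
```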
DNS Load Balancing
DNS load balancing is a commonly used technique for load balancing in simple scenarios and also for distributing traffic across multiple data centers, possibly in different geographic regions.
Instead of having dedicated hardware for resolving which server to redirect to, a DNS name server is responsible for managing traffic. The name server has a list of IP addresses corresponding to the various servers where requests can be routed. Every time someone queries for a particular domain, the name server returns this list of IP addresses and changes the order of the addresses. The reordering happens in a round-robin fashion. Thus, every new request gets routed to a different machine in a round-robin order, and the load gets distributed among the servers.
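The reordering behavior can be modeled with a small Python class. This is a toy model of the rotation only, not a DNS server; the class name is hypothetical and the addresses come from the reserved documentation range.

```python
from collections import deque

class RoundRobinNameServer:
    """Toy model of a name server that rotates its address list per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        """Return the current address order, then rotate for the next query."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return answer

ns = RoundRobinNameServer(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
first_answer = ns.resolve()
second_answer = ns.resolve()
print(first_answer[0], second_answer[0])  # 192.0.2.10 192.0.2.11
```

Clients typically try the first address in the answer, so rotating the list spreads successive lookups across the servers.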
Using DNS load balancing via a name server is simple and popular in many deployments. But it doesn't factor in the existing load on the servers, their processing time, their online status, and more. It's basically a form of round robin that doesn't require a dedicated load balancer. Other limitations arise because DNS records are cached on clients and servers, so the rotation of addresses that is meant to spread the load isn't very effective.
Geographic load balancing via DNS across separated data centers is a common and non-trivial use case for this technique. In this scenario, it adds value, whereas using it to avoid deploying a load balancer for local use is a false economy.
Some Frequently Asked Questions about Load Balancing Algorithms
Many Progress Kemp customers ask us what load balancing algorithm to use, and which is most common. We outline some guidance on these questions below. But as we outline, each organization and deployment is unique. So, if you are unsure which LoadMaster algorithm to use, talk to our consultancy team, who can advise you based on their experience from 100,000+ deployments and client engagements.
Which load balancing algorithm is most common?
Unsurprisingly, the simple static Round Robin algorithm is the most common. As it's so simple to set up, it is often used to test whether a load balancer and server pool are communicating. In many simple deployments, after this initial setup, it remains the default until more dynamic algorithms are required. The weighted version of round robin often gets used when the backend servers are not identical, but needs are still simple.
As traffic levels increase and needs get more complex, dynamic algorithms like least connections, adaptive algorithms, and weighted response time are popular. The IP Hash and URL Hash algorithms are common for scenarios where affinity is needed (shopping carts or dedicated media servers).
How to choose the right load balancing algorithm?
Many factors will influence which load balancing algorithm to choose. The base choice should be to deploy dedicated load balancers rather than rely on DNS-based load balancing.
Once that choice is made (irrespective of whether the load balancer is a hardware device, a virtual machine instance, or running in the cloud), the choice of which algorithms to use will depend on the applications being managed. Static or dynamic algorithms may deliver what is needed. A good rule of thumb is to start simple and only move to more dynamic and complex algorithms as needs dictate.
What load balancing algorithm should I use?
Use the algorithm that delivers the needed application experience with the lowest overhead. If the traffic patterns and load on the infrastructure are well known, then using simple algorithms with a properly sized application server pool may be sufficient. If load levels are not predictable, or if available infrastructure dictates a server pool with varying resource levels, then the more dynamic algorithms that take into account the state of the servers and the network would be more appropriate and deliver the best application experience.
When deciding which algorithm to use, remember that the load balancer may host other functionality such as proxy services, TLS/SSL encryption offloading, or Global Server Load Balancing (GSLB). The resources available to the load balancer should be able to support the core load balancing algorithms in use, plus any other services running on that instance.
The Kemp LoadMaster consultancy team is always available to assist you in making the right choices and getting the best user application experience from your available infrastructure.
Maurice McMullin
Maurice McMullin was a Principal Product Marketing Manager at Progress Kemp.