

Every popular service faces the same problem: one server isn't enough. It can only handle so many requests before it slows down, falls over, or catches fire (metaphorically, usually).

A load balancer is the solution—and also an admission. It says: no single server can do this alone. Let's distribute the work.

What a Load Balancer Actually Does

A load balancer sits between users and your servers. Users connect to the load balancer's address. The load balancer forwards each request to one of many backend servers, spreading the work across all of them.

From the user's perspective, they're talking to one service. Behind the scenes, dozens or hundreds of servers share the load. The user doesn't know. The user doesn't care. That's the point.
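
The forwarding idea can be sketched in a few lines of Python. This is a conceptual model, not a real proxy: the backend addresses are hypothetical, and instead of opening connections we just record which server each request would go to.

```python
import itertools

# Hypothetical backend pool -- in reality these would be real server addresses.
backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# The balancer cycles through the pool; each incoming request is handed
# to the next server. Users only ever see the balancer's own address.
pool = itertools.cycle(backends)

def handle_request(request: str) -> str:
    server = next(pool)
    # A real balancer would proxy the request to `server` over the network;
    # here we just report the routing decision.
    return f"{request} -> {server}"

print(handle_request("GET /"))  # -> GET / -> 10.0.0.1
print(handle_request("GET /"))  # -> GET / -> 10.0.0.2
```

Three requests, three different servers, and the caller never knows the difference.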

This enables horizontal scaling—adding more servers when you need more capacity. The alternative is vertical scaling (bigger, more powerful servers), which hits physical limits, and a server twice as powerful usually costs far more than twice as much. With load balancing, handling 10x the traffic means adding more servers, not buying a supercomputer.

Why This Matters

Scalability is the obvious benefit. Traffic spikes? Add servers. Black Friday? Add servers. Viral moment? Add servers. The load balancer distributes work across whatever you give it.

High availability is the subtler benefit. Servers fail. Hardware dies. Software crashes. When one server behind a load balancer fails, the load balancer routes around it. Users never notice. The remaining servers handle the traffic while you fix (or replace) the failed one.

Here's the paradox: by adding a component (the load balancer), you've reduced single points of failure. A single server is itself a single point of failure—and not a good one. A load balancer in front of many servers means any individual server can fail without taking down the service.

Performance improves because work distributes evenly. Without load balancing, some servers sit idle while others drown. With it, everyone shares the burden.

Maintenance becomes possible without downtime. Need to update a server? Pull it from rotation, update it, put it back. Users experience nothing.

How the Load Balancer Decides

When a request arrives, the load balancer must choose: which server gets this one?

Health checks are continuous. The load balancer pings each server regularly—sometimes a simple "are you alive?", sometimes a full HTTP request to a health endpoint. Servers that stop responding get pulled from rotation automatically. When they recover, they're added back. No human intervention required.
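
One pass of that check loop might look like the sketch below. The `check` callable is a stand-in for whatever probe you actually run (a TCP connect, an HTTP GET to a health endpoint); everything here is illustrative.

```python
def refresh_rotation(backends, check):
    """Return the subset of backends that pass their health check."""
    healthy = []
    for server in backends:
        try:
            if check(server):
                healthy.append(server)
        except Exception:
            # A timeout or connection error counts as unhealthy; the server
            # is simply left out of rotation until it recovers.
            pass
    return healthy

# Simulated probe results: "b" is currently down.
status = {"a": True, "b": False, "c": True}
print(refresh_rotation(["a", "b", "c"], lambda s: status[s]))  # -> ['a', 'c']
```

Run the loop on a timer and the rotation heals itself: when "b" starts passing its check again, the next pass adds it back.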

Distribution algorithms determine the pattern:

  • Round-robin sends requests to servers in sequence: A, B, C, A, B, C...
  • Least connections sends to whichever server currently has the fewest active requests
  • Weighted distribution sends more traffic to more powerful servers
  • Random is exactly what it sounds like (and works surprisingly well)

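Two of these algorithms fit in a few lines each. The sketch below assumes the balancer tracks an in-flight request count per server (for least connections) and a configured weight per server (for weighted distribution); the server names and numbers are made up.

```python
import random

# Hypothetical in-flight request counts, updated as requests start/finish.
active = {"a": 3, "b": 1, "c": 2}

def least_connections(counts):
    """Pick the backend with the fewest active requests."""
    return min(counts, key=counts.get)

# Hypothetical capacity weights: "big" should get ~3x the traffic of "small".
weights = {"big": 3, "small": 1}

def weighted_pick(weights):
    """Pick a backend with probability proportional to its weight."""
    servers = list(weights)
    return random.choices(servers, weights=[weights[s] for s in servers])[0]

print(least_connections(active))  # -> 'b'
```

Round-robin ignores how busy each server is; least connections adapts to it, which helps when requests vary wildly in cost.
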
Session persistence (or "sticky sessions") keeps a user talking to the same server across multiple requests. This matters when servers hold state—shopping carts, login sessions, in-progress forms. The load balancer uses cookies or IP addresses to route returning users to their original server.
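
IP-based stickiness is often just a hash. The sketch below maps each client IP to a server deterministically; the server names are hypothetical. One caveat worth knowing: with this naive scheme, adding or removing a server reshuffles most mappings, which is why real balancers often use cookies or consistent hashing instead.

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend names

def sticky_route(client_ip: str) -> str:
    """Map a client IP to a server deterministically, so the same user
    lands on the same backend on every request."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

# The same IP always maps to the same server:
assert sticky_route("203.0.113.7") == sticky_route("203.0.113.7")
```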

Layer 4 vs. Layer 7

Load balancers operate at different levels of the network stack:

Layer 4 load balancers work with IP addresses and TCP/UDP ports. They're fast because they don't inspect the actual content—they just forward packets. But they can't make decisions based on what's in the request.

Layer 7 load balancers understand HTTP. They can route based on URL paths (/api goes to these servers, /images goes to those), HTTP headers, cookies, or request content. More powerful, but more processing overhead.
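
Path-based Layer 7 routing reduces to a prefix lookup. The routes, pool names, and paths below are illustrative, not a real configuration format.

```python
# Map URL path prefixes to backend pools; unmatched paths fall through
# to the default web pool.
routes = {
    "/api": ["api-1", "api-2"],
    "/images": ["img-1"],
}
default_pool = ["web-1", "web-2"]

def route(path: str):
    """Return the backend pool whose prefix matches the request path."""
    for prefix, pool in routes.items():
        if path.startswith(prefix):
            return pool
    return default_pool

print(route("/api/users"))  # -> ['api-1', 'api-2']
print(route("/about"))      # -> ['web-1', 'web-2']
```

A Layer 4 balancer never sees the path at all—it forwards packets based on address and port alone, which is exactly why it's faster and dumber.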

Most modern web applications use Layer 7 load balancing for the routing flexibility. Layer 4 is common for non-HTTP protocols or when raw performance matters most.

The Limitations

Load balancers solve distribution, not every scaling problem.

Stateful applications fight against load balancing. If your application stores user state on individual servers, you need session persistence—which means some servers end up with more load than others, defeating the purpose. The real solution is making your application stateless, storing session data in shared storage like Redis.

Database writes don't load balance well. You can distribute read queries across many database replicas. But writes typically must go to a single primary database to maintain consistency. This is a different problem requiring different solutions.

The load balancer itself can become a bottleneck or single point of failure. Production systems run multiple load balancers with their own failover. It's load balancers all the way down—until you hit DNS-based distribution at the top.

Common Patterns

Web applications put load balancers in front of web servers, distributing HTTP requests across the fleet.

Database read scaling uses load balancers to spread read queries across replicas while directing writes to the primary.
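
The read/write split is a routing decision, sketched below with made-up database names and a deliberately crude query classifier (real proxies parse SQL properly or rely on the application to choose a connection).

```python
import itertools

PRIMARY = "db-primary"                                  # all writes go here
replicas = itertools.cycle(["db-replica-1", "db-replica-2"])  # reads rotate

def pick_database(query: str) -> str:
    """Route SELECTs to a replica, everything else to the primary."""
    # Crude classification: anything that isn't a SELECT counts as a write.
    if query.lstrip().upper().startswith("SELECT"):
        return next(replicas)
    return PRIMARY

print(pick_database("SELECT * FROM users"))       # a replica
print(pick_database("UPDATE users SET plan='x'")) # -> db-primary
```

This is also where replication lag bites: a read routed to a replica may not yet see a write that just landed on the primary.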

SSL/TLS termination happens at the load balancer. It handles the encryption handshake and forwards decrypted traffic to backend servers, usually over a trusted internal network. This offloads cryptographic work and simplifies certificate management.

Microservices use load balancing everywhere. Each service runs multiple instances behind its own load balancer. Service-to-service communication goes through load balancers. The architecture is load balancers talking to load balancers.

Implementation Options

Software load balancers like Nginx and HAProxy run on standard servers. Flexible, cost-effective, and powerful enough for most workloads.

Cloud load balancers (AWS ELB, Google Cloud Load Balancing, Azure Load Balancer) are managed services. They handle scaling, redundancy, and configuration automatically. More expensive per request, but less operational burden.

Hardware load balancers (F5, Citrix) are dedicated appliances for extreme performance requirements. Expensive, but they can handle traffic volumes that would require many software load balancers.

For most applications starting out, a cloud load balancer is the right choice—it scales with you and requires minimal expertise. As you grow, the economics might favor software load balancers you control.

