DNS load balancing is a hack—an elegant one. You're repurposing a system designed to answer "where is this?" into one that answers "where should this go?" Every limitation you'll encounter traces back to this distinction.
The Domain Name System maps names to addresses. It wasn't built to make routing decisions. When you configure multiple IP addresses for a single domain, DNS returns them all. Clients pick one. Traffic spreads across your servers. It works, the way a hammer can drive a screw. Sometimes that's exactly what the job calls for, but you should understand what you're asking the tool to do.
The Mechanism
DNS load balancing works by configuring multiple A records (or AAAA for IPv6) for a single domain. When clients query, the DNS server responds with multiple IP addresses. The client connects to one—typically the first. By varying which address appears first, DNS distributes traffic across your server pool.
This happens transparently. No additional hardware sits between users and servers. The DNS layer you already have becomes your load balancer.
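You can watch this from any client with Python's standard library. A minimal sketch, using example.com as a stand-in for a domain with multiple A records:

```python
import socket

# Ask the resolver for every A record published for the name.
# Constraining proto to TCP yields one entry per address.
records = socket.getaddrinfo("example.com", 80, proto=socket.IPPROTO_TCP)

for family, type_, proto, canonname, sockaddr in records:
    print(sockaddr[0])  # one candidate server address per record

# Most clients simply connect to the first address in the list,
# which is why response ordering drives traffic distribution.
```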
The critical constraint emerges immediately: caching. Recursive resolvers and client systems cache responses according to Time-to-Live values—typically minutes to hours. During this window, all requests from that client flow to the same server regardless of load or availability.
The caching that makes DNS fast is the same thing that makes DNS load balancing crude. You can't escape this tradeoff. You can only choose where on the spectrum you want to sit.
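To see the TTL a name is actually served with, the third-party dnspython library exposes it directly; a quick sketch:

```python
import dns.resolver  # third-party: dnspython (pip install dnspython)

# Inspect the TTL attached to the answer. Resolvers may cache the
# response for up to this many seconds, pinning clients to whichever
# addresses they received in that window.
answer = dns.resolver.resolve("example.com", "A")
print(f"TTL: {answer.rrset.ttl} seconds")
for record in answer:
    print(record.address)
```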
Round-Robin Distribution
Round-robin is the simplest technique. Your authoritative server maintains multiple IP addresses and rotates their order. One client receives 192.0.2.1, 192.0.2.2, 192.0.2.3. The next receives 192.0.2.2, 192.0.2.3, 192.0.2.1. Since most clients connect to the first address, traffic spreads across servers over time.
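A toy model of the rotation makes the mechanism concrete. Real authoritative servers implement this internally; the addresses below match the example above:

```python
from collections import deque

# Illustrative server pool held by the authoritative server.
pool = deque(["192.0.2.1", "192.0.2.2", "192.0.2.3"])

def respond():
    """Return the current ordering, then rotate for the next query."""
    answer = list(pool)
    pool.rotate(-1)  # move the head address to the back
    return answer

for _ in range(3):
    print(respond())
# ['192.0.2.1', '192.0.2.2', '192.0.2.3']
# ['192.0.2.2', '192.0.2.3', '192.0.2.1']
# ['192.0.2.3', '192.0.2.1', '192.0.2.2']
```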
The appeal is radical simplicity. Create multiple DNS records. Done.
The limitations are equally stark. Round-robin knows nothing about your servers. A machine with twice the capacity of its peers gets the same traffic share as one struggling under load. A server in Sydney receives connections from Stockholm while a Swedish server idles. When a server fails, DNS keeps directing traffic to it; nothing in the protocol tells DNS the server is dead. The record keeps resolving until someone notices and removes it.
TTL values create an impossible choice. Low TTLs (60 seconds or less) reduce caching, but they multiply query volume and add resolution latency to new connections. High TTLs improve lookup performance but make distribution uneven and prolong outages. No setting solves both problems.
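A back-of-envelope sketch shows the shape of the tradeoff. The client count is invented for illustration:

```python
# Rough view of the TTL tradeoff (numbers are illustrative).
clients = 10_000  # active clients resolving the name

for ttl_seconds in (30, 300, 3600):
    # Each client re-resolves roughly once per TTL window.
    queries_per_hour = clients * (3600 / ttl_seconds)
    print(f"TTL {ttl_seconds:>4}s: ~{queries_per_hour:>9,.0f} queries/hour; "
          f"a dead server keeps receiving traffic for up to {ttl_seconds}s")
```

Cutting the failure window by a factor of ten multiplies query volume by the same factor. You pick a point on the line; you don't escape it.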
Weighted Records
Weighted DNS adds proportional distribution. Instead of treating all addresses equally, you assign weights that control how often each appears.
Server A with weight 100, Server B with weight 50, Server C with weight 25: DNS returns A's address twice as often as B's, four times as often as C's. Traffic distribution can reflect actual server capacity.
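You can check the arithmetic with a quick simulation of weighted selection over those same weights:

```python
import random

# Simulate weighted response selection (weights from the example above).
servers = ["A", "B", "C"]
weights = [100, 50, 25]

picks = random.choices(servers, weights=weights, k=70_000)
for server in servers:
    share = picks.count(server) / len(picks)
    print(f"Server {server}: {share:.1%} of responses")
# Roughly 57% / 29% / 14%: a 4:2:1 split matching the weights.
```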
Cloud providers—Route 53, Cloudflare, Google Cloud DNS—support weighted routing natively. This proves valuable during migrations: gradually shift traffic from old infrastructure to new by adjusting weights. No abrupt cutover.
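On Route 53, one migration step might look roughly like the sketch below, using the boto3 SDK's change_resource_record_sets call. The hosted zone ID, record name, and addresses are placeholders:

```python
import boto3  # third-party AWS SDK

route53 = boto3.client("route53")

def set_weight(name, set_id, address, weight, zone_id):
    """UPSERT one weighted A record; adjusting weights shifts traffic gradually."""
    route53.change_resource_record_sets(
        HostedZoneId=zone_id,
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": "A",
                    "SetIdentifier": set_id,  # distinguishes weighted records
                    "Weight": weight,         # 0-255; relative share of responses
                    "TTL": 60,
                    "ResourceRecords": [{"Value": address}],
                },
            }]
        },
    )

# Migration step: shift 25% of traffic to the new stack.
# "Z0000000EXAMPLE" and both addresses are placeholders.
set_weight("www.example.com", "old-stack", "192.0.2.1", 75, "Z0000000EXAMPLE")
set_weight("www.example.com", "new-stack", "198.51.100.1", 25, "Z0000000EXAMPLE")
```

Run the second call again with progressively higher weights and the cutover happens in increments you control.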
But weighted DNS still operates blind. It distributes according to static configuration regardless of what's actually happening. A server might be overwhelmed or offline—DNS continues sending its allocated share.
Health-Check Failover
Health-check DNS transforms static configuration into basic monitoring. The DNS provider probes your servers—typically every 10-30 seconds—and removes unhealthy ones from responses automatically.
When a server fails consecutive checks, it disappears from DNS responses. When it recovers, it returns. New clients receive only working addresses. No manual intervention.
Implementations range from basic TCP connectivity checks to HTTP response validation, content verification, and custom health scripts. Geographic failover extends this: maintain primary servers in one region, backups in another. If European servers fail, traffic shifts to American servers automatically.
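The simplest of these probes, a TCP connectivity check, fits in a few lines. A sketch of what the provider runs on your behalf every 10-30 seconds, with an illustrative pool:

```python
import socket

SERVERS = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]  # illustrative pool

def tcp_healthy(address, port=80, timeout=2.0):
    """Basic TCP connectivity check: can we open a connection at all?"""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False

# Only healthy servers stay in the answer set; the provider re-runs
# this loop continuously and updates DNS records automatically.
healthy = [addr for addr in SERVERS if tcp_healthy(addr)]
print("Addresses returned to new clients:", healthy)
```

HTTP response validation and content verification layer on top of the same loop; only the probe changes.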
Yet the fundamental constraint remains. When a server fails, clients holding cached responses keep trying to connect, and keep failing, until their cached records expire. Health checking detects failures in seconds; clients experience them for minutes.
DNS vs. Dedicated Load Balancers
The difference is architectural, not just a matter of features.
Dedicated load balancers—hardware like F5 BIG-IP, software like HAProxy or NGINX—sit between clients and servers. They receive every connection and decide in real-time which backend handles it. They see actual traffic, know actual load, detect failures instantly, reroute immediately.
DNS operates one layer removed. It influences where clients go but never sees the traffic itself. It points; it doesn't direct.
DNS load balancing:
- Zero additional infrastructure
- Massive scale (the global DNS infrastructure already handles billions of queries daily)
- No single point of failure beyond your DNS provider
- No added latency
- Coarse-grained distribution on TTL timescales
- Blind to actual server load
- No session awareness
- Failover in minutes, not milliseconds
Dedicated load balancers:
- Additional infrastructure to deploy and maintain
- A new critical-path component and potential failure point
- Real-time load awareness
- Session persistence for stateful applications
- Sub-second failover
- SSL termination, header manipulation, URL routing
- Traffic inspection and security capabilities
Neither is universally better. They solve the same problem through fundamentally different mechanisms.
When DNS Load Balancing Works
Geographic distribution. DNS excels here. Geolocation-aware routing sends European clients to European servers, Asian clients to Asian servers. The DNS query itself carries location information.
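Real geo-DNS consults a GeoIP database, often refined by the EDNS Client Subnet extension. A toy version of the routing decision, with invented prefixes and pools, looks something like this:

```python
import ipaddress

# Toy geolocation table; a real implementation uses a GeoIP database.
# All prefixes and pool addresses here are illustrative.
REGION_PREFIXES = {
    "eu": ipaddress.ip_network("192.0.2.0/25"),
    "us": ipaddress.ip_network("192.0.2.128/25"),
}
REGION_POOLS = {
    "eu": ["198.51.100.1"],   # European servers
    "us": ["203.0.113.1"],    # American servers
}

def route(resolver_ip):
    """Pick a regional server pool based on where the query came from."""
    addr = ipaddress.ip_address(resolver_ip)
    for region, prefix in REGION_PREFIXES.items():
        if addr in prefix:
            return REGION_POOLS[region]
    return REGION_POOLS["us"]  # fallback region

print(route("192.0.2.10"))   # -> ['198.51.100.1'] (routed to the EU pool)
```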
Stateless services. If your application maintains no server-side state, DNS's inability to guarantee consistent server selection becomes irrelevant. Static content, stateless APIs, read-heavy workloads distribute effectively.
Budget constraints. DNS load balancing costs nothing beyond existing DNS service. Basic redundancy without infrastructure investment.
First-tier distribution. Use DNS to spread traffic across regions, then deploy dedicated load balancers within each location. Global scale from DNS, fine-grained control locally.
When DNS Load Balancing Fails
Session affinity requirements. If users must stay pinned to a specific server (shopping carts, authenticated sessions, WebSocket connections), DNS's unpredictable routing breaks the experience.
Sub-second failover needs. Even with 60-second TTLs and instant health checking, a failure can affect cached clients for a full minute. Mission-critical systems need dedicated load balancers.
Bursty traffic patterns. DNS distributes based on query volume, not request volume: a single cached lookup can front thousands of requests, so bursts overwhelm some servers while others idle.
Advanced traffic management. SSL termination, header manipulation, URL routing, Web Application Firewalls—none exist at the DNS layer. DNS resolves names to addresses. That's all it does.
The Honest Assessment
DNS load balancing returns multiple IP addresses and lets clients pick one. Simple, free, massively scalable. Also crude, blind to actual conditions, and responsive to failures only on TTL timescales rather than in real time.
The caching that makes the Internet fast limits DNS load balancing's responsiveness. You can't escape this. You can only choose TTL values that balance distribution granularity against lookup overhead.
For geographic distribution and stateless services, DNS load balancing is often exactly right. For session-dependent applications and mission-critical failover, it's fundamentally inadequate. Most production environments use it as one layer—global distribution via DNS, local intelligence via dedicated load balancers.
DNS is a name resolution system pressed into service as a traffic director. It works like a hack—elegantly, within limits, and with tradeoffs that emerge directly from the repurposing.