Latency is the gap between when you act and when something happens.
Click a link. In that fraction of a second before the page starts loading, your request is racing through cables, bouncing between routers, crossing oceans. Latency is how long that race takes.
It's not about how much data you can send. It's about how long you have to wait.
Why Milliseconds Matter
Humans begin to notice delays of around 100 milliseconds. At 300 milliseconds, something feels "off." At one second, the illusion of immediacy breaks completely—you become aware that you're waiting.
This isn't just about perception. Amazon found that every 100ms of latency cost them 1% in sales. Google found that an extra 500ms in search page load time dropped traffic by 20%. The correlation between latency and user behavior is brutally direct.
For real-time applications, the stakes are even higher. A video call with 500ms of latency turns into an exercise in awkward interruption. An online game with high latency makes you feel like you're controlling a puppet through molasses. Latency isn't a technical detail—it's the difference between an experience that feels alive and one that feels broken.
Where Latency Comes From
Latency stacks. Multiple delays accumulate into the total time you experience:
The speed of light. This is the floor you cannot break through. Light travels at about 200,000 kilometers per second through fiber optic cable. New York to London is 5,500 kilometers of cable. Do the math: 28 milliseconds minimum, just for light to make the crossing. That's a physical law. You can't optimize it away. You can only move closer.
Every hop along the path. Your data doesn't travel in a straight line. It bounces through routers, switches, firewalls—each one examining the packet, deciding where to send it next, adding its own small delay. A typical Internet path might have 15-20 hops. Each one adds time.
Waiting in line. When a router receives more traffic than it can immediately forward, packets queue up. Under heavy load, this queuing delay can dwarf everything else. It's also the most variable component—latency that was 20ms in the morning might be 200ms during peak hours.
Transmission time. Pushing bits onto a wire takes time. On modern fiber, this is negligible for small packets. On a congested mobile connection, it starts to matter.
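Those delays simply add. Below is a back-of-the-envelope model in Python; every number in it is an illustrative assumption, not a measurement of any real path:

```python
# Toy model: one-way latency = propagation + per-hop processing
# + queuing + transmission. All inputs are illustrative guesses.

FIBER_KM_PER_MS = 200  # light in fiber covers ~200,000 km/s, about 2/3 of c

def one_way_latency_ms(distance_km, hops, per_hop_ms, queuing_ms,
                       packet_bytes, bandwidth_mbps):
    propagation = distance_km / FIBER_KM_PER_MS
    processing = hops * per_hop_ms
    transmission = packet_bytes * 8 / (bandwidth_mbps * 1000)  # bits / (bits per ms)
    return propagation + processing + queuing_ms + transmission

# New York -> London on a quiet network: ~28 ms of physics plus a few ms of hops
ms = one_way_latency_ms(5_500, hops=18, per_hop_ms=0.2, queuing_ms=2,
                        packet_bytes=1_500, bandwidth_mbps=100)
print(f"{ms:.1f} ms one way")  # ~33 ms before congestion gets involved
```

Change the distance and nothing else, and you can see why geography dominates: the propagation term swamps everything except queuing under heavy load.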
Measuring the Gap
The standard measurement is Round-Trip Time (RTT): how long it takes for a packet to reach a destination and return. When you run `ping google.com` and see `time=15ms`, that's RTT. The one-way delay is roughly half, though the forward and return paths aren't always identical.
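You don't need ping to get a feel for your own RTT. Crafting ICMP packets usually requires raw-socket privileges, so a common approximation is to time a TCP handshake, which costs roughly one round trip. A minimal sketch in Python (the host and port are just examples):

```python
import socket
import time

def tcp_rtt_ms(host, port=443, timeout=3.0):
    # connect() returns after the TCP handshake, i.e. after about one
    # round trip, so timing it is a rough stand-in for ping.
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000

samples = [tcp_rtt_ms("google.com") for _ in range(5)]
print(f"RTT ~ {min(samples):.1f} ms (best of 5)")
```

Each call also pays for a DNS lookup unless the resolver has the name cached, which is one reason to take the best of several samples.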
For web applications, "time to first byte" (TTFB) matters more—how long from sending a request until the first byte of response arrives. This captures network latency plus server processing time. It's the moment of truth: the point where waiting ends and receiving begins.
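Different tools draw the starting line for TTFB in slightly different places (some exclude connection setup). The sketch below, using only Python's standard library, starts the clock before the connection is opened and stops it when the response headers arrive, which is a close approximation of "first byte"; the host is just an example:

```python
import http.client
import time

def ttfb_ms(host, path="/"):
    # Covers TCP + TLS setup, network latency, and server think time.
    start = time.perf_counter()
    conn = http.client.HTTPSConnection(host, timeout=10)
    conn.request("GET", path)
    resp = conn.getresponse()   # returns once the status line and headers arrive
    elapsed_ms = (time.perf_counter() - start) * 1000
    resp.read()
    conn.close()
    return elapsed_ms

print(f"TTFB: {ttfb_ms('example.com'):.0f} ms")
```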
The Geography of Delay
Latency varies enormously based on where you are and what you're connecting to:
Same data center: Less than 1ms. Fast enough that you can almost pretend the network doesn't exist.
Same city: 10-20ms. Still feels instant to humans.
Cross-country: 40-80ms. Starting to be perceptible in interactive applications.
Cross-ocean: 100-150ms. The speed of light becomes the dominant factor. You cannot make New York to Tokyo faster than physics allows.
Satellite Internet: 500-700ms. Geostationary satellites orbit 36,000 kilometers up. The round trip to space and back creates unavoidable half-second delays. This is why satellite users struggle with video calls (the arithmetic is sketched just after this list).
Mobile networks: 4G adds 30-50ms of latency from radio transmission and cellular architecture. 5G can achieve 10-20ms under ideal conditions—a genuine leap forward.
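The satellite figure above isn't arbitrary; it falls straight out of the orbit height. A quick sanity check, treating the radio signal as traveling at roughly the speed of light in a vacuum:

```python
# A request goes ground -> satellite -> ground, and the response makes the
# same trip back: four ~36,000 km legs before any terrestrial routing.
ORBIT_KM = 36_000
LIGHT_KM_PER_S = 300_000

round_trip_ms = 4 * ORBIT_KM / LIGHT_KM_PER_S * 1000
print(f"~{round_trip_ms:.0f} ms of pure propagation delay")  # ~480 ms
```

Stack routing, queuing, and the terrestrial legs on top of that and the 500-700ms range follows.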
Latency vs. Bandwidth: Different Problems
Bandwidth is capacity—how many bytes per second can flow through a connection. Latency is delay—how long each byte takes to arrive.
High bandwidth with high latency is like a massive cargo ship: you can send enormous amounts, but it takes forever to arrive. Low latency with low bandwidth is like a motorcycle courier: fast delivery, but only for small packages.
For most modern web applications, latency dominates. Loading a page often requires dozens or even hundreds of small requests. If each request takes 200ms round-trip, the page takes seconds to become usable—even on a gigabit connection. You're not bottlenecked on how much you can send. You're bottlenecked on how long each send-receive cycle takes.
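A toy calculation makes the point. The numbers below are assumptions chosen for illustration, but the shape of the result holds for any page built from many small sequential fetches:

```python
RTT_MS = 200            # round trip per request (a cross-ocean path)
BANDWIDTH_MBPS = 1000   # a gigabit connection
REQUESTS = 30           # small resources fetched one after another
SIZE_KB = 20            # size of each resource

transfer_ms = SIZE_KB * 8 / BANDWIDTH_MBPS            # kilobits / (kilobits per ms)
total_s = REQUESTS * (RTT_MS + transfer_ms) / 1000

print(f"transfer time per request: {transfer_ms:.2f} ms")  # ~0.16 ms
print(f"total load time: {total_s:.1f} s")                 # ~6.0 s, nearly all of it RTT
```

Double the bandwidth and almost nothing changes; halve the number of round trips and the load time halves.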
The Latency Tax
Every network request pays a tax measured in round-trip time. This tax is fixed per request, regardless of how much data you're transferring.
If your latency is 100ms and you need to make 10 sequential requests to load a page, you've spent one full second on latency alone—before a single byte of content arrives. This is the latency tax: unavoidable delay that accumulates with every round trip.
Modern web performance engineering is largely about reducing how often you pay this tax:
- Bundling combines many small files into fewer large ones—fewer requests, fewer taxes paid.
- HTTP/2 and HTTP/3 allow multiple requests over a single connection, so you're not waiting for each one to complete before starting the next.
- Server push sends resources before they're requested, eliminating round trips entirely.
- Caching stores data closer to users, so requests never have to cross the ocean.
- CDNs distribute content to servers worldwide, shrinking the physical distance between users and data.
What You Can Control
You can't change the speed of light. But you can:
Move closer. Geographic distribution—putting servers where your users are—is the most powerful latency reduction. A CDN can turn a 150ms cross-ocean request into a 20ms local one.
Make fewer trips. Every architectural decision that reduces round trips pays dividends. Batch requests. Use persistent connections. Cache aggressively. (A sketch of the persistent-connection payoff follows this list.)
Choose faster paths. Premium network providers often have more direct routes than budget options. The difference between 12 hops and 20 hops matters.
Optimize protocols. Modern protocols like QUIC reduce connection setup time and handle packet loss more gracefully than older alternatives.
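The "make fewer trips" advice is easy to see in a quick experiment. The sketch below compares opening a fresh HTTPS connection for every request against reusing a single keep-alive connection; the host is only an example, and the actual numbers depend on your distance from it and on the server honoring keep-alive:

```python
import http.client
import time

HOST, PATH, N = "example.com", "/", 5   # example host; any small resource works

def fresh_connection_per_request():
    # Pays DNS + TCP + TLS setup (several round trips) on every request.
    start = time.perf_counter()
    for _ in range(N):
        conn = http.client.HTTPSConnection(HOST, timeout=10)
        conn.request("GET", PATH)
        conn.getresponse().read()
        conn.close()
    return time.perf_counter() - start

def one_persistent_connection():
    # Pays connection setup once, then reuses the socket (HTTP/1.1 keep-alive).
    start = time.perf_counter()
    conn = http.client.HTTPSConnection(HOST, timeout=10)
    for _ in range(N):
        conn.request("GET", PATH)
        conn.getresponse().read()
    conn.close()
    return time.perf_counter() - start

print(f"new connection every time: {fresh_connection_per_request():.2f} s")
print(f"one reused connection:     {one_persistent_connection():.2f} s")
```

HTTP/2 and HTTP/3 take the same idea further by multiplexing many requests over that one connection instead of sending them one at a time.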
Latency is the delay where frustration lives. Understanding it—really understanding where those milliseconds go—is the first step toward building experiences that feel instant instead of sluggish.