When thousands of devices transmit data across shared infrastructure, how does the Internet avoid choking on its own traffic? TCP congestion control is the answer—a mechanism that prevents network collapse by making each sender regulate itself based on signals the network never explicitly sends.
The Problem: Inference Under Uncertainty
Here's the fundamental challenge: a TCP sender has no direct knowledge of network conditions. No router broadcasts "I'm overloaded." No central authority coordinates traffic. The sender is blind, connected to the receiver through a path it cannot see, sharing infrastructure with millions of other connections it knows nothing about.
Yet TCP must somehow avoid overwhelming that shared infrastructure. Send too fast, and routers drop packets, latency spikes, and everyone suffers. Send too slowly, and you waste available bandwidth.
The network's only vocabulary is silence. Packets either arrive and get acknowledged, or they vanish. TCP learned to read that silence—to infer congestion from the absence of expected responses, from the timing of what comes back, from patterns in the acknowledgments.
This is fundamentally different from flow control, where the receiver explicitly says "I have X bytes of buffer space." Congestion control has no such luxury. It's educated guessing, refined over decades into something remarkably effective.
Why This Matters: Congestion Collapse
Without congestion control, a phenomenon called congestion collapse can destroy a network. Here's how it happens:
- Traffic increases until routers start dropping packets
- Senders detect the loss and retransmit
- Retransmissions add more traffic to the already overloaded network
- More packets get dropped
- More retransmissions flood the network
- Useful throughput approaches zero while the network is saturated with futile retransmissions
This is the tragedy of the commons playing out in microseconds. Every sender, trying to deliver their own packets, makes the problem worse for everyone. The network can reach a state where it's completely full yet accomplishing almost nothing.
TCP congestion control prevents this by making each sender back off precisely when instinct would say to push harder.
The Congestion Window: A Self-Imposed Speed Limit
At the heart of congestion control is the congestion window (cwnd)—a sender-side limit on unacknowledged data in flight. The receiver has its own limit (the receive window, rwnd) advertised through flow control. TCP respects whichever is smaller.
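That combined limit can be sketched as a small helper (a hypothetical function; the names `can_send` and `bytes_in_flight` are illustrative, not from any real TCP stack):

```python
# Hypothetical sketch: the effective send limit is the smaller of the
# congestion window and the receive window.

def can_send(cwnd: int, rwnd: int, bytes_in_flight: int, segment_size: int) -> bool:
    """A sender may transmit another segment only if doing so keeps
    unacknowledged data within min(cwnd, rwnd)."""
    effective_window = min(cwnd, rwnd)
    return bytes_in_flight + segment_size <= effective_window

print(can_send(cwnd=40_000, rwnd=65_535, bytes_in_flight=30_000, segment_size=1_460))  # True
print(can_send(cwnd=40_000, rwnd=65_535, bytes_in_flight=39_000, segment_size=1_460))  # False
```

Note that cwnd is the binding constraint here even though the receiver advertised more space.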
Nothing tells TCP what its congestion window should be. The sender maintains it independently, adjusting it based on what it infers about network conditions. When things seem good, cwnd grows. When congestion appears, cwnd shrinks.
This self-regulation is what makes the Internet work at scale. No central coordinator. No explicit congestion signals. Just millions of TCP connections independently adjusting their own speed limits based on indirect evidence.
Slow Start: Exponential Discovery
When a connection begins, the sender knows nothing. How much bandwidth is available? Is the path congested? The network won't say.
Slow start is TCP's solution: begin cautiously, then accelerate rapidly. Despite the name, it's actually exponential growth:
- Start with a small window (typically 1-10 segments)
- For each acknowledgment received, increase cwnd by one segment
- Since you send cwnd segments per round-trip, the window doubles every RTT
The progression: 1 → 2 → 4 → 8 → 16 → 32...
This continues until something stops it:
- Packet loss detected — the network pushed back
- Slow start threshold (ssthresh) reached — a remembered limit from past congestion
- Receiver's window reached — flow control takes over
The slow start threshold is TCP's memory of trouble. When congestion is detected, TCP records roughly where it happened. Next time, it switches to gentler growth before reaching that point.
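The per-ACK growth rule above can be sketched as a tiny simulation (an idealized model assuming a lossless path and one ACK per segment; `slow_start_progression` is a hypothetical name):

```python
# Minimal sketch of slow start, counted in segments: every ACK grows
# cwnd by one segment, so cwnd doubles each round trip until it
# reaches ssthresh.

def slow_start_progression(initial_cwnd: int, ssthresh: int, rtts: int) -> list[int]:
    """Return cwnd (in segments) at the start of each RTT."""
    cwnd = initial_cwnd
    history = [cwnd]
    for _ in range(rtts):
        acks = cwnd                         # one ACK per segment sent this RTT
        cwnd = min(cwnd + acks, ssthresh)   # +1 segment per ACK, capped at ssthresh
        history.append(cwnd)
    return history

print(slow_start_progression(initial_cwnd=1, ssthresh=32, rtts=6))
# [1, 2, 4, 8, 16, 32, 32]
```

The doubling stops exactly at ssthresh, where TCP hands over to congestion avoidance.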
Congestion Avoidance: Linear Caution
Once cwnd reaches ssthresh, TCP knows it's in the danger zone—close to where congestion occurred before. The strategy shifts from exponential growth to linear: increase by just one segment per RTT, regardless of how many acknowledgments arrive.
This creates TCP's characteristic sawtooth pattern:
- Window grows linearly, probing for more bandwidth
- Congestion detected (packet loss)
- Window drops sharply
- Slow start until ssthresh
- Linear growth resumes
- Repeat
The sawtooth isn't a bug—it's TCP continuously discovering the network's current capacity. Networks change. New connections start, others finish. The optimal sending rate is a moving target, and TCP tracks it by periodically overshooting, detecting the overshoot, and backing off.
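The cycle can be sketched as a toy simulation (assuming a fixed, hypothetical path capacity in segments and Reno-style halving on overshoot; real networks are far noisier):

```python
# Illustrative sawtooth: additive increase of one segment per RTT,
# multiplicative decrease (halving) whenever cwnd overshoots the
# path's capacity and a "loss" is detected.

def sawtooth(cwnd: int, capacity: int, rtts: int) -> list[int]:
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd > capacity:
            cwnd = max(cwnd // 2, 1)   # multiplicative decrease on overshoot
        else:
            cwnd += 1                  # additive increase: +1 segment per RTT
    return history

print(sawtooth(cwnd=8, capacity=12, rtts=12))
# [8, 9, 10, 11, 12, 13, 6, 7, 8, 9, 10, 11]
```

The window climbs past capacity by one segment, gets cut in half, and climbs again, oscillating around the true capacity.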
Reading the Silence: How TCP Detects Congestion
Since the network won't announce congestion, TCP watches for symptoms:
Packet Loss
The traditional signal. Two varieties:
Triple duplicate ACKs: The receiver got segments 1, 2, 4, 5, 6 and keeps acknowledging segment 2, asking for segment 3. Three duplicate ACKs (four identical ACKs in total) suggest one segment was lost while others got through—mild congestion. Response: cut cwnd roughly in half and continue transmitting.
Timeout: Nothing came back at all. Either many packets were lost, or even the ACKs were lost. This suggests severe congestion. Response: reset cwnd to minimum, restart slow start from scratch.
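The two reactions can be sketched as simple window updates (a Reno-style sketch; the names and the floor of two segments are illustrative, not code from any real stack):

```python
# Hedged sketch of the two loss reactions: both remember half the
# current window in ssthresh, but only a timeout resets cwnd to the
# minimum and restarts slow start.

INITIAL_CWND = 1  # segments; illustrative

def on_triple_duplicate_acks(cwnd: int) -> tuple[int, int]:
    """Mild congestion: one segment lost, others delivered.
    Halve the window and keep transmitting. Returns (cwnd, ssthresh)."""
    ssthresh = max(cwnd // 2, 2)
    return ssthresh, ssthresh

def on_timeout(cwnd: int) -> tuple[int, int]:
    """Severe congestion: nothing came back. Remember half the window
    in ssthresh and restart slow start. Returns (cwnd, ssthresh)."""
    ssthresh = max(cwnd // 2, 2)
    return INITIAL_CWND, ssthresh

print(on_triple_duplicate_acks(cwnd=40))  # (20, 20)
print(on_timeout(cwnd=40))                # (1, 20)
```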
Explicit Congestion Notification (ECN)
Modern networks can do better than silence. With ECN, routers mark packets when queues are building—before dropping anything. The receiver echoes these marks back to the sender.
This is the network finally learning to speak. Instead of "silence means trouble," ECN provides "yellow light" warnings before the red light of packet loss. TCP can reduce its rate proactively, achieving higher throughput with lower latency.
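A sender's reaction to an echoed mark can be sketched like this (a hypothetical `on_ack` helper; real stacks reduce at most once per RTT, which this sketch does not enforce):

```python
# Illustrative sketch: when an ACK carries the receiver's echoed
# congestion mark (the ECE flag), the sender reduces its window as it
# would for a loss, but no packet was dropped and nothing needs
# retransmitting.

def on_ack(cwnd: int, ece_flag: bool) -> int:
    if ece_flag:
        return max(cwnd // 2, 1)  # back off on the "yellow light"
    return cwnd                   # no mark: keep the current window

print(on_ack(cwnd=40, ece_flag=True))   # 20
print(on_ack(cwnd=40, ece_flag=False))  # 40
```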
The Evolution of Algorithms
The basic framework—window-based, loss-triggered—has remained stable. But different algorithms implement different strategies within that framework:
TCP Reno (The Classic)
Additive Increase, Multiplicative Decrease (AIMD). Grow by one segment per RTT. On loss, halve the window. Simple, stable, fair. But slow to utilize high-bandwidth, high-latency paths—it takes forever to ramp up on a satellite link.
TCP Cubic (The Linux Default)
Cubic uses a cubic function instead of linear increase. After congestion, it grows slowly at first, then more aggressively as time passes without further trouble. The curve is centered on the window size where congestion last occurred—Cubic quickly returns to what worked before, then cautiously probes beyond.
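That curve is defined in RFC 8312 as W(t) = C*(t - K)^3 + W_max; a minimal sketch with the RFC's constants (the example numbers are made up):

```python
# Sketch of Cubic's window curve. W_max is the window at the last loss
# event; K is the time at which the curve climbs back to W_max.

C = 0.4      # scaling constant from RFC 8312
BETA = 0.7   # cwnd is reduced to BETA * W_max on loss

def cubic_window(t: float, w_max: float) -> float:
    k = ((w_max * (1 - BETA)) / C) ** (1 / 3)  # seconds until W_max is regained
    return C * (t - k) ** 3 + w_max

w_max = 100.0
# Starts at BETA * W_max, flattens near the old loss point, then
# probes beyond it with increasing aggression:
print(round(cubic_window(0.0, w_max), 1))  # 70.0
```

The flat region around K is the "cautious near where it hurt before" behavior; far past K, the cubic term dominates and growth accelerates.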
BBR (The Paradigm Shift)
Google's BBR abandons loss as the primary signal. Instead, it actively measures two things:
- Bottleneck bandwidth: How fast can data actually flow?
- Minimum RTT: What's the base latency without queue buildup?
BBR cycles between probing for more bandwidth and draining any queues that built up. The insight: packet loss means you've already congested the network. Why wait for damage? BBR tries to fill the pipe without overflowing it.
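The amount of in-flight data BBR aims for can be sketched as a bandwidth-delay product calculation (example numbers are made up):

```python
# Rough sketch of BBR's target: size the data in flight to the
# bandwidth-delay product (BDP) of the measured bottleneck, rather
# than growing until loss.

def bdp_bytes(bottleneck_bw_bps: float, min_rtt_s: float) -> float:
    """Bandwidth-delay product: the most data the path can hold
    without building a queue at the bottleneck."""
    return bottleneck_bw_bps / 8 * min_rtt_s  # bits/s -> bytes

# 100 Mbit/s bottleneck, 20 ms base RTT:
bdp = bdp_bytes(100e6, 0.020)
print(f"{bdp:.0f} bytes in flight fills the pipe")  # 250000 bytes
```

Keeping in-flight data near the BDP fills the bottleneck link without standing queues, which is why BBR can hold latency low while sustaining full throughput.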
On paths with random packet loss (wireless networks, for instance), BBR dramatically outperforms loss-based algorithms that mistake noise for congestion.
The Distinction That Matters
Flow control answers: Can the receiver handle this data? The receiver explicitly advertises its buffer space. End-to-end between sender and receiver.
Congestion control answers: Can the network handle this data? The sender infers capacity from indirect signals. Considers the entire path and all its shared infrastructure.
Both limits apply simultaneously. A sender with a clear network path still can't exceed the receiver's window. A receiver with massive buffers still gets data slowly if the network is congested. TCP respects whichever constraint is tighter.