
Updated 1 day ago

Every packet you send might get lost. It might arrive corrupted, out of order, twice, or never at all. The Internet makes no promises.

TCP makes promises anyway.

When you load a webpage, send an email, or download a file, TCP guarantees your data arrives intact, in order, and complete—even across networks that guarantee nothing. How do you build reliability from unreliability? Through verification.

The Three-Way Handshake: Proving You Can Hear Each Other

Before TCP sends a single byte of data, it solves a fundamental problem: how do two machines know they can actually communicate?

The Internet is not a phone call. There's no dial tone. When you send a packet, you have no idea if it arrived. The other machine has no idea you exist until your packet shows up—if it shows up. So TCP begins with a ritual of mutual proof.

The client sends a SYN (synchronize) packet: "I exist. I want to talk. Here's a random number I'll use to track our conversation."

The server responds with SYN-ACK: "I heard you. I exist too. Here's my random number. I incremented yours by one to prove I actually read your packet."

The client sends ACK: "I heard you hear me. Your number plus one. We're synchronized."

Three packets. Two strangers across an unreliable void have proven they can hear each other. Now they can talk.

The random starting numbers (Initial Sequence Numbers) prevent a subtle disaster: packets from old, dead connections being mistaken for new data. If you always started at zero, a delayed packet from a previous connection might look valid. Random numbers make that collision astronomically unlikely.
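The number exchange in the handshake can be traced with a few lines of Python. This is a minimal sketch, not how a real stack is written: the function name `three_way_handshake` and the tuple layout are made up for illustration, and the ISNs are drawn with a plain random generator, whereas real implementations derive them more carefully.

```python
import secrets

SEQ_MOD = 2**32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the (flags, seq, ack) fields of the three handshake segments."""
    # Assumption: a plain random ISN; real stacks use a more elaborate scheme.
    if client_isn is None:
        client_isn = secrets.randbelow(SEQ_MOD)
    if server_isn is None:
        server_isn = secrets.randbelow(SEQ_MOD)

    syn = ("SYN", client_isn, None)
    # The SYN consumes one sequence number, so the server acknowledges ISN + 1.
    syn_ack = ("SYN-ACK", server_isn, (client_isn + 1) % SEQ_MOD)
    # The client does the same for the server's SYN.
    ack = ("ACK", (client_isn + 1) % SEQ_MOD, (server_isn + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]


for flags, seq, ack in three_way_handshake(client_isn=1000, server_isn=5000):
    print(flags, seq, ack)
# SYN 1000 None
# SYN-ACK 5000 1001
# ACK 1001 5001
```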

Data Transfer: Numbering Every Byte

Once the handshake completes, TCP faces its core challenge: sending data reliably over unreliable networks.

The solution is obsessive bookkeeping. TCP assigns a sequence number to every byte—not every packet, every byte. If you send a 1000-byte segment starting at sequence number 5000, the bytes are numbered 5000 through 5999. The next segment starts at 6000.

Why bytes instead of packets? Packets can be any size, fragmented, combined, retransmitted in different chunks. By numbering bytes, TCP can reassemble data correctly regardless of how it arrives. If bytes 5000-5999 and 7000-7999 arrive but 6000-6999 is missing, the receiver knows exactly what's absent.
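To make the byte numbering concrete, here is a small sketch that splits a stream into numbered segments and finds what is missing. The helper names `segment_ranges` and `missing_ranges` are illustrative, ranges are inclusive to match the text, and 32-bit wraparound is ignored for simplicity.

```python
def segment_ranges(start_seq, total_bytes, segment_size):
    """Split a byte stream into inclusive (first_byte, last_byte) ranges."""
    ranges = []
    seq = start_seq
    end = start_seq + total_bytes
    while seq < end:
        last = min(seq + segment_size, end) - 1
        ranges.append((seq, last))
        seq = last + 1
    return ranges


def missing_ranges(received, start_seq, end_seq):
    """Return the inclusive ranges in [start_seq, end_seq] that never arrived."""
    gaps = []
    expected = start_seq
    for first, last in sorted(received):
        if first > expected:
            gaps.append((expected, first - 1))
        expected = max(expected, last + 1)
    if expected <= end_seq:
        gaps.append((expected, end_seq))
    return gaps


print(segment_ranges(5000, 3000, 1000))
# [(5000, 5999), (6000, 6999), (7000, 7999)]
print(missing_ranges([(5000, 5999), (7000, 7999)], 5000, 7999))
# [(6000, 6999)]
```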

The receiver acknowledges data with ACK packets: "I've received every byte before N. Send me byte N next." This is cumulative acknowledgment: a single ACK confirms all prior data.

  • Sender: "Here's bytes 5000-5999"
  • Sender: "Here's bytes 6000-6999"
  • Sender: "Here's bytes 7000-7999"
  • Receiver: "Got everything up to 8000"
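The exchange above can be modeled with a tiny receiver that tracks the next byte it expects. This is a sketch under simplifying assumptions: the `Receiver` class is hypothetical, segments are described only by their first byte and length, and sequence-number wraparound is ignored.

```python
class Receiver:
    """Tracks in-order delivery and produces cumulative ACK numbers."""

    def __init__(self, initial_seq):
        self.next_expected = initial_seq  # the ACK number we advertise
        self.out_of_order = {}            # first_byte -> length, held until the gap fills

    def receive(self, first_byte, length):
        """Accept a segment and return the cumulative ACK number to send back."""
        if first_byte + length <= self.next_expected:
            return self.next_expected  # old duplicate, nothing new
        self.out_of_order[first_byte] = max(
            length, self.out_of_order.get(first_byte, 0)
        )
        # Advance over every buffered segment that is now contiguous.
        for start in sorted(self.out_of_order):
            if start > self.next_expected:
                break  # still a gap in front of this segment
            size = self.out_of_order.pop(start)
            self.next_expected = max(self.next_expected, start + size)
        return self.next_expected


rx = Receiver(initial_seq=5000)
print(rx.receive(5000, 1000))  # 6000
print(rx.receive(6000, 1000))  # 7000
print(rx.receive(7000, 1000))  # 8000
```

A real receiver may delay or coalesce ACKs, so the sender could see only the final "8000" rather than all three.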

The sender doesn't wait for each ACK before sending more. It keeps a window of unacknowledged data in flight, maximizing throughput while tracking every byte, ready to resend anything that doesn't get acknowledged.
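One way to picture the sender's side is a queue of segments that have been sent but not yet acknowledged. The sketch below assumes a fixed window size and a hypothetical `SlidingWindowSender` class; a real sender's window changes constantly, as later sections describe.

```python
from collections import deque


class SlidingWindowSender:
    """Keeps up to window_bytes of unacknowledged data in flight."""

    def __init__(self, start_seq, window_bytes):
        self.next_seq = start_seq
        self.window_bytes = window_bytes
        self.unacked = deque()  # (first_byte, length), oldest first

    def in_flight(self):
        return sum(length for _, length in self.unacked)

    def send(self, length):
        """Send a segment if the window allows; return its first byte or None."""
        if self.in_flight() + length > self.window_bytes:
            return None  # window full: wait for an ACK
        seq = self.next_seq
        self.unacked.append((seq, length))  # kept so it can be resent if needed
        self.next_seq += length
        return seq

    def on_ack(self, ack_number):
        """Drop every segment fully covered by the cumulative ACK."""
        while self.unacked and sum(self.unacked[0]) <= ack_number:
            self.unacked.popleft()


tx = SlidingWindowSender(start_seq=5000, window_bytes=3000)
print(tx.send(1000), tx.send(1000), tx.send(1000))  # 5000 6000 7000
print(tx.send(1000))                                # None
tx.on_ack(7000)
print(tx.send(1000))                                # 8000
```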

When Things Go Wrong: Retransmission

Packets vanish. It's not a question of if, but when. TCP's retransmission mechanisms turn this certainty into a solvable problem.

Every segment carries a checksum—a mathematical fingerprint of its contents. The receiver recalculates this fingerprint. If it doesn't match, the packet was corrupted in transit. TCP silently drops it. No acknowledgment. To the sender, silence means "try again."
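The fingerprint TCP uses is the 16-bit Internet checksum: the ones' complement of the ones' complement sum of the data taken as 16-bit words. The sketch below computes it over a raw byte string only. That is a simplification: the real TCP checksum also covers the TCP header and a pseudo-header built from the IP addresses. The function name is illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
sent_checksum = internet_checksum(payload)
print(hex(sent_checksum))  # 0x220d

# A single corrupted byte produces a different checksum, so the segment is dropped.
corrupted = bytes.fromhex("0001f203f4f5f6f6")
print(internet_checksum(corrupted) == sent_checksum)  # False
```

A 16-bit sum is a fairly weak check, and some multi-bit errors cancel out, which is one reason protocols layered on TCP often add their own integrity checks.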

The sender runs a retransmission timer covering its oldest unacknowledged data. If the timer expires before an ACK arrives, TCP retransmits. The timeout isn't fixed: TCP continuously measures round-trip time and adjusts. A connection across the room needs a shorter timeout than a connection across the ocean.
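The adaptive timeout can be sketched with the smoothing rules from RFC 6298: keep a smoothed round-trip time and a variance estimate, and set the timeout to the smoothed value plus four times the variance. The class below is an illustrative sketch, not a stack's actual code. It ignores clock granularity, and the `min_rto` of 0.2 seconds used in the example is an assumption (RFC 6298 itself recommends a 1 second floor).

```python
class RtoEstimator:
    """Retransmission timeout estimate in the style of RFC 6298."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variance
    K = 4

    def __init__(self, min_rto=1.0):
        self.min_rto = min_rto
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0  # initial timeout before any measurement

    def on_measurement(self, rtt):
        """Feed one round-trip sample (seconds) and return the new timeout."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Variance is updated first, using the previous smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = max(self.min_rto, self.srtt + self.K * self.rttvar)
        return self.rto


room = RtoEstimator(min_rto=0.2)
ocean = RtoEstimator(min_rto=0.2)
for _ in range(5):
    room.on_measurement(0.002)   # 2 ms across the room
    ocean.on_measurement(0.150)  # 150 ms across the ocean
print(room.rto < ocean.rto)  # True
```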

But waiting for timeouts is slow. TCP has a faster mechanism: when the receiver gets data out of order, something is wrong. If bytes 5000-5999 arrive, then 7000-7999, then 8000-8999—where's 6000-6999?

The receiver can't acknowledge past the gap, so it keeps sending "I need 6000" over and over. When the sender sees three duplicate ACKs asking for the same byte, it doesn't wait for the timeout. It immediately retransmits the missing segment. This "fast retransmit" recovers lost packets in milliseconds rather than seconds.
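On the sender's side, fast retransmit boils down to counting repeated ACK numbers. Here is a minimal sketch, assuming a hypothetical `FastRetransmitDetector` class. Real stacks apply extra conditions before counting an ACK as a duplicate (for example, that it carries no data), which are left out here.

```python
DUP_ACK_THRESHOLD = 3


class FastRetransmitDetector:
    """Counts duplicate ACKs and signals when to retransmit early."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_number):
        """Return the sequence number to retransmit now, or None."""
        if ack_number == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                return ack_number  # third duplicate: resend without waiting
        else:
            self.last_ack = ack_number
            self.dup_count = 0
        return None


detector = FastRetransmitDetector()
# The first ACK 6000 answers bytes 5000-5999. The next three are duplicates
# triggered by later segments arriving while 6000-6999 is still missing.
for ack in [6000, 6000, 6000, 6000]:
    print(detector.on_ack(ack))
# None
# None
# None
# 6000
```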

Selective Acknowledgment (SACK) extends this further. Instead of just saying "I need 6000," the receiver can say "I need 6000, but I already have 7000-8999." The sender knows exactly what's missing and retransmits only that.
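A sketch of how a receiver might build SACK blocks, and how a sender might turn them into a retransmission list, follows. The function names are illustrative. Blocks are written as (left edge, right edge) with the right edge being the byte after the last one received, so "7000-8999" appears as (7000, 9000). A real SACK option can carry only a few blocks per segment; that limit is ignored here.

```python
def sack_blocks(ack_number, segments):
    """Merge buffered (first_byte, length) segments into SACK blocks."""
    blocks = []
    for first, length in sorted(segments):
        left, right = first, first + length
        if right <= ack_number:
            continue  # already covered by the cumulative ACK
        left = max(left, ack_number)
        if blocks and left <= blocks[-1][1]:
            # Touches or overlaps the previous block: extend it.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], right))
        else:
            blocks.append((left, right))
    return blocks


def holes(ack_number, blocks):
    """Half-open byte ranges the sender still has to retransmit."""
    result = []
    cursor = ack_number
    for left, right in blocks:
        if left > cursor:
            result.append((cursor, left))
        cursor = max(cursor, right)
    return result


blocks = sack_blocks(6000, [(7000, 1000), (8000, 1000)])
print(blocks)               # [(7000, 9000)]
print(holes(6000, blocks))  # [(6000, 7000)]
```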

Flow Control: Don't Overwhelm the Receiver

Reliability isn't just about the network. What if the sender transmits faster than the receiver can process?

The receiver might be a phone with limited memory, a server handling thousands of connections, or a machine whose application is busy doing other work. If data arrives faster than it can be consumed, the receive buffer fills up. New packets get dropped. Retransmissions get dropped again. The connection grinds to a halt.

TCP solves this with the receive window—a number the receiver includes in every ACK: "I have room for N more bytes. Don't send more than that."

The window reflects actual buffer space. As the application reads data, the window grows. If the application falls behind, the window shrinks. If the buffer fills completely, the window drops to zero: "Stop. I can't handle any more."

The sender respects this limit absolutely. When the window is zero, it stops sending data but periodically sends tiny "window probe" packets—just checking if space has freed up. Without these probes, a lost window update could deadlock the connection forever.
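The receive window is just free buffer space, which a short model makes visible. The `ReceiveBuffer` class and `sendable` helper below are hypothetical names for illustration, and the model counts bytes only without storing any data.

```python
class ReceiveBuffer:
    """Models how the advertised window follows free buffer space."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffered = 0  # bytes received but not yet read by the application

    @property
    def window(self):
        return self.capacity - self.buffered

    def on_segment(self, length):
        """Buffer arriving bytes (excess is dropped); return the new window."""
        self.buffered += min(length, self.window)
        return self.window

    def app_read(self, length):
        """The application consumes data; return the new window."""
        self.buffered -= min(length, self.buffered)
        return self.window


def sendable(bytes_in_flight, advertised_window):
    """How many more bytes the sender may put on the wire right now."""
    return max(0, advertised_window - bytes_in_flight)


buf = ReceiveBuffer(capacity=4000)
print(buf.on_segment(3000))  # 1000
print(buf.on_segment(1000))  # 0  -> sender must stop and probe
print(sendable(0, buf.window))  # 0
print(buf.app_read(2500))    # 2500 -> window reopens
```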

Congestion Control: Don't Overwhelm the Network

Flow control protects the receiver. Congestion control protects everyone.

The network between sender and receiver has limited capacity. Routers can only forward so many packets per second. If senders collectively transmit more than the network can handle, packets queue up in router buffers. Eventually buffers overflow. Packets drop. Retransmissions make congestion worse. The network collapses.

TCP can't see the network's capacity directly. It infers it from behavior. The key insight: packet loss means congestion. If your packets are being dropped, you're sending too fast.

The sender maintains a congestion window—its estimate of how much data the network can handle. The actual sending limit is the smaller of the congestion window and the receive window.

New connections start conservatively. The congestion window begins small: historically one to four segments, and commonly ten on current stacks. As ACKs arrive successfully, the window grows exponentially ("slow start"). One segment becomes two, two becomes four, four becomes eight. This sounds aggressive, but it's cautious: you're probing the network's capacity from a low starting point.

Once the window reaches a threshold (or packet loss occurs), TCP switches to "congestion avoidance"—linear growth. Add one segment per round-trip time. Careful, measured increase.

When loss occurs, TCP backs off hard. A timeout resets the window to minimum and starts slow start over. Three duplicate ACKs cut the window in half but skip slow start—"fast recovery."
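The window's behavior over time is easier to see in a toy simulation. The sketch below follows the classic Reno-style rules described above, measured in whole segments and advanced one round trip at a time. The event names, the starting threshold of 16, and the function names are assumptions made for illustration, and real implementations update the window per ACK rather than per round trip.

```python
def simulate_cwnd(events, initial_cwnd=1, ssthresh=16):
    """Return the congestion window (in segments) after each event."""
    cwnd = initial_cwnd
    history = []
    for event in events:
        if event == "rtt_ok":            # a full round trip with no loss
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: double
            else:
                cwnd += 1                        # congestion avoidance: +1
        elif event == "timeout":
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1                             # start slow start over
        elif event == "triple_dup":
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh                      # halve, skip slow start
        else:
            raise ValueError(f"unknown event: {event}")
        history.append(cwnd)
    return history


def send_limit(cwnd_bytes, rwnd_bytes):
    """The sender obeys whichever window is smaller."""
    return min(cwnd_bytes, rwnd_bytes)


events = ["rtt_ok"] * 5 + ["triple_dup"] + ["rtt_ok"] * 2 + ["timeout"] + ["rtt_ok"] * 3
print(simulate_cwnd(events))
# [2, 4, 8, 16, 17, 8, 9, 10, 1, 2, 4, 5]
```

The output shows the sawtooth: exponential growth to the threshold, linear growth after it, a halving on duplicate ACKs, and a full reset on timeout.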

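This dance (probe upward, back off on loss, probe again) finds the network's capacity and tracks it as conditions change. Modern algorithms refine the signal: CUBIC still reacts to loss but grows the window along a cubic curve, while BBR estimates bottleneck bandwidth and round-trip time instead of waiting for drops. The core principle remains: when the network signals congestion, slow down.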

Connection Termination: Saying Goodbye Gracefully

TCP connections end as deliberately as they begin. Either side can initiate shutdown by sending a FIN (finish) packet: "I have no more data to send."

The other side acknowledges the FIN but might not be done sending. TCP supports half-closed connections—one direction finished, the other still active. You've sent your HTTP request, but you're still waiting for the response.

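When both sides have sent and acknowledged FINs, the connection is fully closed. But there's a final safeguard: the side that sent the last ACK enters the TIME_WAIT state for twice the maximum segment lifetime, which works out to anywhere from about half a minute to a few minutes depending on the operating system. That final ACK might have been lost. If the other side retransmits its FIN, someone needs to be there to acknowledge it. The wait also gives stray segments from the old connection time to expire.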
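The shutdown sequence is a small state machine, and writing out its transitions shows where half-closed connections and TIME_WAIT fit. The state names below are the standard TCP ones. The event labels (`close`, `recv_fin`, and so on) and the `run` helper are invented for this sketch, and a few less common transitions are omitted.

```python
# (current state, event) -> next state, for connection teardown only
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",     # we send FIN first
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",   # half-closed: peer may still send
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",      # both sides closed at once
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",    # we send the last ACK
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("TIME_WAIT", "timer_expired"): "CLOSED",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",  # peer closed first; we ACK
    ("CLOSE_WAIT", "close"): "LAST_ACK",        # we send our own FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    visited = []
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


# The side that closes first ends up waiting in TIME_WAIT.
print(run(["close", "recv_ack", "recv_fin", "timer_expired"]))
# ['FIN_WAIT_1', 'FIN_WAIT_2', 'TIME_WAIT', 'CLOSED']

# The side that closes second goes straight to CLOSED after its FIN is ACKed.
print(run(["recv_fin", "close", "recv_ack"]))
# ['CLOSE_WAIT', 'LAST_ACK', 'CLOSED']
```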

For emergencies, TCP supports abrupt termination via RST (reset) packets. No negotiation, no waiting—the connection is destroyed immediately.

The Trust Protocol

TCP doesn't trust the network. It trusts the protocol.

Every mechanism serves verification. The handshake proves mutual reachability. Sequence numbers prove ordering. Acknowledgments prove receipt. Checksums prove integrity. Retransmission proves persistence. Flow control proves the receiver is ready. Congestion control proves the network can handle the load.

This is why TCP powers most of the Internet. Not because it's fast; UDP has less overhead. Not because it's simple; UDP is simpler. Because when you need to know that your data arrived, intact, in order, without overwhelming anyone in the chain, TCP provides verifiable assurance where the network provides none: either the data is delivered and acknowledged, or the connection reports failure.

The complexity is the point. Reliability doesn't come from wishing. It comes from verification, acknowledgment, and the willingness to try again when things go wrong.

