HTTP/1.1 forced your browser to pretend to be six different clients, opening six separate connections to the same server, just to load a page at reasonable speed. HTTP/2 multiplexing finally let browsers be honest about what they're doing: one connection, many simultaneous requests, no waiting in line.
The Absurdity HTTP/2 Fixed
HTTP/1.1 processes requests sequentially. When your browser requests a resource, it must wait for the complete response before sending the next request on that connection. This is head-of-line blocking—a slow response holds up everything waiting behind it.
Browsers worked around this by opening multiple connections to each domain, typically six. This created parallelism through brute force, but with serious costs. Each TCP connection requires its own handshake, burning time and resources. Each maintains separate congestion control, preventing optimal bandwidth use. Each starts slow due to TCP slow start, ramping up over time rather than transmitting at full speed.
Web pages with dozens or hundreds of resources suffered. Even with six connections, resources queued up waiting for an opening. Developers resorted to increasingly absurd hacks:
Domain sharding spread resources across multiple fake domains—images.example.com, scripts.example.com, styles.example.com—just to trick browsers into opening more connections. The protocol was so broken that lying to it was best practice.
Resource concatenation bundled many small files into large ones, reducing request count at the cost of cache efficiency and flexibility. Need to fix one line of CSS? Invalidate the entire bundle.
These weren't elegant solutions. They were the web fighting its own protocol.
How Multiplexing Actually Works
HTTP/2 treats a connection not as a pipe carrying one thing at a time, but as a channel carrying many things simultaneously.
The key abstraction is the stream. Each request-response pair gets a stream, identified by a number. Multiple streams share one connection. Communication happens through frames—small chunks of data tagged with their stream ID.
When your browser needs ten resources, it sends all ten requests immediately. The server processes them concurrently and sends response frames as they're ready. Frames interleave—a chunk from stream 1, then stream 3, then more from stream 1, then stream 7. The browser reassembles each stream from its frames.
From the application's perspective, it still looks like traditional request-response pairs. But underneath, everything flows through one multiplexed connection.
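To make that concrete, here is a rough client-side sketch in Python using the httpx library (one of several HTTP/2-capable clients; the URLs are placeholders). All ten requests are issued up front and can be multiplexed as streams over a single connection:

```python
# Requires: pip install "httpx[http2]"
import asyncio
import httpx

async def fetch_all():
    # Hypothetical resources on one origin; with HTTP/1.1 these would queue
    # behind each other or be spread across several connections.
    urls = [f"https://example.com/assets/part-{i}.js" for i in range(10)]
    async with httpx.AsyncClient(http2=True) as client:
        # Fire every request immediately; each one becomes its own stream.
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        for r in responses:
            print(r.http_version, r.url, len(r.content))  # "HTTP/2" if negotiated

asyncio.run(fetch_all())
```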
What This Actually Buys You
No more waiting in line. Resources don't queue behind each other. All requests go out immediately; responses arrive as fast as the server can generate them.
One connection does the work of six. Fewer handshakes, less memory, reduced CPU on both ends. The server doesn't juggle hundreds of connections from the same browser.
Better bandwidth utilization. All streams share one TCP congestion window. Instead of six connections each ramping up from slow start, one connection maintains optimal throughput.
The hacks become unnecessary. Domain sharding actively hurts HTTP/2—it prevents connection reuse. Concatenation loses its benefit and keeps its costs. Many small files work fine now.
Priorities become possible. Browsers can tell servers which resources matter most. HTML and CSS before analytics scripts. Critical content before nice-to-haves.
The Binary Layer Underneath
HTTP/1.1 was text-based. You could read it with your eyes:
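```
GET /styles.css HTTP/1.1
Host: example.com
Accept: text/css

HTTP/1.1 200 OK
Content-Type: text/css
Content-Length: 18

body { margin: 0 }
```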
HTTP/2 is binary. Every message is a frame with a precise structure: a 9-byte header specifying length, type, flags, and stream ID, followed by payload.
Common frame types:
- DATA: carries request or response body
- HEADERS: carries HTTP headers (compressed)
- SETTINGS: communicates configuration
- RST_STREAM: terminates a stream mid-flight
- PING: measures round-trip time
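As a sketch of what that 9-byte header looks like on the wire, here is a small Python function that decodes it (field layout per the HTTP/2 spec; the type table covers only the frame types listed above):

```python
# Decode the fixed 9-byte header that precedes every HTTP/2 frame.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM",
               0x4: "SETTINGS", 0x6: "PING"}

def parse_frame_header(header: bytes):
    assert len(header) == 9
    length = int.from_bytes(header[0:3], "big")                   # 24-bit payload length
    frame_type = FRAME_TYPES.get(header[3], hex(header[3]))      # 8-bit type
    flags = header[4]                                             # 8-bit flags
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF  # 31-bit stream ID
    return length, frame_type, flags, stream_id

# Example: a SETTINGS frame header with an 8-byte payload on stream 0
# parse_frame_header(bytes.fromhex("000008040000000000")) -> (8, "SETTINGS", 0, 0)
```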
Binary protocols parse faster and have less ambiguity than text protocols. The tradeoff is you can't debug by staring at raw bytes. But browser dev tools and Wireshark handle HTTP/2 fine—this rarely matters in practice.
Flow Control Prevents Chaos
Multiplexing multiple streams over one connection creates a new problem: what stops a fast stream from starving everything else?
HTTP/2 implements flow control at two levels. Connection-level flow control governs total throughput. Stream-level flow control manages each stream independently.
Receivers advertise window sizes—how much data they're willing to accept. Senders must stay within these windows. As receivers process data, they send window updates allowing more to flow.
This prevents a fast server from overwhelming a slow client, ensures fair sharing among streams, and lets receivers prioritize which streams get bandwidth.
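A minimal sketch of the sender-side bookkeeping, assuming the spec's default initial window of 65,535 bytes (illustrative accounting only, not a working HTTP/2 stack):

```python
DEFAULT_WINDOW = 65_535  # initial size of both windows in HTTP/2

class SendWindows:
    def __init__(self):
        self.connection = DEFAULT_WINDOW   # shared by all streams
        self.streams = {}                  # stream_id -> remaining bytes

    def can_send(self, stream_id, size):
        # A DATA frame must fit within BOTH the stream window and the connection window.
        stream = self.streams.setdefault(stream_id, DEFAULT_WINDOW)
        return size <= stream and size <= self.connection

    def record_send(self, stream_id, size):
        self.streams[stream_id] = self.streams.setdefault(stream_id, DEFAULT_WINDOW) - size
        self.connection -= size

    def on_window_update(self, stream_id, increment):
        # A WINDOW_UPDATE on stream 0 replenishes the connection window;
        # otherwise it replenishes that one stream's window.
        if stream_id == 0:
            self.connection += increment
        else:
            self.streams[stream_id] = self.streams.get(stream_id, DEFAULT_WINDOW) + increment
```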
Stream Prioritization (In Theory)
HTTP/2 lets clients assign priorities to streams. Streams can depend on other streams—CSS depends on HTML, indicating it should wait. Streams can have weights indicating relative importance.
In practice, this was underutilized. Browsers implemented it differently. Many servers ignored priorities entirely. The mechanisms were complex to use well. HTTP/3 replaced this with a simpler priority scheme.
Server Push (The Road Not Taken)
HTTP/2 introduced server push: servers could send resources before browsers requested them. When sending HTML, the server might push the CSS and JavaScript it knows the browser will need, eliminating a round-trip.
It saw limited adoption. The core problem: servers don't know what's already cached. Pushing cached resources wastes bandwidth. Proposed solutions like cache digests never gained traction. Most sites now use preload hints instead—telling browsers about important resources without pushing them.
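For comparison, a preload hint is just a response header (the path here is hypothetical):

```
Link: </assets/app.css>; rel=preload; as=style
```

or the equivalent `<link rel="preload" href="/assets/app.css" as="style">` tag in the HTML head. The browser fetches the resource early, but only if it isn't already cached, which is exactly the check server push could not make.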
Connection Coalescing
HTTP/2 can reuse one connection for multiple domains if they resolve to the same IP and present a valid certificate covering all domains.
If example.com and cdn.example.com both point to the same server with a wildcard certificate, your browser might use one connection for both, further reducing overhead.
This requires careful certificate configuration and identical IPs. Some browsers are conservative about when they'll coalesce.
The Limitation HTTP/3 Fixes
HTTP/2 eliminated HTTP-level head-of-line blocking. But TCP-level blocking remains.
TCP guarantees ordered delivery. If a packet is lost, TCP waits for retransmission before delivering subsequent packets—even if those packets belong to different HTTP/2 streams that don't care about each other.
One lost packet stalls all streams. The multiplexing that eliminated one form of head-of-line blocking runs into another at the transport layer.
HTTP/3 addresses this by replacing TCP with QUIC, which handles streams independently at the transport level. A lost packet only affects its stream.
Migrating From HTTP/1.1
Undo the hacks. Domain sharding hurts HTTP/2 by preventing connection reuse. Resource concatenation prevents granular caching. Many small files are fine now.
Verify your infrastructure. Some proxies, firewalls, and corporate networks interfere with HTTP/2. Ensure graceful fallback to HTTP/1.1.
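A quick spot check from the command line (hypothetical host): curl reports which protocol version was actually negotiated end to end, so you can confirm both the HTTP/2 path and the HTTP/1.1 fallback.

```sh
# Prints "2" when HTTP/2 is negotiated, "1.1" when the request falls back.
curl -s -o /dev/null -w '%{http_version}\n' --http2 https://example.com/

# Force HTTP/1.1 to confirm the fallback path still works.
curl -s -o /dev/null -w '%{http_version}\n' --http1.1 https://example.com/
```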
Measure the results. Not all sites benefit equally. High-latency connections see the biggest gains—30-50% latency reduction is common. Low-latency local connections see less improvement.
HTTP/2 Requires HTTPS
HTTP/2 was designed for both HTTP and HTTPS, but browsers only implement it over HTTPS. This was intentional—it pushed HTTPS adoption forward.
During the TLS handshake, client and server use ALPN (Application-Layer Protocol Negotiation) to agree on HTTP/2 if both support it.
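You can watch that negotiation directly with openssl's test client (hypothetical host); if the server agrees to HTTP/2, the output includes an "ALPN protocol: h2" line:

```sh
# Offer h2 during the TLS handshake and see what the server selects.
openssl s_client -connect example.com:443 -alpn h2,http/1.1 < /dev/null | grep -i alpn
```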