
Updated 10 hours ago

HTTP has a fundamental mismatch with human conversation: you can only speak when spoken to.

In HTTP, the client asks and the server answers. Then silence. If the server has something new to say—a chat message arrived, a stock price changed, another player moved—it has to wait. It cannot reach out. It can only respond when asked.

This creates the real-time communication problem.

The Workarounds (And Why They Fail)

Polling is the brute-force solution: ask every few seconds if anything changed. "Any new messages?" "No." "Any new messages?" "No." "Any new messages?" "Yes, here's one." This wastes bandwidth, burns server resources, and introduces latency—you only learn about updates at the polling interval.

Long polling is cleverer. The server holds your request hostage until something happens, then responds. You immediately send another request. There's always one request waiting. But you're still cycling through request-response pairs, creating overhead for every message.

Server-Sent Events let servers push to clients—but only servers. If the client needs to send something, it's back to regular HTTP requests. One-way streets in a world that needs conversation.

WebSockets solve this by establishing a persistent, bidirectional connection. Either side can send messages whenever they want. No asking permission. No waiting your turn.

The Handshake: HTTP Bootstraps Its Own Replacement

WebSockets pull off an elegant trick: they use HTTP to create something that isn't HTTP.

The client sends what looks like a normal HTTP request, but with special headers:

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

The server responds:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

"101 Switching Protocols"—the connection transforms mid-stream. The same TCP connection that started speaking HTTP now speaks WebSocket. The HTTP conversation is over; a new protocol takes its place.

This is pragmatic genius. WebSockets ride existing HTTP infrastructure through firewalls and proxies, then shed that skin once inside.
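The Sec-WebSocket-Accept value in the response is not arbitrary. Per RFC 6455, the server appends a fixed GUID to the client's Sec-WebSocket-Key, hashes the result with SHA-1, and base64-encodes it, which proves the server actually understood the upgrade request. A minimal Node.js sketch of that derivation follows; the function name computeAccept is illustrative, not part of any library:

```javascript
const { createHash } = require('node:crypto');

// Fixed GUID defined by RFC 6455; every server appends the same value.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept header from the client's Sec-WebSocket-Key.
function computeAccept(secWebSocketKey) {
    return createHash('sha1')
        .update(secWebSocketKey + WS_GUID)
        .digest('base64');
}

console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=", the value in the response above
```

In practice a WebSocket library does this for you; the sketch only shows what happens behind the 101 response.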

The Wire Protocol

WebSocket messages travel as frames—small headers (2-14 bytes) followed by payload.

Text frames carry UTF-8 strings. JSON lives here.

Binary frames carry raw bytes. Images, audio, binary protocols.

Control frames manage the connection itself. Ping asks "are you alive?" Pong answers "yes." Close initiates graceful shutdown.
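To make the frame layout concrete, here is a sketch of a header parser based on the RFC 6455 layout: one byte for the FIN flag and opcode, one byte for the mask flag and a 7-bit length, then optional extended length and masking key. The function name parseFrameHeader and the shape of the returned object are assumptions for illustration, and real libraries do considerably more validation:

```javascript
// Opcode values from RFC 6455.
const OPCODES = {
    0x0: 'continuation', 0x1: 'text', 0x2: 'binary',
    0x8: 'close', 0x9: 'ping', 0xA: 'pong',
};

// Parse the header of a single frame held in a Buffer.
// Returns null if the buffer does not yet contain the whole header.
function parseFrameHeader(buf) {
    if (buf.length < 2) return null;
    const fin = (buf[0] & 0x80) !== 0;
    const opcode = buf[0] & 0x0f;
    const masked = (buf[1] & 0x80) !== 0;
    let payloadLength = buf[1] & 0x7f;
    let offset = 2;

    if (payloadLength === 126) {            // 16-bit extended length
        if (buf.length < 4) return null;
        payloadLength = buf.readUInt16BE(2);
        offset = 4;
    } else if (payloadLength === 127) {     // 64-bit extended length
        if (buf.length < 10) return null;
        const big = buf.readBigUInt64BE(2);
        if (big > BigInt(Number.MAX_SAFE_INTEGER)) {
            throw new RangeError('payload too large');
        }
        payloadLength = Number(big);
        offset = 10;
    }

    let maskingKey = null;
    if (masked) {                           // 4-byte key, client-to-server only
        if (buf.length < offset + 4) return null;
        maskingKey = buf.subarray(offset, offset + 4);
        offset += 4;
    }

    return {
        fin, opcode, type: OPCODES[opcode] || 'reserved',
        masked, payloadLength, maskingKey, headerLength: offset,
    };
}
```

The smallest header is 2 bytes (short unmasked frame); the largest is 2 + 8 + 4 = 14 bytes (64-bit length plus masking key).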

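Masking is a security requirement for client-to-server messages. The payload is XORed with a random 4-byte key chosen per frame, so a malicious page cannot control the exact bytes on the wire and trick an intermediary proxy into caching forged responses (cache poisoning). Server-to-client frames are never masked, because the attack depends on attacker-chosen bytes leaving the browser.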
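The masking step itself is tiny. The sketch below XORs each payload byte with the masking key, cycling through the key's 4 bytes; the function name applyMask is illustrative, and the key and payload are the "Hello" example from RFC 6455:

```javascript
// XOR each payload byte with the masking key, cycling through its 4 bytes.
// XOR is its own inverse, so the same function both masks and unmasks.
function applyMask(payload, maskingKey) {
    const out = Buffer.alloc(payload.length);
    for (let i = 0; i < payload.length; i++) {
        out[i] = payload[i] ^ maskingKey[i % 4];
    }
    return out;
}

const maskingKey = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const maskedPayload = applyMask(Buffer.from('Hello'), maskingKey);

console.log(maskedPayload.toString('hex'));                  // prints "7f9f4d5158"
console.log(applyMask(maskedPayload, maskingKey).toString()); // prints "Hello"
```

Masking is not encryption: the key travels in the frame header, so anyone who can read the frame can unmask it. Confidentiality comes from TLS.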

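Large messages can be fragmented across multiple frames. This lets a sender start transmitting before it knows the total length, and it lets control frames such as ping or close slip in between fragments. Data frames from different messages cannot be interleaved unless an extension negotiates it.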
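Reassembly on the receiving side can be sketched as follows. The first fragment carries the real opcode (text or binary), later fragments use the continuation opcode 0x0, and the FIN flag marks the last one. The frame shape { fin, opcode, payload } and the name createReassembler are assumptions for illustration:

```javascript
// Reassemble data messages from a sequence of already-parsed frames.
// Each frame is assumed to look like { fin, opcode, payload } with a Buffer payload.
function createReassembler(onMessage) {
    let parts = [];
    let messageOpcode = null;   // opcode of the message currently being assembled

    return function handleFrame(frame) {
        if (frame.opcode >= 0x8) return;   // control frames are handled elsewhere

        if (frame.opcode !== 0x0) {        // first frame of a message
            if (messageOpcode !== null) {
                throw new Error('new message started before previous one finished');
            }
            messageOpcode = frame.opcode;
        } else if (messageOpcode === null) {
            throw new Error('continuation frame without a starting frame');
        }

        parts.push(frame.payload);

        if (frame.fin) {                   // last fragment: deliver the whole message
            const data = Buffer.concat(parts);
            const isText = messageOpcode === 0x1;
            parts = [];
            messageOpcode = null;
            onMessage(isText ? data.toString('utf8') : data);
        }
    };
}
```

A ping arriving between two fragments passes straight through without disturbing the message being assembled.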

What WebSockets Enable

Chat is the canonical example. Messages appear instantly because the server pushes them the moment they arrive—no polling delay, no wasted requests.

Collaborative editing in Google Docs or Figma. Every keystroke, every cursor movement, synchronized across users in real-time.

Gaming requires low-latency bidirectional communication. Your actions reach the server fast; other players' movements reach you fast.

Live dashboards show metrics, status, and alerts as they happen. The data pushes to you.

Trading platforms where milliseconds matter. Price updates stream continuously; orders confirm instantly.

Notifications delivered the moment they're generated, not whenever you next poll.

Client-Side API

JavaScript's WebSocket API is refreshingly simple:

const socket = new WebSocket('wss://example.com/socket');

socket.addEventListener('open', function(event) {
    console.log('Connected');
    socket.send('Hello Server!');
});

socket.addEventListener('message', function(event) {
    console.log('Received:', event.data);
});

socket.addEventListener('close', function(event) {
    console.log('Disconnected');
});

socket.addEventListener('error', function(event) {
    console.error('Error:', event);
});

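// Send anytime after open. Calling send() while the socket is still
// connecting throws, so check readyState first.
function sendChat(text) {
    if (socket.readyState === WebSocket.OPEN) {
        socket.send(JSON.stringify({ type: 'chat', text: text }));
    }
}

// Close when done
function disconnect() {
    socket.close();
}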

Four events. One send method. One close method. Add a handful of properties such as readyState and bufferedAmount, and that's the entire surface area.

Server-Side Reality

Every major language has WebSocket libraries:

  • Node.js: ws (minimal), socket.io (batteries-included), uWebSockets (performance)
  • Python: websockets, aiohttp, Django Channels
  • Go: gorilla/websocket
  • Java: javax.websocket (jakarta.websocket since Jakarta EE 9), Spring WebSocket
  • Ruby: ActionCable

Servers handle the handshake, track connected clients, route messages, and clean up disconnections.

Message Patterns

Broadcast: send to everyone. A chat message goes to all room members.

Targeted: send to one. A private message reaches only its recipient.

Rooms/Channels: group clients logically. Broadcast to "room-123" instead of everyone.

Request-Response: layer RPC on top. Include a message ID; match responses by ID.

Pub/Sub: clients subscribe to topics, receive only what they asked for.
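Broadcast, targeted, and room delivery all reduce to bookkeeping over a set of connections. The sketch below is one way to do it in memory; the Hub class and its method names are hypothetical, and a "socket" is any object with a send(string) method, which is how most WebSocket libraries expose a connection:

```javascript
// In-memory registry of connections and room membership.
class Hub {
    constructor() {
        this.clients = new Map();   // clientId -> socket-like object
        this.rooms = new Map();     // room name -> Set of clientIds
    }

    add(clientId, socket) {
        this.clients.set(clientId, socket);
    }

    // Clean up on disconnect: forget the socket and leave every room.
    remove(clientId) {
        this.clients.delete(clientId);
        for (const [room, members] of this.rooms) {
            members.delete(clientId);
            if (members.size === 0) this.rooms.delete(room);
        }
    }

    join(room, clientId) {
        if (!this.rooms.has(room)) this.rooms.set(room, new Set());
        this.rooms.get(room).add(clientId);
    }

    // Targeted: one recipient.
    sendTo(clientId, message) {
        const socket = this.clients.get(clientId);
        if (socket) socket.send(JSON.stringify(message));
    }

    // Broadcast: every connected client.
    broadcast(message) {
        for (const id of this.clients.keys()) this.sendTo(id, message);
    }

    // Rooms/channels: only the members of one room.
    broadcastToRoom(room, message) {
        for (const id of this.rooms.get(room) || []) this.sendTo(id, message);
    }
}
```

Pub/Sub falls out of the same structure if you treat a topic as a room that clients join by subscribing.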

The Scaling Challenge

WebSockets are stateful. Each connection holds memory, consumes a file descriptor, maintains context. This breaks the beautiful statelessness of HTTP that made scaling easy.

If user A connects to server 1 and user B connects to server 2, how does A's message reach B?

Message brokers solve this. Redis Pub/Sub, RabbitMQ, or Kafka sit between servers. Server 1 publishes; server 2 subscribes and delivers. Every server talks to every other server through the broker.
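The broker pattern can be sketched without any real infrastructure. Below, InMemoryBroker is a stand-in for something like Redis Pub/Sub, and ServerNode represents one WebSocket server process; both classes, the 'chat' channel name, and the message shape are assumptions for illustration only:

```javascript
// Stand-in for a real broker: channel name -> list of subscriber callbacks.
class InMemoryBroker {
    constructor() {
        this.subscribers = new Map();
    }
    subscribe(channel, handler) {
        if (!this.subscribers.has(channel)) this.subscribers.set(channel, []);
        this.subscribers.get(channel).push(handler);
    }
    publish(channel, message) {
        for (const handler of this.subscribers.get(channel) || []) handler(message);
    }
}

// One WebSocket server process. It only knows its own local connections.
class ServerNode {
    constructor(name, broker) {
        this.name = name;
        this.broker = broker;
        this.localClients = new Map();   // userId -> socket-like object with send()
        broker.subscribe('chat', (msg) => this.deliverLocally(msg));
    }

    connect(userId, socket) {
        this.localClients.set(userId, socket);
    }

    // Called when a local client sends a message addressed to another user.
    route(fromUser, toUser, text) {
        this.broker.publish('chat', { from: fromUser, to: toUser, text });
    }

    // Every node receives every published message; only the node that holds
    // the recipient's connection actually delivers it.
    deliverLocally(msg) {
        const socket = this.localClients.get(msg.to);
        if (socket) socket.send(JSON.stringify(msg));
    }
}
```

User A on server 1 reaches user B on server 2 because server 1 publishes and server 2, which holds B's connection, delivers. Neither server needs to know the other exists.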

Sticky sessions keep users on the same server, simplifying routing but creating hotspots.

Connection limits matter. Know how many concurrent connections your infrastructure handles. Plan accordingly.

Security

Authentication happens at handshake (cookies work) but authorization must happen on every message. Clients lie.

Input validation is non-negotiable. Every message is untrusted input.
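Per-message validation and authorization might look like the sketch below. The permissions table, the message shape ({ type, room, text }), the 2000-character cap, and the function name handleIncoming are all hypothetical; the point is that the user identity comes from the authenticated handshake, never from the message body:

```javascript
// Hypothetical permission table: which rooms each user may post to.
const permissions = new Map([
    ['alice', new Set(['general', 'staff'])],
    ['bob', new Set(['general'])],
]);

// Validate and authorize one incoming message.
// `userId` was established at handshake time; `raw` is untrusted text.
function handleIncoming(userId, raw) {
    let msg;
    try {
        msg = JSON.parse(raw);
    } catch {
        return { ok: false, error: 'invalid JSON' };
    }

    // Shape check: reject anything that is not exactly what we expect.
    if (msg === null || typeof msg !== 'object' || Array.isArray(msg) ||
        msg.type !== 'chat' ||
        typeof msg.room !== 'string' || typeof msg.text !== 'string') {
        return { ok: false, error: 'invalid shape' };
    }
    if (msg.text.length === 0 || msg.text.length > 2000) {
        return { ok: false, error: 'invalid text length' };
    }

    // Authorization check on every message, not just at connect time.
    const allowedRooms = permissions.get(userId);
    if (!allowedRooms || !allowedRooms.has(msg.room)) {
        return { ok: false, error: 'forbidden' };
    }

    // Rebuild the message from validated fields; ignore anything extra,
    // including a client-supplied "from".
    return { ok: true, message: { from: userId, room: msg.room, text: msg.text } };
}
```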

Rate limiting prevents spam and abuse.

Message size limits prevent memory exhaustion.
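Rate and size limits can share one gatekeeper per connection. The sketch below uses a token bucket, one common rate-limiting technique among several; the 16 KiB limit, the class and function names, and the injectable `now` parameter (there to make the logic testable) are assumptions:

```javascript
const MAX_MESSAGE_BYTES = 16 * 1024;   // assumed limit; tune for your application

// Token bucket: up to `capacity` messages may burst, refilled at `refillPerSecond`.
class TokenBucket {
    constructor(capacity, refillPerSecond, now = Date.now()) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;
        this.lastRefill = now;
    }

    // Returns true and consumes a token if one is available.
    tryRemove(now = Date.now()) {
        const elapsedSeconds = (now - this.lastRefill) / 1000;
        this.tokens = Math.min(
            this.capacity,
            this.tokens + elapsedSeconds * this.refillPerSecond
        );
        this.lastRefill = now;
        if (this.tokens < 1) return false;
        this.tokens -= 1;
        return true;
    }
}

// Decide whether to accept one raw message from one connection.
function acceptMessage(bucket, raw, now = Date.now()) {
    if (Buffer.byteLength(raw) > MAX_MESSAGE_BYTES) return 'too-large';
    if (!bucket.tryRemove(now)) return 'rate-limited';
    return 'ok';
}
```

Create one bucket per connection at connect time. Checking size at this level is a second line of defense; ideally the library also caps frame size before the message is ever buffered.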

WSS (WebSocket Secure) is mandatory for production. Same TLS that protects HTTPS.

Origin verification during handshake prevents cross-site WebSocket hijacking.
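Origin verification is a string comparison against an allowlist, done before the upgrade is accepted. In this sketch the allowlist entries and the function name isOriginAllowed are hypothetical, and `headers` is assumed to be the handshake request's header object with lower-cased names, as Node's http module provides it:

```javascript
// Hypothetical allowlist; replace with the origins that serve your own pages.
const ALLOWED_ORIGINS = new Set([
    'https://example.com',
    'https://app.example.com',
]);

// Return true only if the handshake carries an Origin header we recognize.
function isOriginAllowed(headers) {
    const origin = headers.origin;
    return typeof origin === 'string' && ALLOWED_ORIGINS.has(origin);
}

console.log(isOriginAllowed({ origin: 'https://example.com' }));   // prints true
console.log(isOriginAllowed({ origin: 'https://evil.example' }));  // prints false
```

This version also rejects requests with no Origin header at all. Browsers always send one, but non-browser clients may not, so relax that check deliberately if you need to support them.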

When Connections Break

They will break. Networks fail. Servers restart. Mobile devices sleep.

Heartbeats detect dead connections. No pong response? Connection is gone.
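A common way to implement server-side heartbeats is a periodic sweep: ping everyone, and on the next sweep drop whoever never answered. The sketch below assumes each connection object exposes ping() and terminate() and carries an isAlive flag that your pong handler sets back to true; the function name heartbeatSweep is illustrative:

```javascript
// One heartbeat sweep over a collection of connections.
function heartbeatSweep(connections) {
    for (const conn of connections) {
        if (conn.isAlive === false) {
            conn.terminate();      // no pong arrived since the previous sweep
            continue;
        }
        conn.isAlive = false;      // a pong must flip this back before the next sweep
        conn.ping();
    }
}
```

Run the sweep on a timer, for example every 30 seconds, and set conn.isAlive = true both when a connection opens and whenever a pong arrives. The interval is a trade-off: shorter detects dead connections sooner but costs more traffic.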

Auto-reconnect with exponential backoff. Don't hammer a struggling server.
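Exponential backoff is a small piece of arithmetic: double the delay after each failed attempt, cap it, and add randomness so that a thousand clients do not all reconnect in the same instant. A sketch, with assumed defaults (500 ms base, 30 s cap) and hypothetical function names:

```javascript
// Delay in milliseconds before reconnect attempt number `attempt` (0-based).
// The ceiling doubles each attempt up to `maxMs`; the actual delay is a random
// point below that ceiling ("full jitter").
function backoffDelay(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
    const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
    return Math.floor(random() * ceiling);
}

// Reconnect loop using the browser WebSocket API.
function connectWithRetry(url, attempt = 0) {
    const socket = new WebSocket(url);
    socket.addEventListener('open', () => {
        attempt = 0;   // a successful connection resets the backoff
    });
    socket.addEventListener('close', () => {
        setTimeout(() => connectWithRetry(url, attempt + 1), backoffDelay(attempt));
    });
    return socket;
}
```

Listening for close is enough here, because a failed connection fires error followed by close.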

Message IDs or sequence numbers detect gaps. Resend what was lost.

Resume protocols let clients catch up on missed messages after brief disconnections.
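Sequence numbers and resume fit together naturally: the server numbers every message and keeps a bounded history, and a reconnecting client reports the last number it saw. The sketch below is one possible design, not a standard protocol; ReplayBuffer, hasGap, and the default capacity of 1000 are assumptions:

```javascript
// Server side: remember recent messages so a reconnecting client can catch up.
class ReplayBuffer {
    constructor(capacity = 1000) {
        this.capacity = capacity;
        this.messages = [];     // [{ seq, payload }], oldest first
        this.nextSeq = 1;
    }

    // Assign the next sequence number and store the message.
    append(payload) {
        const entry = { seq: this.nextSeq++, payload };
        this.messages.push(entry);
        if (this.messages.length > this.capacity) this.messages.shift();
        return entry;
    }

    // Everything after `lastSeq`, or null if part of the gap has already been
    // evicted and the client needs a full resync instead.
    since(lastSeq) {
        const oldest = this.messages.length ? this.messages[0].seq : this.nextSeq;
        if (lastSeq + 1 < oldest) return null;
        return this.messages.filter((m) => m.seq > lastSeq);
    }
}

// Client side: a jump in sequence numbers means something was missed.
function hasGap(lastSeenSeq, incomingSeq) {
    return incomingSeq > lastSeenSeq + 1;
}
```

On reconnect the client sends its last seen sequence number; the server replies with since(lastSeq), or tells the client to reload its state when that returns null.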

When Not to Use WebSockets

Server-Sent Events handle unidirectional server-to-client streaming more simply. If clients only receive, consider SSE.

Regular HTTP works fine for request-response patterns. Don't add WebSocket complexity for operations that don't need real-time.

WebTransport (HTTP/3-based) is a newer option for applications that need WebSocket-like capabilities plus multiple independent streams and unreliable datagrams. Check browser and server support before committing to it.

WebSockets shine when both sides need to talk, when latency matters, when real-time isn't optional. They're the protocol that finally lets servers start conversations.
