Port 5060 is where conversations begin. Not the words themselves, but the invitation to speak. When you place a call on your smartphone, when you join a video conference, when someone halfway around the world picks up their phone and hears your voice, the conversation started here, with a plain-text message that reads like a polite request: "INVITE."

The Session Initiation Protocol doesn't carry your voice. It carries the handshake. The "hello, are you there?" The "yes, I'm listening." The "goodbye." SIP is the protocol that taught the Internet how to ask someone if they want to talk.

What SIP Does

SIP is a signaling protocol. It creates, modifies, and terminates communication sessions between participants. When you call someone, SIP handles everything except the actual audio and video: finding the other person, negotiating how you'll communicate, setting up the connection, and cleanly ending it when one of you hangs up.

The actual voice travels over RTP (Real-time Transport Protocol). SIP just makes the introduction.

Think of it like this: SIP is the maitre d' at a restaurant. It seats you at the table, makes sure both parties are ready, and steps away once the conversation begins. The conversation itself happens without SIP's involvement, but SIP made it possible.

How SIP Works

SIP is a text-based protocol, deliberately modeled on HTTP.[1] You can read a SIP message and understand what it is doing. The designers wanted the protocol to feel like the web, not like the telephone network.
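Because the format is plain text, a few lines of string handling are enough to pull a message apart. What follows is a minimal sketch, not a compliant parser: `parse_sip_message` is a hypothetical helper, the sample request is a shortened INVITE, and the code ignores header folding, compact header names, and repeated headers.

```python
def parse_sip_message(raw):
    """Split a SIP message into its start line, header list, and body."""
    # Headers and body are separated by an empty line, as in HTTP.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        # Split on the first colon only; values such as URIs contain colons too.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


raw = (
    "INVITE sip:bob@biloxi.example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.atlanta.example.com:5060\r\n"
    "From: Alice <sip:alice@atlanta.example.com>;tag=9fxced76sl\r\n"
    "To: Bob <sip:bob@biloxi.example.com>\r\n"
    "Call-ID: 3848276298220188511@atlanta.example.com\r\n"
    "\r\n"
)

start_line, headers, body = parse_sip_message(raw)
method, request_uri, version = start_line.split(" ")
print(method, request_uri, version)
print(dict(headers)["Call-ID"])
```

The request line has the same three-part shape as an HTTP request line: a method, a target, and a protocol version.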

A basic call works like this:

1. INVITE — Alice's phone sends an INVITE to Bob's phone. This message contains Alice's address, what kind of media she wants to use (audio, video), and the codecs she supports.

INVITE sip:bob@biloxi.example.com SIP/2.0
Via: SIP/2.0/UDP client.atlanta.example.com:5060;branch=z9hG4bK776asdhds
Max-Forwards: 70
From: Alice <sip:alice@atlanta.example.com>;tag=9fxced76sl
To: Bob <sip:bob@biloxi.example.com>
Call-ID: 3848276298220188511@atlanta.example.com
CSeq: 1 INVITE
Contact: <sip:alice@client.atlanta.example.com>

2. 100 Trying — The network acknowledges it received the request and is working on it.

3. 180 Ringing — Bob's phone starts ringing. Alice hears a ringback tone.

4. 200 OK — Bob answers. His phone sends back acceptance of the call along with his own media capabilities.

5. ACK — Alice's phone acknowledges the 200 OK. The signaling handshake is complete.

6. Media flows — Now RTP takes over, carrying the actual voice between Alice and Bob. SIP's job is done until someone hangs up.

7. BYE — When Bob hangs up, his phone sends a BYE request. Alice's phone responds with 200 OK. The call is terminated.
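The codec exchange in steps 1 and 4 can be sketched as a simple intersection. This is an illustration only: in a real call the media description travels as an SDP body and follows the offer/answer model of RFC 3264, and `negotiate_codecs` plus the codec lists below are hypothetical.

```python
def negotiate_codecs(offered, supported):
    """Return the offered codecs the answerer also supports, in the offerer's order."""
    supported_set = set(supported)
    return [codec for codec in offered if codec in supported_set]


# Assumed capabilities for the two phones, in order of preference.
alice_offer = ["opus", "G722", "PCMU"]
bob_supported = ["PCMU", "PCMA", "G722"]

common = negotiate_codecs(alice_offer, bob_supported)
print(common)  # ['G722', 'PCMU']
```

If the two lists share nothing, there is no way to carry the media, and the called side would typically reject the INVITE (for example with 488 Not Acceptable Here) rather than answer with 200 OK.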

The entire exchange reads like a conversation about having a conversation. INVITE. Are you there? 200 OK. Yes, let's talk. BYE. I'm done. 200 OK. Understood.
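That back-and-forth can be modeled as a small state machine. The sketch below follows the seven steps above literally; the state names and the `run_call` helper are invented for illustration and are much simpler than the transaction state machines in RFC 3261, which also cover retransmissions, timeouts, and calls that skip the provisional responses.

```python
from enum import Enum


class CallState(Enum):
    IDLE = "idle"
    INVITING = "inviting"
    TRYING = "trying"
    RINGING = "ringing"
    ANSWERED = "answered"
    ESTABLISHED = "established"
    CLOSING = "closing"
    TERMINATED = "terminated"


# (current state, message seen) -> next state, following the basic call above.
TRANSITIONS = {
    (CallState.IDLE, "INVITE"): CallState.INVITING,
    (CallState.INVITING, "100 Trying"): CallState.TRYING,
    (CallState.TRYING, "180 Ringing"): CallState.RINGING,
    (CallState.RINGING, "200 OK"): CallState.ANSWERED,
    (CallState.ANSWERED, "ACK"): CallState.ESTABLISHED,  # media flows from here
    (CallState.ESTABLISHED, "BYE"): CallState.CLOSING,
    (CallState.CLOSING, "200 OK"): CallState.TERMINATED,
}


def run_call(events):
    """Walk the message sequence and return the final call state."""
    state = CallState.IDLE
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {event!r} in state {state.name}")
        state = TRANSITIONS[key]
    return state


CALL_FLOW = ["INVITE", "100 Trying", "180 Ringing", "200 OK", "ACK", "BYE", "200 OK"]
print(run_call(CALL_FLOW).name)  # TERMINATED
```

Note that the same "200 OK" text means different things depending on state: the first accepts the INVITE, the second confirms the BYE.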

The Story Behind SIP

In the mid-1990s, the telecommunications world faced a choice. Video conferencing and Internet telephony were emerging, and two camps formed around how to make them work.

The International Telecommunication Union (ITU), which has coordinated international telecommunications since 1865, more than a decade before Alexander Graham Bell patented the telephone, developed H.323. It was based on ISDN signaling, wrapped in IP. It was complex, binary, and felt like the telephone network translated into data packets.[2]

Meanwhile, at Columbia University, a researcher named Henning Schulzrinne had a different vision. Together with Mark Handley at ACIRI, Eve Schooler at Caltech, and Jonathan Rosenberg at Bell Labs, he designed SIP starting in 1996.[3] They wanted a protocol that felt like the Internet, not like the telephone company.

SIP was intentionally simple. Where H.323's specifications ran to 736 pages, SIP and its related protocols totaled 128.[4] Where H.323 used binary encoding, SIP used plain text you could read in a packet capture. Where H.323 assumed a world of telephone switches and gatekeepers, SIP assumed a world of URIs and web servers.

The first specification was published as RFC 2543 in March 1999.[5] The current specification, which keeps the SIP/2.0 version string, was standardized as RFC 3261 in June 2002.[6]

SIP won. Not because it was more powerful, but because it was more adaptable. Developers could read it, debug it, extend it. It could run on the same infrastructure as web services. It could evolve with the Internet instead of fighting against it.

In November 2000, the 3GPP (the organization that defines mobile network standards) adopted SIP as the signaling protocol for the IP Multimedia Subsystem (IMS).[7] This decision meant that SIP would become the protocol for mobile voice calls worldwide. Every VoLTE call on your smartphone, every 5G voice call, uses SIP.

Henning Schulzrinne went on to become CTO of the FCC from 2011 to 2014.[8] The protocols he helped create now carry billions of calls daily.

The Protocol's Elegance

Every SIP request must carry six header fields: To, From, CSeq, Call-ID, Max-Forwards, and Via.[9] Requests that can establish a dialog, such as INVITE, must also carry Contact.

Call-ID uniquely identifies a conversation. Every message in a call, from INVITE to BYE, carries the same Call-ID so devices know they're part of the same session.
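A packet capture from a busy server interleaves many calls, and Call-ID is what lets you untangle them. The sketch below groups already-parsed messages by that header; the `group_by_call_id` helper, the message dictionaries, and the Call-ID values are made up for illustration. Strictly speaking, RFC 3261 identifies a dialog by the Call-ID together with the From and To tags, which this sketch leaves out.

```python
from collections import defaultdict


def group_by_call_id(messages):
    """Collect message summaries per Call-ID, preserving arrival order."""
    sessions = defaultdict(list)
    for message in messages:
        sessions[message["Call-ID"]].append(message["summary"])
    return dict(sessions)


# Two interleaved calls, as they might appear in a capture.
messages = [
    {"Call-ID": "a1@atlanta.example.com", "summary": "INVITE"},
    {"Call-ID": "b2@atlanta.example.com", "summary": "INVITE"},
    {"Call-ID": "a1@atlanta.example.com", "summary": "200 OK"},
    {"Call-ID": "a1@atlanta.example.com", "summary": "BYE"},
]

sessions = group_by_call_id(messages)
print(sessions["a1@atlanta.example.com"])  # ['INVITE', '200 OK', 'BYE']
```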

Via records the route a message took. Each proxy that touches the message adds its address to the Via header. Responses follow the Via headers back like breadcrumbs, ensuring they reach the original sender even through complex network topologies.
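The breadcrumb behavior is easiest to see as a stack. In this rough sketch, `add_via` and `response_path` are hypothetical helpers and the proxy hostnames are invented; real Via values also carry the transport and a branch parameter, which are omitted here.

```python
def add_via(vias, hop):
    """A hop puts its own address on top of the Via list before forwarding a request."""
    return [hop] + vias


def response_path(vias):
    """Return the hops a response visits, in order."""
    path = []
    remaining = list(vias)
    while remaining:
        path.append(remaining[0])   # the response is delivered to the topmost Via
        remaining = remaining[1:]   # that Via is consumed before the next hop is chosen
    return path


# The request travels from Alice's client through two proxies toward Bob.
vias = []
for hop in [
    "client.atlanta.example.com",
    "proxy.atlanta.example.com",
    "proxy.biloxi.example.com",
]:
    vias = add_via(vias, hop)

print(vias)
print(response_path(vias))
```

The last hop to touch the request sits on top of the stack, so it is the first to see the response, and the response retraces the request's route in reverse.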

From and To identify the caller and the called, using SIP URIs that look like email addresses: sip:alice@atlanta.example.com.
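The email-like shape makes these URIs easy to take apart. Here is a minimal sketch under several assumptions: `parse_sip_uri` is a hypothetical helper, it drops URI parameters, it does not handle IPv6 literals or embedded headers, and it falls back to the default ports (5060 for sip, 5061 for sips) without the DNS lookups a real client would perform.

```python
def parse_sip_uri(uri):
    """Split a sip: or sips: URI into scheme, user, host, and port."""
    scheme, _, rest = uri.partition(":")
    if scheme not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {uri!r}")
    rest = rest.split(";", 1)[0]             # drop parameters such as ;transport=tcp
    user, _, hostport = rest.rpartition("@")  # user part is optional
    host, _, port = hostport.partition(":")
    default_port = 5061 if scheme == "sips" else 5060
    return {
        "scheme": scheme,
        "user": user or None,
        "host": host,
        "port": int(port) if port else default_port,
    }


alice = parse_sip_uri("sip:alice@atlanta.example.com")
print(alice["user"], alice["host"], alice["port"])  # alice atlanta.example.com 5060
```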

This design, stolen shamelessly from HTTP and email, made SIP feel familiar to web developers. The Internet Engineering Task Force (IETF), which standardized SIP, also maintains HTTP and DNS. SIP was designed by the same community that built the web.[10]

Security Considerations

Port 5060 is a target.

SIP traffic on port 5060 is unencrypted by default. Your call setup messages, including caller IDs, phone numbers, and network information, travel in plaintext. Anyone on the network path can read them.[11]

The attack pattern is well documented:[12]

  1. Scanning — Attackers sweep IP ranges looking for devices responding on port 5060
  2. Enumeration — Once found, they probe the SIP server to discover extensions and configuration
  3. Brute force — They attempt to guess passwords on discovered extensions
  4. Abuse — With access, they make unauthorized calls, often to premium rate numbers, generating revenue for the attacker and bills for the victim
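From the defender's side, the brute-force stage tends to stand out in logs as one source piling up authentication failures. The sketch below counts them per address. It assumes you already have (source IP, response status) pairs for registration attempts; `find_brute_force_sources`, the threshold, and the sample addresses are illustrative only. A single 401 is a normal part of SIP digest authentication, which is why the check uses a threshold rather than flagging every failure.

```python
from collections import Counter


def find_brute_force_sources(events, threshold=5):
    """Return source IPs with at least `threshold` failed authentication responses."""
    failures = Counter(ip for ip, status in events if status in (401, 403))
    return sorted(ip for ip, count in failures.items() if count >= threshold)


# Assumed log data: one noisy source and one ordinary phone registering.
events = [("203.0.113.7", 401)] * 20 + [
    ("198.51.100.4", 401),  # normal challenge
    ("198.51.100.4", 200),  # successful retry with credentials
]

print(find_brute_force_sources(events))  # ['203.0.113.7']
```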

Port 5060 has consistently ranked among the ten most targeted ports in honeypot research.[13] Industry estimates put the annual cost of VoIP toll fraud in the billions of dollars.

The defenses exist:

Port 5061 uses TLS encryption for SIP signaling. This is the secure version, but it's not enabled by default on most systems.[14]

SRTP (Secure Real-time Transport Protocol) encrypts the actual audio. TLS protects signaling; SRTP protects voice. You need both for true security.[15]

Strong passwords matter more than usual. SIP brute force attacks are automated and relentless. Twelve characters minimum, with complexity.

Access control lists limit who can reach your SIP infrastructure. If your VoIP system only needs to talk to your carrier, don't let the entire Internet reach port 5060.
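The allowlist idea is simple enough to express in a few lines. This is a sketch of the logic only: the carrier range below is a made-up documentation prefix, `is_allowed` is a hypothetical helper, and in practice the rule belongs in a firewall or session border controller rather than in application code.

```python
import ipaddress

# Hypothetical carrier signaling range; replace with the ranges your carrier publishes.
ALLOWED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]


def is_allowed(source_ip):
    """Return True if the source address falls inside an allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


print(is_allowed("198.51.100.20"))  # True
print(is_allowed("203.0.113.9"))    # False
```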

Most VoIP systems ship insecure by default. Encryption must be explicitly enabled. This is a design failure the industry has not corrected.

SIP Today

SIP is everywhere, hiding in plain sight.

Every time you make a phone call on a modern smartphone, you're using SIP. VoLTE (Voice over LTE) uses SIP over the IMS infrastructure your carrier operates.[16] When 5G networks carry voice (VoNR, Voice over New Radio), they still use SIP.[17]

When enterprises connect their phone systems to carriers, they use SIP trunks. When video conferencing platforms set up calls, many use SIP under the hood. When WebRTC applications in your browser need to connect to traditional phone networks, they go through SIP gateways.[18]

The protocol designed in the 1990s by researchers who wanted the Internet to have its own way of making calls now sets up a large share of the world's voice calls.

| Port | Protocol | Relationship |
|---|---|---|
| 5061 | SIP-TLS | Encrypted SIP signaling |
| 5060 | SIP | Unencrypted signaling (this port) |
| 1720 | H.323 | The competitor SIP defeated |
| 16384-32767 | RTP | Dynamic ports for actual voice/video media |

A Final Thought

There's something beautiful about a protocol that says "INVITE" when it wants to start a conversation. Not CONNECT. Not ESTABLISH_SESSION. INVITE. Like a human asking another human if they'd like to talk.

And when the conversation ends, it says "BYE."

Port 5060 carries these small courtesies, billions of times a day, setting up the conversations that connect us. The voice itself travels elsewhere, but the invitation, the acceptance, and the farewell pass through here.

Every call begins with an INVITE. Every call ends with a BYE. The protocol that carries human connection was designed with human words.
