Session Initiation Protocol (SIP)
Session Initiation Protocol (SIP) was designed from the ground up to connect people and devices, whenever and wherever they are, so that they can engage in a (possibly lengthy) exchange of information. Existing protocols, such as HTTP and SMTP, were not purpose-built for this essential human activity, and so SIP was born to fill the gap. However, SIP borrows heavily from both: it uses HTTP's message exchange pattern, message format and encoding, and a URI scheme resembling SMTP's email addresses.
In 2002 a revised version of the SIP standard was published through the Internet Engineering Task Force's (IETF) standardisation process as RFC3261. Because of the open nature of the IETF standards process, and because SIP is text based and shares many features with existing specifications, it has been readily understood, extended and implemented.
Since its emergence SIP has gained traction as a facilitator of instant messaging (e.g. Windows Messenger) and VoIP (e.g. most well-known platforms apart from Skype).
Entities
A SIP environment consists of a number of connected entities.
- A User Agent (UA) is the entity which represents an end user in a client device. It usually operates in two modes: a User Agent Client (UAC) sends the initial request messages and processes responses; and a User Agent Server (UAS) accepts requests and sends responses.
- Proxy Servers are involved in routing the SIP messages to the correct endpoint. Stateful proxies sometimes make use of User Agents in a logical entity called a Back-to-Back User Agent (B2BUA).
- Redirect Servers provide a new address or different route path to the recipient. The server may make use of a Location Server to persist location information.
- A Registrar acts as the current repository of a client's attachment to the network.
It is the User Agent that tends to reside on the end user's device. The other entities provide essential support services in many scenarios.
Messages
SIP messages come in two flavours.
- Request: sent from a client to a server; it defines the operation sought by the client.
- Response: sent from a server to a client; it provides the status of that request.
Request
A SIP request is characterised by its method, much as in HTTP. The method is considered a 'verb', since it requests an action to be performed by another User Agent or server. RFC3261 defines six methods (the first six in Table 1), with subsequent standards defining the extension methods (from INFO onwards).
Table 1. SIP Methods.
Method | Description
INVITE | Used to set up a SIP session. Session parameters are negotiated.
REGISTER | Authenticates the User Agent and provides a current location to the network.
BYE | Terminates an open session.
ACK | Confirms a success response to an INVITE. The third part of a three-way handshake.
CANCEL | Cancels a pending request. BYE should be used to tear down an established session.
OPTIONS | Queries the capabilities of correspondents.
Extension Methods |
INFO | Provides mid-call session-related information. It is rarely used.
MESSAGE | Used to transfer Instant Messages.
NOTIFY | Publishes the outcome of events. Used in combination with SUBSCRIBE requests.
PRACK | A Provisional Response ACKnowledgment. Confirms receipt of a provisional response.
PUBLISH | Publishes status information. Used for Instant Messaging presence services.
REFER | Mechanism to pass a request to someone more appropriate to deal with it.
SUBSCRIBE | Used to request receipt of future NOTIFY requests.
UPDATE | Modifies session parameters in mid-call.
Responses
SIP Response messages are always sent in reply to a request. They convey status updates, confirmations, directions and error codes back to the UAC originating the request. Responses are characterised as either provisional or final and every response must be identified by a 3-digit code.
Response Types
Six classes of response have been defined and are categorised using the 3-digit code. The first five are borrowed from HTTP; the sixth is new to SIP.
Table 2. Response Classes.
Class | Description
1xx Provisional | Confirms receipt of the request; processing is continuing. Provisional responses to INVITEs are never ACKed.
2xx Success | The request was received, processed and accepted.
3xx Redirection | Provides location information or alternative services to try.
4xx Request Failure | The request contained an error or cannot be processed by the server.
5xx Server Failure | The server is unable to fulfil the request because of an internal error.
6xx Global Failure | No service can be found to fulfil the request.
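The mapping from status code to response class can be sketched in a few lines. This is a minimal illustration in Python, not part of any SIP library; the function names `response_class` and `is_final` are made up for this example.

```python
# Class names keyed by the first digit of the status code (see Table 2).
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Failure",
    6: "Global Failure",
}


def response_class(status_code: int) -> str:
    """Return the class name for a 3-digit SIP status code."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a valid SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


def is_final(status_code: int) -> bool:
    """1xx responses are provisional; every other class is final."""
    return response_class(status_code) != "Provisional"
```

For instance, `response_class(180)` gives "Provisional", while `is_final(486)` is true because a 4xx response ends the request.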
Within each class, numerous response codes have been predetermined - some copied from HTTP.
Table 3. Sample Response codes.
# | Reason Phrase | Description
100 | Trying | The next hop received the request.
180 | Ringing | Attempting to alert the user.
182 | Queued | Temporarily unavailable and request is in a queue (not rejected).
200 | OK | The request has succeeded.
301 | Moved Permanently | User is no longer available at the address given in the Request URI.
302 | Moved Temporarily | Retry the request at a new address given in the Contact header.
400 | Bad Request | The request could not be understood or processed correctly.
401 | Unauthorized | The request either failed authentication or needs more information.
403 | Forbidden | The server is refusing to process the request. Do not retry.
404 | Not Found | The server cannot identify the user in its domain.
408 | Request Timeout | The server could not process the request in a reasonable time.
415 | Unsupported Media Type | The format is not supported by the server.
480 | Temporarily Unavailable | The called party is currently unavailable.
485 | Ambiguous | The Request URI is ambiguous.
486 | Busy Here | The called party is currently not willing or able to take the call.
500 | Server Internal Error | The server encountered an unexpected condition.
513 | Message Too Large | The message length exceeded a determined limit.
603 | Decline | The user explicitly refused to accept the request.
Warning Header Field
The Warning header field is used to carry additional information about the status of the response. The header value consists of a 3-digit code between 300 and 399, the host name and a warning text.
Warning: 307 isi.edu "Session parameter 'foo' not understood"
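As a rough sketch, the three parts of a Warning value can be pulled apart with a regular expression. The Python below is only illustrative: `parse_warning` is a hypothetical helper, and it assumes a single warning value with a simple quoted text (no escaped quotes).

```python
import re

# code, then the host (warning agent), then the quoted warning text
WARNING_RE = re.compile(r'^(\d{3}) (\S+) "(.*)"$')


def parse_warning(value: str):
    """Split a Warning header value into (code, agent, text)."""
    match = WARNING_RE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed Warning value: {value!r}")
    code = int(match.group(1))
    if not 300 <= code <= 399:
        raise ValueError(f"warning code out of range: {code}")
    return code, match.group(2), match.group(3)


example = parse_warning("307 isi.edu \"Session parameter 'foo' not understood\"")
```

Here `example` holds the code 307, the host `isi.edu` and the text without its surrounding quotes.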
Anatomy of a Message
Each SIP message begins with a Start-Line, followed by a sequence of headers, which are separated from the message body by an empty line, i.e. a lone carriage-return line-feed sequence (CRLF).
- Start-Line: formatted as a Request-Line for Requests or a Status-Line for Responses.
- Headers: named attributes that provide additional information about the message.
- Separator Line: an empty line between the headers and the body.
- Body: binary or textual payload. Typically Session Description Protocol (SDP) or a message text.
The start line, each header line and the separator line are each terminated by a [CRLF] sequence.
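The layout described above can be illustrated by splitting a raw message at the first empty line. This is a minimal Python sketch, not a full parser: `split_message` is a hypothetical name, the address `bob@example.com` is a placeholder, and folded header lines are left as they are.

```python
def split_message(raw: bytes):
    """Split a raw SIP message into (start_line, header_lines, body)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no empty line between headers and body")
    start_line, *header_lines = head.decode("utf-8").split("\r\n")
    return start_line, header_lines, body


# A small MESSAGE request with a plain-text body (placeholder address).
raw = (
    b"MESSAGE sip:bob@example.com SIP/2.0\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"Hello"
)
start_line, header_lines, body = split_message(raw)
```

The body is kept as bytes because it may be binary, whereas the start line and headers are decoded as text.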
Start Line
The start-line conveys the type of message and protocol version. For both Requests (Request-Line) and Responses (Status-Line), the start-line has three elements separated by spaces.
- Request-Line: Contains a method, URI and ends with the protocol version ("SIP/2.0").
INVITE sip:[email protected] SIP/2.0
- Status-Line: Starts with the protocol version, followed by a numeric status code, and is completed with a short textual reason phrase.
SIP/2.0 200 OK
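Because the protocol version comes last in a Request-Line and first in a Status-Line, the two can be told apart by inspecting the three elements. The following Python sketch assumes SIP/2.0 only; `parse_start_line` and the dictionary keys are hypothetical names, and the address is a placeholder.

```python
SIP_VERSION = "SIP/2.0"


def parse_start_line(line: str) -> dict:
    """Classify a start-line as a Request-Line or a Status-Line."""
    # Split on the first two spaces only: a reason phrase may contain spaces.
    first, second, third = line.rstrip("\r\n").split(" ", 2)
    if first == SIP_VERSION:
        return {"kind": "response", "status": int(second), "reason": third}
    if third == SIP_VERSION:
        return {"kind": "request", "method": first, "uri": second}
    raise ValueError(f"not a SIP/2.0 start-line: {line!r}")


request = parse_start_line("INVITE sip:bob@example.com SIP/2.0")
response = parse_start_line("SIP/2.0 486 Busy Here")
```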
Headers
Headers follow the same generic header format as HTTP. Each consists of a case-insensitive, ASCII-encoded name and a colon, followed by a value which is sometimes UTF-8 encoded and usually case-sensitive. Each header can have one or more semicolon-separated parameters appended to the value, providing additional tags and features.
header-name: header-value(;parameter-name=parameter-value)*[CRLF]
Each header can be split across several lines using a [CRLF][TAB or SPACE] sequence (known as folding). What's more, multiple headers with the same name (e.g. Contact) can appear on separate lines or be placed on the same line separated by commas. For example:
Contact: <sip:[email protected]>
Contact: <sip:[email protected]>
Can be represented as:
Contact: <sip:[email protected]>, <sip:[email protected]>
Or using folding:
Contact: <sip:[email protected]>,
<sip:[email protected]>
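The equivalence of the three forms above can be checked with a small sketch that unfolds continuation lines and then collects values per header name. This Python is a simplification under stated assumptions: `parse_headers` is a hypothetical helper, the addresses are placeholders, and the naive comma split would break on headers whose values legitimately contain commas (for example inside quoted strings).

```python
import re


def parse_headers(block: str) -> dict:
    """Map lower-cased header names to the list of their values."""
    # Unfold: a CRLF followed by spaces or tabs continues the previous line.
    unfolded = re.sub(r"\r\n[ \t]+", " ", block)
    headers = {}
    for line in unfolded.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        values = [part.strip() for part in value.split(",") if part.strip()]
        headers.setdefault(name.strip().lower(), []).extend(values)
    return headers


separate = (
    "Contact: <sip:alice@pc1.example.com>\r\n"
    "Contact: <sip:alice@pc2.example.com>\r\n"
)
combined = "Contact: <sip:alice@pc1.example.com>, <sip:alice@pc2.example.com>\r\n"
folded = "Contact: <sip:alice@pc1.example.com>,\r\n <sip:alice@pc2.example.com>\r\n"
```

All three inputs produce the same dictionary, with a single `contact` key holding both addresses in order.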
Body
A message body describes the session (using SDP) or contains opaque text or binary body parts containing the payload related to the session (e.g. MIME or Message formats). Bodies can appear in request or response messages.
Header Fields
In the following examples, Alice makes a call to Bob using his SIP URI, 'sip:[email protected]'. Bob answers Alice with a success Response. The messages are an example of an INVITE request containing an SDP message, answered with a "200 OK" response.
Note: All the code examples are fairly language agnostic. A list of recommended Java, .NET and C++ APIs is given at the end for those interested in exploring SIP further. All the code should work with any of them with little adaptation (I happen to have used Konnetic's C# SIP API in this case).
Note: Alice calling Bob is SIP's equivalent of the archetypal Hello World application.
Creating the Request Message
- Add the Request Line, which indicates the message is an INVITE request to 'sip:[email protected]'.
Invite invite = new Invite(new SipUri("sip:[email protected]"));
- Create the Via header, which indicates the return path to the recipient.
invite.ViaHeaders.Add(new ViaHeaderField("122.181.8.8:11506",
SipTransportProtocol.Udp));
- Create the addresses of the sender and recipient. The SipUris can be an IP address, but Fully Qualified Domain Names are recommended. Display names are possible. For security reasons the From header is allowed to be anonymous if desired.
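// Alice is the caller, so she appears in the From header.
invite.From.Uri = new SipUri("sip:[email protected]");
invite.From.DisplayName = "Alice";
invite.From.Tag = "769122";
// Bob is the called party, so he appears in the To header.
invite.To.Uri = new SipUri("sip:[email protected]");
invite.To.DisplayName = "Bob";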
- Create the unique identifiers for the call and the conversation. The CallId is a unique value for the session. The sequence number is incremented in subsequent Requests. Together, the Call-ID and the tags on the To and From headers provide a unique key for a call.
invite.CallId.CallId = "[email protected]";
invite.CSeq.Sequence = 3434534;
invite.CSeq.Method = SipMethod.Invite;
- Create the alternate contact information for the sender.
invite.ContactHeaders.Add(new ContactHeaderField(
new SipUri("sip:[email protected]")));
- Finally add the content definitions. We will omit the content in this example.
invite.ContentType.MediaType = "application";
invite.ContentType.MediaSubType = "sdp";
invite.ContentLength = 136;
SIP request message.
The resulting SIP request message should look similar to the following:
INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP 122.181.8.8:11506
Max-Forwards: 70
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=769122
Call-ID: [email protected]
CSeq: 3434534 INVITE
Contact: <sip:[email protected]>
Content-Type: application/sdp
Content-Length: 136
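To show how such a message ends up on the wire, here is a hedged Python sketch of the serialisation step that a SIP library would normally perform. `serialize_request` is a hypothetical function, the addresses and Call-ID are placeholders, and the body is a stand-in fragment rather than a complete SDP description. Rather than hard-coding Content-Length, the sketch derives it from the body's byte count.

```python
def serialize_request(method: str, uri: str, headers, body: bytes = b"") -> bytes:
    """Render a request as Request-Line, headers, empty line and body."""
    lines = [f"{method} {uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Content-Length counts the bytes of the body, not its characters.
    lines.append(f"Content-Length: {len(body)}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8") + body


wire = serialize_request(
    "INVITE",
    "sip:bob@example.com",
    [
        ("Via", "SIP/2.0/UDP 192.0.2.10:5060"),
        ("Max-Forwards", "70"),
        ("To", "Bob <sip:bob@example.com>"),
        ("From", "Alice <sip:alice@example.com>;tag=769122"),
        ("Call-ID", "a84b4c76e66710"),
        ("CSeq", "3434534 INVITE"),
        ("Content-Type", "application/sdp"),
    ],
    body=b"v=0\r\n",
)
```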
Creating the Response Message
Recall that in this example Bob answers Alice with a success Response. The message is an example of a "200 OK" response.
- Add the Status Line, which indicates the request was a Success.
Response okMessage = new Response(StandardResponseCode.Ok);
- Copy over the Via header from the Request message.
okMessage.ViaHeaders.Add(new ViaHeaderField("122.181.8.8:11506",
SipTransportProtocol.Udp));
- Copy over the address information. The To and From fields stay the same as the original except for a tag parameter on the To field. They do not swap for the Response. You should think of them as the original To and From fields.
// From still identifies Alice, the original caller.
okMessage.From.Uri = new SipUri("sip:[email protected]");
okMessage.From.DisplayName = "Alice";
okMessage.From.Tag = "769122";
// To still identifies Bob; only the tag parameter is new.
okMessage.To.Uri = new SipUri("sip:[email protected]");
okMessage.To.DisplayName = "Bob";
okMessage.To.Tag = "abgj67";
- Copy over the identifier from the Request message.
okMessage.CallId.CallId = "[email protected]";
okMessage.CSeq.Sequence = 3434534;
okMessage.CSeq.Method = SipMethod.Invite;
- Add the alternate contact information. This time it is for Bob.
okMessage.ContactHeaders.Add(new ContactHeaderField(
new SipUri("sip:[email protected]")));
- Finally add the content definitions.
okMessage.ContentType.MediaType = "application";
okMessage.ContentType.MediaSubType = "sdp";
okMessage.ContentLength = 132;
I should note that a good library will provide APIs that automate much of the boilerplate code shown above. For example, the copying of fields from the Request to the Response would take place in the library.
SIP response message.
The resulting SIP response message should look similar to the following:
SIP/2.0 200 OK
Via: SIP/2.0/UDP 122.181.8.8:11506
To: Bob <sip:[email protected]>;tag=abgj67
From: Alice <sip:[email protected]>;tag=769122
Call-ID: [email protected]
CSeq: 3434534 INVITE
Contact: <sip:[email protected]>
Content-Type: application/sdp
Content-Length: 132
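The copying of fields from the request to the response, which a good library automates, might look roughly like the Python sketch below. It is an illustration only: `build_response` is a hypothetical helper, the addresses and Call-ID are placeholders, and the sketch leaves out the responder's own Contact header and the SDP body.

```python
# Headers that the response takes over unchanged from the request.
COPIED_HEADERS = {"via", "from", "to", "call-id", "cseq"}


def build_response(request_headers, status: int, reason: str, to_tag: str) -> str:
    """Build a bodiless response by copying the request's key headers."""
    lines = [f"SIP/2.0 {status} {reason}"]
    for name, value in request_headers:
        if name.lower() not in COPIED_HEADERS:
            continue
        # To and From are not swapped; the responder only adds a To tag.
        if name.lower() == "to" and ";tag=" not in value:
            value = f"{value};tag={to_tag}"
        lines.append(f"{name}: {value}")
    lines.append("Content-Length: 0")
    return "\r\n".join(lines) + "\r\n\r\n"


invite_headers = [
    ("Via", "SIP/2.0/UDP 192.0.2.10:5060"),
    ("Max-Forwards", "70"),
    ("To", "Bob <sip:bob@example.com>"),
    ("From", "Alice <sip:alice@example.com>;tag=769122"),
    ("Call-ID", "a84b4c76e66710"),
    ("CSeq", "3434534 INVITE"),
    ("Contact", "<sip:alice@192.0.2.10>"),
]
ok = build_response(invite_headers, 200, "OK", to_tag="abgj67")
```

Headers that belong to the request alone, such as Max-Forwards and the caller's Contact, are deliberately not carried over.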
Call Flow Example
This section details a call flow between the same two SIP User Agents as above, which could use the same message structures. The successful call shows the initial signalling, the establishment of the media session, and finally the termination of the call.
Figure 1. Alice completes a SIP call with Bob, exchanges media packets, then Bob terminates the call.
Session Setup
- Alice's UA sends an INVITE message to Bob's SIP address (i.e. 'sip:[email protected]'). The message contents are a Session Description Protocol message describing the expected media exchange.
- Bob's UA receives the INVITE and responds with a 100 Trying message.
- The UA then attempts to attract the attention of Bob, and simultaneously sends a 180 Ringing message to Alice.
- Bob responds and his UA sends a 200 OK message. The 200 OK contains the SDP message Bob is agreeing to.
- Finally, Alice's UA acknowledges receipt of the OK with an ACK request.
- Media streams are established directly between Alice and Bob.
We have now paved the way for another protocol (such as RTP) to transport the media data directly between Alice and Bob without further intervention from SIP. This transfer takes place in what is known as the Bearer or Transport Plane, with SIP acting within the Control or Signalling Plane.
Session Tear Down
At the end of the call, SIP is used to tear down the session. If, for example, it is Bob who ends the call the exchange would be as follows:
- Bob hangs up and his UA initiates a session termination by sending a BYE request to Alice.
- Alice's UA responds with a 200 OK.
In the above example Alice and Bob carry on a generic "media" exchange. However, this could easily represent a voice-call, video-conferencing or instant messaging session; the procedure would look exactly the same.
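The setup and tear-down exchange can be replayed as a small state machine. The Python below is a sketch for illustration: the state names (`idle`, `inviting`, `answered`, `established`, `closing`, `terminated`) are invented for this example and are not the transaction states defined in RFC3261.

```python
# (current state, message) -> next state
TRANSITIONS = {
    ("idle", "INVITE"): "inviting",
    ("inviting", "100 Trying"): "inviting",
    ("inviting", "180 Ringing"): "inviting",
    ("inviting", "200 OK"): "answered",
    ("answered", "ACK"): "established",
    ("established", "BYE"): "closing",
    ("closing", "200 OK"): "terminated",
}

# The exchange described in this section, as (sender, message) pairs.
CALL_FLOW = [
    ("Alice", "INVITE"),
    ("Bob", "100 Trying"),
    ("Bob", "180 Ringing"),
    ("Bob", "200 OK"),
    ("Alice", "ACK"),
    ("Bob", "BYE"),
    ("Alice", "200 OK"),
]


def replay(flow) -> str:
    """Replay the messages in order and return the final session state."""
    state = "idle"
    for sender, message in flow:
        try:
            state = TRANSITIONS[(state, message)]
        except KeyError:
            raise ValueError(f"{sender} sent {message!r} in state {state!r}") from None
    return state
```

Replaying only the first five messages leaves the session in the `established` state, which is the point at which media flows directly between the two parties.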
SIP Libraries and APIs
The list is not exhaustive; they are simply the best ones I've come across. A sensible criterion to use for a good SIP API is whether it provides support for strongly typed header fields; implements SIP transactions; and encapsulates most of the SIP URI internals.
Table 4. Recommended SIP stacks
Test Tools
Testing tools are essential, because any application that interacts with other SIP applications and servers must adhere strictly to the standard.
Table 5. Recommended SIP testing tools
Tool | Description
SIPp | A SIP traffic generator from HP.
PROTOS | Testing app from the University of Oulu, Finland.
ETSI TS 102 027-2 | List of SIP test call flows.
Further Reading
H. Schulzrinne. (2010) Henning Schulzrinne. [Online]. http://www.cs.columbia.edu/~hgs/
A. B. Johnston, SIP: Understanding the Session Initiation Protocol, 3rd ed. Boston, MA, USA: Artech House, 2007.
J. Rosenberg et al., "SIP: Session Initiation Protocol" RFC3261 2002.
IETF. (2010) Multiparty Multimedia Session Control (mmusic). [Online]. http://datatracker.ietf.org/wg/mmusic
IETF. (2010) Session Initiation Protocol (sip). [Online]. http://datatracker.ietf.org/wg/sip/
H. Schulzrinne et al., "RTP: A Transport Protocol for Real-Time Applications" RFC3550 2003.