Session Border Controllers and NAT Traversal

Introduction

Network Address Translators (NATs) are used to overcome the shortage of IPv4 addresses by hiding an enterprise’s or even an operator’s network behind one or a few IP addresses. The devices behind the NAT use private IP addresses that are not routable in the public Internet.

The Session Initiation Protocol (SIP) [3] has established itself as the de facto standard for voice over IP (VoIP) communication. In order to establish a call, a caller sends a SIP message that contains its own IP address. The callee is supposed to reply with a SIP message destined to the IP address included in the received SIP message. This will obviously not work if the caller is behind a NAT and is using a private IP address.

Probably the single biggest mistake in SIP design was ignoring the existence of NATs. This error stemmed from a belief among the IETF leadership that the IP address space would be exhausted so quickly that a global upgrade to IPv6 would become necessary and eliminate the need for NATs. The SIP standard therefore assumed that NATs do not exist, an assumption that turned out to be wrong. SIP simply did not work for the majority of Internet users, who are behind NATs. At the same time it became apparent that the standardization life cycle is slower than the market. Session Border Controllers (SBCs) were born and began to provide what the standards had failed to deliver: NAT traversal.

In this paper we first give a short introduction to SIP and then describe how session border controllers enable SIP calls to be established across NATs.

A Short Introduction to SIP

By the mid-nineties the IETF, which plays the role of the standards organization of the Internet, had already produced several of the protocols needed for IP-based telephony services. The Real-Time Transport Protocol (RTP) [1] enabled the exchange of audio and video data. The Session Description Protocol (SDP) [2] enabled the negotiation and description of the multimedia data to be used in a communication session.

The Session Initiation Protocol (SIP) [3] was the attempt of the IETF community to provide a signaling protocol that not only enables phone calls but can also be used for initiating any kind of communication session. Hence, SIP can be used for VoIP just as well as for setting up a gaming session or controlling a coffee machine.

The SIP specifications describe three types of components: user agents (UA), proxies and registrar servers. The UA can be the VoIP application used by the user, e.g., a VoIP phone or software application. A VoIP gateway, which enables VoIP users to communicate with users in the public switched telephone network (PSTN), and an application server, e.g., a multi-party conferencing server or a voicemail server, are also implemented as user agents.

The registrar server maintains a location database that binds the users’ VoIP addresses to their current IP addresses.

The proxy provides the routing logic of the VoIP service. When a proxy receives a SIP request from a user agent or another proxy it also conducts service specific logic, such as checking the user’s profile and whether the user is allowed to use the requested services. The proxy then either forwards the request to another proxy or to another user agent or rejects the request by sending a negative response.

With regard to the SIP messages we distinguish between requests and responses. The INVITE request is used to initiate a dialog between two users. A BYE request is used for terminating this dialog. Responses can either be final or provisional. Final responses can indicate that a request was successfully received and processed by the destination. Alternatively, a final response can indicate that the request could not be processed by the destination or by some proxy in between or that the session could not be established for some reason. Provisional responses indicate that the session establishment is in progress, e.g. the destination phone is ringing.
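The distinction between provisional and final responses can be read directly from the status code: in RFC 3261 [3], 1xx codes are provisional and 2xx to 6xx codes are final. The following is a minimal sketch of that rule; the function name and the returned labels are our own choices for illustration.

```python
def classify_response(status_code):
    """Classify a SIP status code by its RFC 3261 response class."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    if status_code < 200:
        return "provisional"           # e.g. 180 Ringing
    if status_code < 300:
        return "final (success)"       # e.g. 200 OK
    return "final (non-success)"       # 3xx redirection, 4xx/5xx/6xx failures


print(classify_response(180))  # provisional
print(classify_response(486))  # final (non-success), 486 is Busy Here
```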

In this paper we distinguish three types of SIP message exchanges, namely registrations, dialogs and out-of-dialog transactions.

A SIP registration enables a user agent to register its current address, IP address for example, at the registrar. This enables the registrar to establish a correlation between the user agent’s permanent address, e.g. sip:user@frafos.com, and the user agent’s current address. In order to keep this correlation up to date the user agent will have to repeatedly refresh the registration. The registrar will then delete a registration that is not refreshed for a while.
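The binding between permanent and current address, together with its expiry, can be pictured as a small lookup table. The sketch below is only an illustration under our own assumptions: the class name, the method names and the default lifetime of 3600 seconds are hypothetical, and a real registrar also handles authentication and multiple contacts per user.

```python
import time


class Registrar:
    """Toy location database: permanent address -> (current contact, expiry time)."""

    def __init__(self):
        self._bindings = {}

    def register(self, aor, contact, expires=3600, now=None):
        # Creating and refreshing a binding are the same operation.
        now = time.time() if now is None else now
        self._bindings[aor] = (contact, now + expires)

    def lookup(self, aor, now=None):
        now = time.time() if now is None else now
        entry = self._bindings.get(aor)
        if entry is None:
            return None
        contact, expiry = entry
        if now >= expiry:
            del self._bindings[aor]    # not refreshed in time: drop the binding
            return None
        return contact


registrar = Registrar()
registrar.register("sip:user@frafos.com", "sip:user@192.168.1.10:5060", expires=60, now=0)
print(registrar.lookup("sip:user@frafos.com", now=30))   # still registered
print(registrar.lookup("sip:user@frafos.com", now=61))   # None, binding expired
```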

A SIP dialog, a call for example, usually consists of a session initiation phase in which the caller generates an INVITE that is responded to with provisional and final responses. The session initiation phase is terminated with an ACK. A dialog is terminated with a BYE transaction. Depending on the call scenario the caller and callee might exchange a number of in-dialog requests such as reINVITEs or REFER.

The last type of SIP interaction consists of transactions that are not generated as part of a dialog. Such out-of-dialog messages can be observed when SUBSCRIBE and NOTIFY requests are exchanged between two SIP user agents, which is the case when a SIP node wants to be informed about a certain event. The node sends a SUBSCRIBE request to the server in charge of the event. Once the event occurs, the server sends a NOTIFY request carrying information about the event to the node. Other out-of-dialog SIP requests include OPTIONS and INFO, which are often used for exchanging information between SIP nodes or as an application-level heartbeat.

SBC and NAT Traversal

If a user agent is located behind a NAT, it will use a private IP address as its contact address in the Contact and Via headers as well as in the SDP part. This information is useless for anyone trying to contact this user agent from the public Internet.
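To make the problem concrete, the sketch below scans a SIP message for private IPv4 addresses in the Via and Contact headers and in the SDP connection line. It is a simplified illustration: the function name, the sample message and the line-based parsing are our own assumptions, and a real SIP parser has to deal with compact header forms, IPv6 and host names.

```python
import ipaddress
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def private_addresses(sip_message):
    """Return the private IPv4 addresses found in Via, Contact and SDP c= lines."""
    found = []
    for line in sip_message.splitlines():
        if line.startswith(("Via:", "Contact:", "c=")):
            for addr in IPV4.findall(line):
                if ipaddress.ip_address(addr).is_private:
                    found.append(addr)
    return found


# Hypothetical INVITE as it might leave a phone behind a NAT.
invite = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776",
    "Contact: <sip:alice@192.168.1.10:5060>",
    "Content-Type: application/sdp",
    "",
    "v=0",
    "c=IN IP4 192.168.1.10",
    "m=audio 49170 RTP/AVP 0",
])
print(private_addresses(invite))
```

Every address reported by this check is one that a peer on the public Internet cannot route to.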

There are different NAT traversal solutions such as STUN [4] and ICE [5]. Which solution to use depends on the behavior of the NAT and the call scenario. When an SBC is used to solve the NAT traversal issues, the most common approach is for the SBC to act as the public interface of the user agents. This is achieved by replacing the user agent’s contact information with that of the SBC.

For a user agent to be reachable through the public interfaces of an SBC, the SBC manipulates the registration information of the user agent. The user agent includes its private IP address as its contact information in the REGISTER request. Calls to this address would fail, since it is not publicly routable. The SBC therefore replaces the information in the Contact header with its own IP address, and this is the information that is registered at the registrar. Calls destined to the user will then be directed to the SBC. For the SBC to know which user agent is actually being contacted, it can keep a local copy of the user agent’s registration. The local copy includes the private IP address and the user’s SIP URI as well as the public source IP address that the NAT placed in the IP header of the SIP message.
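The registration handling described above can be sketched as follows. This is an illustration under our own assumptions rather than the behavior of any particular product: the class and method names are hypothetical, the SBC address is taken from a documentation address range, and SIP parsing is reduced to plain strings.

```python
SBC_ADDR = "203.0.113.5:5060"   # hypothetical public address of the SBC


class RegistrationCache:
    """Local copy of registrations, keyed by the user's SIP URI."""

    def __init__(self):
        self.by_aor = {}

    def handle_register(self, aor, contact_uri, src_ip, src_port):
        """Remember the original contact and the NAT's public source address,
        and return the Contact value to forward to the registrar."""
        self.by_aor[aor] = {
            "private_contact": contact_uri,
            "nat_source": (src_ip, src_port),
        }
        user = aor.split(":", 1)[1].split("@", 1)[0]
        return f"sip:{user}@{SBC_ADDR}"

    def route_incoming(self, aor):
        """Return the public address behind which the user agent is reachable."""
        entry = self.by_aor.get(aor)
        return entry["nat_source"] if entry else None


cache = RegistrationCache()
forwarded = cache.handle_register(
    "sip:user@frafos.com", "sip:user@192.168.1.10:5060", "198.51.100.7", 40312
)
print(forwarded)                                    # sip:user@203.0.113.5:5060
print(cache.route_incoming("sip:user@frafos.com"))  # ('198.51.100.7', 40312)
```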

Alternatively, the SBC can store this information in the forwarded SIP messages. For example, the user’s contact information can be combined in a special format and added as an additional parameter to the Contact header. The contact information would include the user’s private IP address and SIP URI as well as the public IP address in the IP header of the SIP message. When the registrar receives a request for the user, it returns the complete contact information to the proxy, which includes this information in the SIP message. The SBC can then retrieve this information from the SIP request and use it to properly route the request to the user.
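One possible shape of such a parameter is sketched below. The text does not define the «special format», so everything here is an assumption made for illustration: the parameter name x-orig, the hex encoding and the field separator are hypothetical. Hex is used only because its output contains no characters that have a special meaning inside a SIP URI.

```python
def encode_contact(user, sbc_addr, private_contact, nat_ip, nat_port):
    """Build a Contact URI pointing at the SBC that carries the original
    contact and the NAT's public source address in a URI parameter."""
    blob = f"{private_contact}|{nat_ip}|{nat_port}".encode().hex()
    return f"sip:{user}@{sbc_addr};x-orig={blob}"


def decode_contact(contact_uri):
    """Recover (private_contact, nat_ip, nat_port), or None if not present."""
    for param in contact_uri.split(";")[1:]:
        name, _, value = param.partition("=")
        if name == "x-orig":
            private_contact, nat_ip, nat_port = bytes.fromhex(value).decode().split("|")
            return private_contact, nat_ip, int(nat_port)
    return None


contact = encode_contact(
    "user", "203.0.113.5:5060", "sip:user@192.168.1.10:5060", "198.51.100.7", 40312
)
print(decode_contact(contact))
```

Because the value travels through the registrar and back, any SBC that understands the format can decode it without shared state.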

Adding the user agent’s contact information to the registered contact information has many advantages. As the SBC does not have to keep local registration information, this solution is simple to implement and does not require memory for keeping the information. Further, requests destined to the user agent do not necessarily have to traverse the SBC that processed the user agent’s registration messages. Any SBC that can reach the user agent can correctly route messages destined to it based on the information included in the SIP request. This advantage applies, however, only in some cases. If the NAT in front of the user agent accepts traffic only from IP addresses that the user agent has contacted previously, then only the SBC that processed the user agent’s REGISTER requests will be able to contact the user agent.

Keeping a local copy of the registration information increases the processing requirements on the SBC, which has to manage a local registration database. Besides the memory requirements, the SBC has to replicate this information to a backup system if it is to be highly available. This further increases the processing requirements on the SBC as well as the bandwidth consumption.

However, keeping a local copy of the registration information has its advantages as well. When receiving a message from a user agent, a network address translator binds the private IP address of the user agent to a public IP address. This binding remains active for a limited period of time, the binding period. If the user agent does not send or receive any messages for longer than the binding period, the NAT deletes the binding and the user agent is no longer reachable from the outside. To keep the binding active, the user agent has to refresh it regularly. This is achieved by sending REGISTER requests at intervals shorter than the binding period. As REGISTER messages usually have to be authenticated, dealing with REGISTER messages sent at a high frequency would impose a heavy performance burden on the operator’s infrastructure. SBCs can help to offload this traffic. When a user agent sends the first REGISTER request, the SBC forwards it to the operator’s registration servers. Once the registration has been successfully authenticated and accepted by the operator, the SBC keeps a local copy of the registration information. Instead of forwarding each incoming REGISTER request to the operator’s registration servers, the SBC sends REGISTER requests to the registration servers only at rather large time intervals (in the range of hours). Registration requests arriving from the user agent that do not change the registration information are answered by the SBC itself. The SBC also informs the registration server once the local registration expires or changes.
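The offloading decision boils down to a small rule: forward a REGISTER if it is new, if it changes the registration, or if the last forwarded one is older than the long upstream interval; otherwise answer it locally. The sketch below illustrates that rule only. The class name, the returned labels and the one-hour interval are our own assumptions, and for brevity it records the registration before the registrar has accepted it, which a real SBC would not do.

```python
class RegisterOffload:
    """Decide whether a REGISTER is forwarded upstream or answered by the SBC."""

    def __init__(self, upstream_interval=3600):
        self.upstream_interval = upstream_interval   # seconds between upstream refreshes
        self._state = {}   # user URI -> (contact, time of last forwarded REGISTER)

    def on_register(self, aor, contact, now):
        known = self._state.get(aor)
        if known is not None:
            known_contact, last_forwarded = known
            unchanged = known_contact == contact
            recent = now - last_forwarded < self.upstream_interval
            if unchanged and recent:
                # Keeps the NAT binding alive without loading the registrar.
                return "answer-locally"
        # Simplification: a real SBC would store this only after the
        # registrar has authenticated and accepted the request.
        self._state[aor] = (contact, now)
        return "forward"


sbc = RegisterOffload()
contact = "sip:user@192.168.1.10:5060"
print(sbc.on_register("sip:user@frafos.com", contact, now=0))      # forward
print(sbc.on_register("sip:user@frafos.com", contact, now=30))     # answer-locally
print(sbc.on_register("sip:user@frafos.com", contact, now=3700))   # forward
```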

Similar to the registration case, the SBC also includes itself in the path of INVITE and other request messages. When receiving an INVITE from a user agent behind a NAT, the SBC adds a Via header with its own address, replaces the information in the Contact header with its own address and also replaces the address information in the SDP body with its own address. Thereby, all SIP messages and media packets traverse the SBC.
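The three manipulations of an INVITE can be sketched on a textual message as shown below. This is a deliberately naive illustration: the function name, the sample addresses and the fixed branch value are hypothetical, a real SBC generates a unique branch per transaction, and the Content-Length header, which would have to be recomputed after changing the body, is left out of the example.

```python
def rewrite_invite(message, sbc_ip, sbc_sip_port, sbc_rtp_port):
    """Insert the SBC into the signaling and media path of an INVITE."""
    out = []
    via_added = False
    for line in message.splitlines():
        if line.startswith("Via:") and not via_added:
            # The SBC's own Via goes on top so that responses return through it.
            out.append(f"Via: SIP/2.0/UDP {sbc_ip}:{sbc_sip_port};branch=z9hG4bK-sbc-1")
            out.append(line)
            via_added = True
        elif line.startswith("Contact:"):
            user = line.split("sip:", 1)[1].split("@", 1)[0]
            out.append(f"Contact: <sip:{user}@{sbc_ip}:{sbc_sip_port}>")
        elif line.startswith("c=IN IP4 "):
            out.append(f"c=IN IP4 {sbc_ip}")          # media address: the SBC
        elif line.startswith("m=audio "):
            fields = line.split(" ")
            fields[1] = str(sbc_rtp_port)              # media port: the SBC's relay port
            out.append(" ".join(fields))
        else:
            out.append(line)
    return "\r\n".join(out)


original = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776",
    "Contact: <sip:alice@192.168.1.10:5060>",
    "Content-Type: application/sdp",
    "",
    "v=0",
    "c=IN IP4 192.168.1.10",
    "m=audio 49170 RTP/AVP 0",
])
print(rewrite_invite(original, "203.0.113.5", 5060, 20000))
```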

NAT Traversal and Media

While NAT traversal of SIP messages may already appear complicated, the more complex task is enabling media to traverse NATs. The problem statement is the same: if SIP devices behind NATs advertise their private IP addresses, their peers on the other side of the NAT cannot route traffic to them.

The solution SBCs came up with simply ignores the way SIP is supposed to work. Instead of sending media to the IP address and port number advertised in the SDP bodies, an SBC sends media for a user agent symmetrically back to the address from which the agent sent its own media. This symmetric communication typically works because it is the traffic pattern NAT manufacturers were used to before the arrival of VoIP.
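The symmetric behavior amounts to learning the return address from the source of the received packets instead of from SDP. A minimal sketch of that idea follows; the class and method names are our own, and real media relays add checks that are not shown here, such as restricting who may update the learned address.

```python
class MediaLeg:
    """One side of a relayed call as seen by the SBC."""

    def __init__(self, sdp_ip, sdp_port):
        self.advertised = (sdp_ip, sdp_port)   # from SDP, possibly a private address
        self.learned = None                     # public source address seen on the wire

    def on_packet_from_agent(self, src_ip, src_port):
        # The source address is the NAT's public mapping for the agent.
        self.learned = (src_ip, src_port)

    def destination(self):
        # Media towards the agent goes back to where its own media came from.
        # Until the agent has sent a packet, no usable address is known.
        return self.learned


leg = MediaLeg("192.168.1.10", 49170)
print(leg.destination())                       # None, nothing received yet
leg.on_packet_from_agent("198.51.100.7", 61002)
print(leg.destination())                       # ('198.51.100.7', 61002)
```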

It is important to know that while this mostly works, it has several limitations. First of all, it only works with clients that are built in a «symmetric way», i.e., they use the same port for sending and receiving media. Fortunately, that is nowadays the majority of available equipment.

The other noticeable disadvantage is «triangular routing»: an SBC must relay all VoIP traffic for a call, to make the paths caller-SBC and SBC-callee symmetric. That is in fact quite an overhead for a VoIP operator. With the most common codec, G.711, a relayed call consumes four 87.2 kbps streams: two outbound, two inbound.
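The figure of 87.2 kbps per stream can be reproduced with a short calculation. The sketch below assumes the commonly used 20 ms packetization interval and counts RTP, UDP, IPv4 and Ethernet overhead per packet; these parameters are our assumptions, since the text does not state how the number was derived.

```python
def g711_stream_bps(ptime_ms=20, l2_overhead_bytes=18):
    """Bandwidth of one G.711 RTP stream in bit/s, including packet headers."""
    packets_per_second = 1000 // ptime_ms                 # 50 packets/s at 20 ms
    payload_bytes = 64_000 // 8 // packets_per_second     # 64 kbit/s codec -> 160 bytes
    rtp, udp, ipv4 = 12, 8, 20                            # header sizes in bytes
    frame_bytes = payload_bytes + rtp + udp + ipv4 + l2_overhead_bytes
    return frame_bytes * 8 * packets_per_second


per_stream = g711_stream_bps()
print(per_stream / 1000)        # 87.2 kbit/s per stream
print(4 * per_stream / 1000)    # 348.8 kbit/s for the four streams of a relayed call
```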

Other limitations may occur too. For example, if a SIP device uses Voice Activity Detection (VAD) and does not send any voice packets initially, the SBC will not learn its address and will not be able to forward incoming media to it. Also, some NATs are built so poorly that the only thing that almost always works is HTTP, and SIP simply fails.

Despite these limitations, SBCs have solved the «NAT problem» in a vast majority of use-cases.

Acronyms

IETF: Internet Engineering Task Force

IP: Internet Protocol

NAT: Network Address Translator

PSTN: Public Switched Telephone Network

RTP: Real-Time Transport Protocol

SBC: Session Border Controller

SDP: Session Description Protocol

SIP: Session Initiation Protocol

UAC: User Agent Client

UAS: User Agent Server

UNI: User-Network Interface

VoIP: Voice over IP

References

  1. H. Schulzrinne; S. Casner; R. Frederick; V. Jacobson. «RTP: A Transport Protocol for Real-Time Applications». RFC 1889, IETF, 1996.
  2. M. Handley; V. Jacobson. «SDP: Session Description Protocol». RFC 2327, IETF, 1998.
  3. J. Rosenberg; H. Schulzrinne; G. Camarillo; A. Johnston; J. Peterson; R. Sparks; M. Handley; E. Schooler. «SIP: Session Initiation Protocol». RFC 3261, IETF, 2002.
  4. J. Rosenberg; R. Mahy; P. Matthews; D. Wing. «Session Traversal Utilities for NAT (STUN)». RFC 5389, IETF, 2008.
  5. J. Rosenberg. «Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols». RFC 5245, IETF, 2010.



Source by Berthold Butscher
