More and more people and organisations are moving to VoIP for all sorts of reasons. Where I work, we’re looking to implement VoIP to expand our range of services and as an extra option we can add to the various bespoke solutions we work on. But how does a call actually get made? How does a phone connected to our network manage to call another phone on our network, or a phone somewhere out there on the Internet? Hopefully this post will show the steps taken to set up and terminate a phone call using SIP.
SIP or Session Initiation Protocol is a signalling protocol used to create, modify and terminate media sessions. In our case the media stream is RTP between two VoIP / SIP phones.
RTP or Real-time Transport Protocol is the protocol used to send voice and video data between two endpoints such as two VoIP / SIP phones. RTP is used in conjunction with RTCP.
RTCP or RTP Control Protocol (sometimes expanded as Real-Time Transport Control Protocol) works alongside RTP and provides statistics and control information to the endpoints about the media stream(s) transported by RTP, so that the endpoints can monitor the quality of the stream and adjust to it.
QoS (Quality of Service) is the ability to give different priorities to different types of traffic or applications on our network, or to guarantee a specific level of performance to a particular type of traffic or application.
VoIP (Voice over Internet Protocol) is a family of technologies, including SIP, RTP and RTCP, that delivers voice over our existing data networks and the Internet.
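To make the RTP part a little more concrete, here is a minimal Python sketch of the 12-byte fixed header that starts every RTP packet, as laid out in RFC 3550. The function names are my own and purely illustrative, and the sketch assumes the simplest case: no padding, no header extension and no CSRC entries.

```python
import struct

RTP_VERSION = 2  # the version number used by RFC 3550


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (illustrative helper, not a library API)."""
    # First byte: version in the top two bits; padding, extension and
    # CSRC count are all left at zero in this simplified sketch.
    byte0 = RTP_VERSION << 6
    # Second byte: marker bit followed by the 7-bit payload type.
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    # Network byte order: two bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC (stream identifier).
    return struct.pack(
        "!BBHII",
        byte0,
        byte1,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(data):
    """Parse the fixed header fields back out of the first 12 bytes."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Payload type 0 is G.711 u-law (PCMU); the SSRC here is an arbitrary example.
    header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234)
    print(unpack_rtp_header(header))
```

The sequence number and timestamp are what let the receiving phone put packets back in order and play them out at the right time, and they are also the raw material for the statistics that RTCP reports.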
In this blog post there isn’t going to be any configuration as such. We’ll use a simple topology of two phones, extensions 1111 and 2222, and a SIP Proxy Server as we go through the steps involved in setting up a SIP call.
1. When the two phones are plugged in they first connect to the network and, if an IP address is not statically set, attempt to obtain one using DHCP, exactly as any PC on the network would. They then register with the SIP Proxy Server by sending a REGISTER message (strictly speaking this is handled by the registrar function, which often runs on the same server as the proxy). This is like the phones logging on to the server: information about each phone, such as its address and current IP address, is recorded in the location database. The address of the phone is referred to as the AOR or Address-of-Record and might look something like sip:email@example.com or firstname.lastname@example.org.
2. When the user with extension 1111 picks up the phone and dials 2222, the phone sends an INVITE message to the SIP Proxy Server. The server then looks up the destination in the location database and forwards the INVITE to the phone with extension 2222.
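3. A Trying message (100 Trying) is then sent back towards the calling phone. This response is hop-by-hop rather than end-to-end: the server typically sends it to the phone at extension 1111 once it has received the INVITE, and the phone at extension 2222 may send its own Trying message to the server. The Trying message means that the request has been received and is being processed, almost like asking the source phone to wait while the call is connected.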
4. The phone at extension 2222 then starts to ring to let the user know there is a call, and sends a Ringing message (180 Ringing) to the server. Again, the server forwards the Ringing message to the phone with extension 1111. Once this phone receives the message it plays a ringing sound through the earpiece.
5. When the user at extension 2222 picks up the handset to answer the call, the phone sends an OK message (200 OK) to the server to let it know that the call has been answered. The server then forwards the OK message to the phone at extension 1111 to let it know that the other side has answered.
6. The phone at extension 1111 now sends an ACK message to the phone at extension 2222. The two phones now set up a media session between themselves and begin to send audio (and video) directly to each other using RTP.
7. When the call is finished, one of the handsets is hung up and that phone sends a BYE message to the other phone. This is acknowledged with a final OK message (200 OK), the media session is closed, and both phones are ready to place or accept a new call.
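The steps above can be sketched as a small state machine, seen from the calling phone at extension 1111. This is only an illustration in Python: the state names and event strings are my own invention, not something taken from a real SIP stack, and it assumes the caller is the one who hangs up.

```python
# Caller-side view of a basic call. State and event names are
# hypothetical labels chosen for this sketch.
TRANSITIONS = {
    ("idle", "send INVITE"): "calling",
    ("calling", "recv 100 Trying"): "proceeding",
    ("proceeding", "recv 180 Ringing"): "ringing",
    ("ringing", "recv 200 OK"): "answered",
    ("answered", "send ACK"): "in_call",      # RTP flows phone to phone from here
    ("in_call", "send BYE"): "hanging_up",
    ("hanging_up", "recv 200 OK"): "idle",    # ready for the next call
}

# The messages the calling phone sends and receives, in order.
CALL_EVENTS = [
    "send INVITE",
    "recv 100 Trying",
    "recv 180 Ringing",
    "recv 200 OK",
    "send ACK",
    "send BYE",
    "recv 200 OK",
]


def run_call(events, state="idle"):
    """Apply each event in order and return the list of states visited."""
    visited = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected event {event!r} in state {state!r}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


if __name__ == "__main__":
    for visited_state in run_call(CALL_EVENTS):
        print(visited_state)
```

A real phone has to cope with far more than this, for example busy or unanswered calls, retransmissions and the other side hanging up first, but the happy path follows the same order as the numbered steps.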
We’ve taken a brief look at how a SIP call is set up, looking at the protocols used, such as RTP, and the messages sent between the phones and the server during call initialisation. We’ve also seen that the actual voice call does not pass through the SIP server but is instead sent directly between the two phones, using the media session that is set up at the end of the SIP call setup and closed when the call is terminated.
I hope you found this post informative. Please leave a comment if you have any questions or feedback.