Why SIP?
Almost every new product in the UC world is based on the Session Initiation Protocol (SIP). For the uninitiated, SIP is one of the most commonly used protocols for signaling real-time sessions – like setting up voice and video calls – over IP networks.
For newcomers to the space, SIP may seem like an obvious choice, but this was not always the case. In the early days of unified communications at Microsoft, David Gurle (now VP of Collaboration at Reuters) was a champion of SIP as a strategy to simplify and consolidate a diverse set of protocols and products into a coordinated and standards-based technology platform. This decision paid off and paved the way for what we now know as Office Communications Server and Office Communicator.
To understand the industry battle that SIP had to fight against entrenched protocols like H.323, it helps to understand the strengths and weaknesses of both.
SIP vs. H.323
Over at Cisco’s Technical Help blog, there is a really good post, “H.323 versus SIP: A Comparison”. The article gives a blow-by-blow account of the relative strengths and weaknesses of each.
The article raises very good technical points. I won’t go through each of them – you can read it for yourself – but the gist of it is that H.323 is a better controlled standard and therefore provides better interoperability between products.
When I worked at Microsoft, I was involved in the development, testing and support of systems using H.323 (Microsoft TAPI, Microsoft NetMeeting and the now defunct Exchange Conferencing Server) as well as SIP (LCS, OCS, and Office Communicator). The bottom line: as a product developer, I would choose SIP any day of the week over H.323.
Why? This seems to contradict the point made by the other article about the robustness and tight interoperability of H.323-based products. In simple terms…
My protocol is better than your protocol
SIP is “Internet-like”. Since it’s standardised by the IETF, you can easily develop specs to extend it, and ratification of extensions occurs relatively quickly. Also, it looks like HTTP on the wire. The messages are text-based, and that makes them very easy to understand and troubleshoot. This opens the playing field up to an entire community of developers who might otherwise not get involved.
H.323 is “telephony-like”. It’s standardised by the ITU, which is a much more rigorous and process-oriented standards body. This improves interoperability between different vendors’ products but slows down the pace of innovation. Also, its use of ASN.1 (binary) encoding for messages means that you need special parsers just to read the messages if you are debugging problems.
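To make the “text on the wire” point concrete, here is a rough Python sketch of how little it takes to pull apart a SIP request. The INVITE below is made up for illustration (the names, hosts and identifiers are placeholders in the style of the examples in the SIP RFC), and `parse_sip_request` is a hypothetical helper, not part of any library. It is deliberately naive: it ignores message bodies, repeated headers, compact header names and folded lines.

```python
# Illustrative SIP request; every address and identifier here is invented.
RAW_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.example.org;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@example.com>\r\n"
    "From: Alice <sip:alice@example.org>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc33.example.org\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_sip_request(raw):
    """Split a SIP request into (method, uri, version, headers).

    Hypothetical helper for illustration only: the last value wins if a
    header repeats, and the message body (if any) is discarded.
    """
    # Headers end at the first blank line, just as in HTTP.
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Request line: METHOD SP Request-URI SP SIP-Version
    method, uri, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; header values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers


method, uri, version, headers = parse_sip_request(RAW_INVITE)
print(method, uri, version)
print(headers["to"])
print(headers["cseq"])
```

A few lines of string splitting get you the method, the target and every header. Doing the same for an H.323 message means running it through an ASN.1 decoder first, which is exactly the extra hurdle described above.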
The implied technical superiority of H.323 is neither here nor there. If you were starting a new company building something on H.323, you’d be building something that’s already been built before. It’s the extensibility of SIP that makes it so appealing to developers.
The Spectrum: Proprietary to Open Standards
The real discussion should be between standardised protocols like SIP and H.323 and proprietary protocols like Skype.
A Skype representative at the VoN conference this year said it best: “My mother uses Skype – why bother with standards?” http://von.blip.tv/file/191288
The best way to think of this is to view the technical landscape as a spectrum. Proprietary protocols are on the ultraviolet end of the spectrum – they will give you ultimate flexibility, but zero cross-vendor interoperability. If you want to build something completely new, you may have to invent your own protocol. This removes a lot of initial technical hurdles but transfers the challenges to the business – you’ll need to build your own network and ecosystem since no one else interoperates with you. As the ecosystem grows, you’ll need to build everything in the ecosystem yourself.
H.323 is closer to the other end – it has less flexibility, but great interoperability.
So Why SIP? In a Nutshell…
SIP strikes a good balance between flexibility and interoperability, and that’s why it’s so popular.
The momentum of SIP is real. SIP is being used as a de facto standard within the IP Multimedia Subsystem (IMS) architecture. In the UK, BT’s pervasive 21CN network will use SIP exclusively for signaling (that’s the plan anyway).
If you’re going to build a new product, why not choose a protocol that lets you have your cake and eat it too? After all, it’s not the protocol, but what you do with it that counts.
-John Lamb, Modality Systems