Decoding WebRTC's Promise & Challenges
According to projections from the International Telecommunication Union (ITU), WebRTC will be available -- that is, downloaded and installed -- on more than 4 billion devices within the next three years.
What this means is that all of the complex voice, video, and data handling for communication between two devices over the Net is likely to become mere elevator music for developers -- ever present, but largely unnoticed and hardly a nuisance.
All of the arcane knowledge that previously had been guarded by the few will be part of the common infrastructure. There will be no need to understand and deal with echo-cancellation, backlighting, rotation distortion, frame stabilization, or any of the other myriad challenges posed by real-time communication. Much like building web pages, developers will only need to issue a few high-level commands, and voila -- the communication channel will be established.
There are, however, some potential bumps along this idyllic road to ubiquitous IP communication. WebRTC does indeed make the underlying call handling much simpler, and mundane uses are fairly simple to realize. A web page, for example, can easily open a voice or video session to a specific server. But for more complicated scenarios, challenges set in.
Signaling, for example, is one of the major issues that may come up for developers looking to create a full-featured communication package based on WebRTC. Call sessions are established on two planes -- the media and the signaling. Media handling and all that surrounds the transmission of the actual real-time content, such as voice or video packets, is precisely what WebRTC does so well.
Signaling, on the other hand, is left somewhat vague and undetermined in WebRTC, yet it is a crucial part of the total picture. Signaling deals with call setup and teardown, with registration of a user, and with discovering the called party. Since the late 90s, the protocol of choice for signaling over IP has been the Session Initiation Protocol, or SIP. SIP uses globally unique identifiers (think email address or phone number) to route calls and has a set handshaking process for notifying, accepting, rejecting, and modifying calls, etc.
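To make those signaling duties concrete, here is a minimal sketch of a toy in-memory signaling hub that covers registration, discovery of the called party, call setup, and teardown. It is neither SIP nor anything WebRTC defines; the class name, message kinds, and identifiers are all assumptions invented for illustration.

```typescript
// Toy in-memory signaling hub. Every name here (SignalingHub, SignalMessage,
// the message kinds) is hypothetical and not part of WebRTC or SIP.

type SignalKind = "invite" | "accept" | "reject" | "bye";

interface SignalMessage {
  kind: SignalKind;
  from: string;     // caller's globally unique identifier
  to: string;       // called party's identifier
  payload?: string; // opaque session description, e.g. an SDP offer or answer
}

type Inbox = (msg: SignalMessage) => void;

class SignalingHub {
  private users = new Map<string, Inbox>();
  private activeCalls = new Set<string>();

  // Registration: a user announces where messages addressed to them should go.
  register(id: string, inbox: Inbox): void {
    this.users.set(id, inbox);
  }

  // Discovery and routing: look up the called party and deliver the message.
  // Returns false when the destination has not registered.
  send(msg: SignalMessage): boolean {
    const inbox = this.users.get(msg.to);
    if (inbox === undefined) return false;

    const key = SignalingHub.callKey(msg.from, msg.to);
    if (msg.kind === "accept") {
      this.activeCalls.add(key); // call setup completes
    } else if (msg.kind === "reject" || msg.kind === "bye") {
      this.activeCalls.delete(key); // call is refused or torn down
    }

    inbox(msg);
    return true;
  }

  isActive(a: string, b: string): boolean {
    return this.activeCalls.has(SignalingHub.callKey(a, b));
  }

  // Order-independent key, so either party can end the call.
  private static callKey(a: string, b: string): string {
    return [a, b].sort().join("|");
  }
}

// Usage: Alice calls Bob, and Bob accepts.
const hub = new SignalingHub();
const aliceInbox: SignalMessage[] = [];
const bobInbox: SignalMessage[] = [];
hub.register("alice@example.com", (m) => { aliceInbox.push(m); });
hub.register("bob@example.com", (m) => { bobInbox.push(m); });

hub.send({ kind: "invite", from: "alice@example.com", to: "bob@example.com", payload: "offer" });
hub.send({ kind: "accept", from: "bob@example.com", to: "alice@example.com", payload: "answer" });
```

A production system would add authentication, timeouts, and a network transport; the point here is only that none of this bookkeeping is supplied by the media side of WebRTC.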
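One route is to place a conversion server between the browser and the SIP network, translating the browser's signaling into SIP. A number of companies are already taking this route and offering these types of conversion servers. Each uses its own API flavor on the browser-facing side, but adheres to the SIP standard on the network.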
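As a rough illustration of what such a conversion server does at its core, the sketch below maps a made-up browser-side message onto the first line of the corresponding SIP message. The BrowserSignal shape and the toSipStartLine function are assumptions for this example, not any vendor's API; only the SIP start-line formats, method names, and status codes come from the SIP standard (RFC 3261).

```typescript
// Hypothetical browser-facing message; each vendor defines its own shape.
interface BrowserSignal {
  action: "call" | "answer" | "decline" | "hangup";
  target: string; // the other party, e.g. "bob@example.com"
}

// Translate a browser-side action into the first line of a SIP message.
// SIP requests start with  "METHOD Request-URI SIP/2.0";
// SIP responses start with "SIP/2.0 Status-Code Reason-Phrase".
function toSipStartLine(msg: BrowserSignal): string {
  const uri = `sip:${msg.target}`;
  switch (msg.action) {
    case "call":
      return `INVITE ${uri} SIP/2.0`; // request to set up a call
    case "answer":
      return "SIP/2.0 200 OK"; // success response to an INVITE
    case "decline":
      return "SIP/2.0 603 Decline"; // called party refuses the call
    case "hangup":
      // Simplification: in a real dialog the Request-URI is taken from the
      // peer's Contact header rather than from its public address.
      return `BYE ${uri} SIP/2.0`; // request to tear the call down
  }
}

// Usage
console.log(toSipStartLine({ action: "call", target: "bob@example.com" }));
```

A real gateway would also have to generate the mandatory SIP headers (Via, From, To, Call-ID, CSeq, and others), carry the session description in the message body, and track the state of each call; this sketch shows only the vocabulary mapping.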
It is still early days for WebRTC, and it remains to be seen whether one method of choice for signaling will emerge. Until then, developers will have to make do without hard-and-fast standards. Besides, as famous late-night talk show host David Letterman once quipped, "Traffic signals in New York are just rough guidelines."
— Baruch Sterman, PhD, VP Technology Research, Vonage