WebRTC
Introduction
WebRTC stands for Web Real-Time Communication. It is both a technology and an
emerging standard that lets browsers exchange audio, video, and data in real
time. WebRTC exposes a set of plugin-free APIs that can be used in both desktop
and mobile browsers, and it is progressively gaining support from all major
browser vendors. Previously, external plugins were required to achieve the
functionality that WebRTC offers.
WebRTC leverages multiple standards and
protocols, most of which will be discussed in this article. These include data
streams, STUN/TURN servers, signaling, JSEP, ICE, SIP, SDP, NAT, UDP/TCP,
network sockets, and more.
Peer-To-Peer Communication
WebRTC can be used for multiple tasks, but real-time peer-to-peer audio and
video (i.e., multimedia) communication is its primary use. In order to
communicate with another person (i.e., peer) via a web browser, each person’s
web browser must agree to begin communication, know how to locate the other,
traverse firewalls and other security protections, and transmit all multimedia
data in real time.
One of the biggest challenges associated with browser-based peer-to-peer
communications is knowing how to locate and establish a network socket
connection with another computer’s web browser in order to transmit multimedia
data in both directions. When you visit a web site, you typically enter a web
address or click a link to view the page. A request is made to a server that
responds by providing the web page (HTML, CSS, and JavaScript). The key here is
that you make an HTTP request to a known and easily locatable (via DNS) server
and get back a response (i.e., the web page). A peer’s browser, by contrast,
has no such well-known address.
Firewalls and NAT Traversal
Most of us access the internet from a work or home network. Our computer
typically sits behind a firewall and a network address translation (NAT)
device, and is therefore not assigned a static public IP address. At a very
high level, a NAT device translates private IP addresses from inside a firewall
to public-facing IP addresses. NAT devices are used both for security and
because IPv4 offers only a limited number of public IP addresses.
Here is an example of NAT at work: suppose you’re at a coffee shop and join
their WiFi. Your computer will be assigned an IP address that exists only
behind their NAT, say 172.16.23.4. To the outside world, however, your IP
address may actually be 164.53.27.98. The outside world will therefore see your
requests as coming from 164.53.27.98, but the NAT device will use its mapping
tables to ensure that responses to your requests are sent to 172.16.23.4. Note
that in addition to the IP address, a port is also required for network
communications, so an accompanying port is implied wherever an IP address is
mentioned in this article.
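The mapping-table idea can be sketched in a few lines of TypeScript. This is a toy model only, not how any particular NAT device is implemented: the class name ToyNat, the starting public port 40000, and the addresses and ports are all illustrative assumptions.

```typescript
// Toy model of a NAT mapping table. Real NAT devices also track the
// transport protocol, timeouts, and more. All values are illustrative.
interface Endpoint {
  ip: string;
  port: number;
}

class ToyNat {
  private readonly publicIp: string;
  private nextPort = 40000; // hypothetical start of the public port range
  // "privateIp:port" -> public port
  private readonly outbound = new Map<string, number>();
  // public port -> private endpoint
  private readonly inbound = new Map<number, Endpoint>();

  constructor(publicIp: string) {
    this.publicIp = publicIp;
  }

  // Rewrite the source of an outgoing packet to the public address,
  // creating a mapping the first time a private endpoint is seen.
  translateOutbound(source: Endpoint): Endpoint {
    const key = `${source.ip}:${source.port}`;
    let publicPort = this.outbound.get(key);
    if (publicPort === undefined) {
      publicPort = this.nextPort++;
      this.outbound.set(key, publicPort);
      this.inbound.set(publicPort, source);
    }
    return { ip: this.publicIp, port: publicPort };
  }

  // Look up which private endpoint a response on a public port belongs to.
  translateInbound(publicPort: number): Endpoint | undefined {
    return this.inbound.get(publicPort);
  }
}

const nat = new ToyNat("164.53.27.98");
const seenOutside = nat.translateOutbound({ ip: "172.16.23.4", port: 51000 });
const deliveredTo = nat.translateInbound(seenOutside.port);
console.log(seenOutside); // { ip: "164.53.27.98", port: 40000 }
console.log(deliveredTo); // { ip: "172.16.23.4", port: 51000 }
```

Note that a response arriving on a port with no mapping has nowhere to go, which is exactly why an outside peer cannot simply open a connection to a computer behind a NAT.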
Given the involvement of a NAT device, how
do I know my mom’s IP address to send audio and video data to, and likewise,
how does she know what IP address to send audio and video back to?
This is where STUN (Session Traversal Utilities for NAT) and TURN (Traversal
Using Relays around NAT) servers come into play. In order for WebRTC
technologies to work, a request for your public-facing IP address is first made
to a STUN server. Think of it like your computer asking a remote server,
“Howdy, would you mind telling me what IP address you see me as having?” The
server then responds with something like, “Sure, your IP address is
198.54.5.67.”
Assuming this process works and you receive your public-facing IP address and
port, you are then able to tell other peers how to contact you directly. These
peers can do the same thing using a STUN or TURN server and tell you what
address to contact them at. When a direct connection cannot be established, a
TURN server can relay the traffic between the peers instead.
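To make the question-and-answer exchange with a STUN server more concrete, here is a rough sketch of the bytes involved, following the STUN message layout in RFC 5389. It only builds a Binding Request and decodes an IPv4 XOR-MAPPED-ADDRESS value; nothing is sent over the network. The function names are hypothetical helpers, and in a browser you would not write this yourself, since the WebRTC implementation performs the STUN exchange internally for the servers you configure.

```typescript
const MAGIC_COOKIE = 0x2112a442; // fixed value defined by RFC 5389
const BINDING_REQUEST = 0x0001;  // STUN message type for "what is my address?"

// Build the 20-byte STUN header of a Binding Request with no attributes.
function buildBindingRequest(transactionId: Uint8Array): Uint8Array {
  if (transactionId.length !== 12) {
    throw new Error("transaction ID must be 12 bytes");
  }
  const message = new Uint8Array(20);
  const view = new DataView(message.buffer);
  view.setUint16(0, BINDING_REQUEST); // message type
  view.setUint16(2, 0);               // attribute length: none
  view.setUint32(4, MAGIC_COOKIE);
  message.set(transactionId, 8);
  return message;
}

// Decode the value of an XOR-MAPPED-ADDRESS attribute (IPv4 only here).
// Layout: reserved byte, family byte, X-Port (2 bytes), X-Address (4 bytes).
function decodeXorMappedAddress(value: Uint8Array): { ip: string; port: number } {
  const view = new DataView(value.buffer, value.byteOffset, value.byteLength);
  if (view.getUint8(1) !== 0x01) {
    throw new Error("only IPv4 is handled in this sketch");
  }
  // The port is XORed with the top 16 bits of the magic cookie,
  // the address with the whole cookie.
  const port = view.getUint16(2) ^ (MAGIC_COOKIE >>> 16);
  const address = (view.getUint32(4) ^ MAGIC_COOKIE) >>> 0;
  const ip = [24, 16, 8, 0].map((shift) => (address >>> shift) & 0xff).join(".");
  return { ip, port };
}

// All-zero transaction ID for the demo only; real IDs are random.
const request = buildBindingRequest(new Uint8Array(12));
// Attribute value a server might return for 198.54.5.67, port 54321.
const mapped = decodeXorMappedAddress(
  new Uint8Array([0x00, 0x01, 0xf5, 0x23, 0xe7, 0x24, 0xa1, 0x01]),
);
console.log(request.length); // 20
console.log(mapped);         // { ip: "198.54.5.67", port: 54321 }
```

The address is XORed with a known constant rather than sent in the clear because some NAT devices rewrite anything in a packet that looks like their own IP address, which would corrupt the answer.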
Signaling, Sessions, and Protocols
The network information discovery process
described above is one part of the larger topic of signaling, which is based on
the JavaScript Session Establishment Protocol (JSEP) standard in the case of
WebRTC. Signaling involves network discovery and NAT traversal, session
creation and management, communication security, media-capability metadata and
coordination, and error handling.
Signaling is neither specified by the WebRTC standard nor implemented by its
APIs, in order to allow flexibility in the technologies and protocols used.
Signaling, and the server that handles it, are left to the WebRTC application
creator to sort out.
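Because the signaling channel is left to the application, its shape is up to you. The sketch below shows one possible design, assuming a simple relay that forwards offer, answer, and candidate messages between peers by ID. It runs entirely in memory; the names SignalMessage and InMemorySignalingHub are hypothetical, and a real application would typically put a WebSocket or HTTP server in the same role.

```typescript
// One possible message shape for application-defined signaling.
type SignalType = "offer" | "answer" | "candidate";

interface SignalMessage {
  type: SignalType;
  from: string;    // sender's peer ID
  to: string;      // recipient's peer ID
  payload: string; // session description text or a serialized candidate
}

type SignalHandler = (message: SignalMessage) => void;

// In-memory stand-in for a signaling server: it only relays messages.
class InMemorySignalingHub {
  private readonly handlers = new Map<string, SignalHandler>();

  register(peerId: string, handler: SignalHandler): void {
    this.handlers.set(peerId, handler);
  }

  // Returns false when the recipient is not registered.
  send(message: SignalMessage): boolean {
    const handler = this.handlers.get(message.to);
    if (!handler) return false;
    handler(message);
    return true;
  }
}

const hub = new InMemorySignalingHub();
const log: string[] = [];

hub.register("alice", (m) => {
  log.push(`alice got ${m.type} from ${m.from}`);
});
hub.register("bob", (m) => {
  log.push(`bob got ${m.type} from ${m.from}`);
  // A callee replies to an offer with an answer.
  if (m.type === "offer") {
    hub.send({ type: "answer", from: "bob", to: m.from, payload: "<answer>" });
  }
});

hub.send({ type: "offer", from: "alice", to: "bob", payload: "<offer>" });
console.log(log); // ["bob got offer from alice", "alice got answer from bob"]
```

The hub never inspects the payload, which reflects the division of labor described above: the signaling layer only has to deliver messages, while the browsers decide what goes in them.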
Assuming that your WebRTC browser-based application is able to determine its
public-facing IP address using STUN as described, the next step is to actually
negotiate and establish the network session connection with your peer. This
process is analogous to making a phone call.
The initial session negotiation and
establishment happens using a signaling/communication protocol specialized in multimedia
communications. This protocol is also responsible for governing the rules by
which the session is managed and terminated.
One such protocol is the Session Initiation Protocol (SIP). Note that due to
the flexibility of WebRTC signaling, SIP is not the only signaling protocol
that can be used. Whichever signaling protocol is chosen must also work with
the Session Description Protocol (SDP), an application-layer protocol that
WebRTC uses to describe sessions. All multimedia-specific metadata is passed
using SDP.
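To give a feel for what that metadata looks like, here is a small sketch that reads the media sections out of an SDP description. SDP is line-oriented text in which each line has the form <type>=<value>; an m= line opens a media section and a=rtpmap lines name the codecs offered in it. The sample description is a shortened, hand-written illustration rather than real browser output, and parseSdpMedia is a hypothetical helper that ignores most of the format.

```typescript
interface MediaSection {
  kind: string;     // "audio", "video", ...
  port: number;
  protocol: string; // transport profile from the m= line
  codecs: string[]; // encoding names taken from a=rtpmap lines
}

// Collect each m= section and the codec names declared inside it.
function parseSdpMedia(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  let current: MediaSection | undefined;
  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <payload types...>
      const parts = line.slice(2).split(" ");
      current = {
        kind: parts[0] ?? "",
        port: Number(parts[1] ?? "0"),
        protocol: parts[2] ?? "",
        codecs: [],
      };
      sections.push(current);
    } else if (current && line.startsWith("a=rtpmap:")) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
      const encoding = line.slice("a=rtpmap:".length).split(" ")[1] ?? "";
      const codecName = encoding.split("/")[0] ?? "";
      if (codecName !== "") current.codecs.push(codecName);
    }
  }
  return sections;
}

// Hand-written sample for illustration only.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

const media = parseSdpMedia(sampleSdp);
console.log(media.map((m) => `${m.kind}: ${m.codecs.join(", ")}`));
// ["audio: opus, PCMU", "video: VP8"]
```

Each peer sends a description like this to the other over the signaling channel, and the two sides settle on the codecs and media types they have in common.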
Any peer (i.e., application using WebRTC) that is attempting to communicate
with another peer generates a set of ICE candidates, where ICE stands for the
Interactive Connectivity Establishment protocol. Each candidate represents a
given combination of IP address, port, and transport protocol to be used. Note
that a single computer may have multiple network interfaces (wireless, wired,
etc.), so it can be assigned multiple IP addresses, one for each interface.
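Candidates are exchanged as single lines of text, so the address, port, and transport combination is easy to see. The sketch below parses the leading fields of such a line. The two sample lines are made up for illustration (a "host" candidate for a local interface and a "srflx" candidate for an address learned from a STUN server), and parseIceCandidate is a hypothetical helper that ignores any optional trailing fields.

```typescript
interface IceCandidateInfo {
  foundation: string;
  component: number; // 1 is typically RTP
  transport: string; // "udp" or "tcp"
  priority: number;
  address: string;
  port: number;
  type: string;      // "host", "srflx", "prflx", or "relay"
}

// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseIceCandidate(line: string): IceCandidateInfo {
  const fields = line.trim().replace(/^(a=)?candidate:/, "").split(/\s+/);
  const [
    foundation = "", component = "", transport = "", priority = "",
    address = "", port = "", typKeyword = "", type = "",
  ] = fields;
  if (typKeyword !== "typ" || type === "") {
    throw new Error(`not an ICE candidate line: ${line}`);
  }
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type,
  };
}

// Illustrative sample lines, not captured from a real session.
const hostCandidate = parseIceCandidate(
  "candidate:1 1 udp 2122260223 172.16.23.4 51000 typ host",
);
const reflexiveCandidate = parseIceCandidate(
  "candidate:2 1 UDP 1686052607 164.53.27.98 40000 typ srflx raddr 172.16.23.4 rport 51000",
);
console.log(hostCandidate.type, hostCandidate.address);           // host 172.16.23.4
console.log(reflexiveCandidate.type, reflexiveCandidate.address); // srflx 164.53.27.98
```

A peer usually gathers several candidates of different types and sends all of them to the other side, which then tries them in priority order until one pair works.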
Advantages of WebRTC
1. Open source code
WebRTC is an open source project intended for streaming data between apps and
browsers. This new communication standard is built on peer-to-peer technology.
Google was the original developer of this technology, but today WebRTC is
supported not only by Google Chrome but also by the Opera and Firefox browsers.
Other browsers can support WebRTC as well after installing the additional
webrtc4all extension.
2. Strong rival to classic telephony
Today, WebRTC is still a new, experimental technology. However, it is forecast
that after standardization and certain improvements, this new communications
standard will put pressure on the classic telephony market. In fact, classic
telephony already faces serious competition from cheaper, higher-quality VoIP
services such as Viber and Skype.
3. More security and stability
Although this new communication standard is still being refined and developed,
WebRTC has certain clear advantages over Flash. The WebRTC architecture is
considered to be more logical than that of the Flash plugin and to have fewer
drawbacks. Flash dominated the market until recently, but major web browsers
such as Chrome and Firefox have discontinued support for it. When it comes to
browser security and stability against external attacks, WebRTC is the stronger
choice of the two.
4. Better sound quality
Another benefit of WebRTC is that, particularly thanks to its adjustable
built-in microphone settings, it provides better sound quality than Flash.
WebRTC uses the G.711 and Opus codecs for transferring audio.
5. Supported by most leading Windows browsers
The many advantages of WebRTC and the platform’s open source code keep business
interest in this technology growing. Many companies consider using independent
solutions to be strategically profitable. WebRTC developers can already
integrate this technology into existing online businesses. Today the WebRTC API
is supported by most leading Windows browsers, including Google Chrome and
Opera beta.
6. Interoperability with VoIP and video
The biggest value of WebRTC is its promise of interoperability with existing voice and video systems. This includes devices using SIP, Jingle, XMPP, and the PSTN. What may hinder global interoperability is the upgrades required in existing devices. Alternatively, gateways can provide interoperability, and some are already on the market. If existing voice and video devices use standard protocols, they will probably work with WebRTC-based devices.