Application: Integration of video and real-time data into Web applications
WEB: HTML5, HTTP, WebSockets
Technology: WebRTC, Streaming, Peer to Peer
When looking at the Web presentations of product suppliers, one notices a growing trend to embed tools for personal support and service into the business logic of Web applications. The customer is offered a dialogue in which to ask questions and receives answers from a personal consultant. On the one hand this eases the buying decision; on the other hand, good personal service is remembered. This human way of communicating stands in contrast to the interchangeable, impersonal Web presentations of complex products. The "conversation" with a machine is once again replaced by a conversation with a human being. Direct dialogue between people is being revived and is increasingly recognized as a key tool for sustainable customer retention.
If even a simple text chat has such positive effects, how much more benefit could face-to-face communication generate? Suddenly the company one buys from is no longer just an abstract Internet page but gets a face and even a personal contact with a name. WebRTC (Web Real-Time Communication) creates the technical preconditions for such situations.
Let's take other examples: the sensitive dialogue between a doctor and a patient, or between a lawyer and a client. Not every concern of a patient or client requires a visit in person to the doctor or the lawyer. Many minor matters of daily life can be handled much more efficiently in an online consultation, with a much lower burden for both sides. Long waits in waiting rooms are reduced. The availability of the doctor or lawyer increases, and more clients can be served in the available time. The online meeting can also serve as a preliminary step to decide whether a consultation in person is really necessary. Text chats or telephone calls are just compromises. They mostly lack the intimacy of a face-to-face dialogue. Especially with sensitive topics, the personal aspect of a video connection is a substantial factor in establishing the necessary relationship of trust.
The diversity of potential applications based on an interactive, personal face-to-face dialogue is huge. In banking, the classical branch structure will fall out of use before long. Generations of customers are growing up who have never seen a bank from the inside. What is to replace the conversation between the customer consultant and the investor or borrower - virtual WebRTC branches? Or take the communication between a tax consultant and his clients. Many everyday situations that involve consultancy services are potential domains for WebRTC applications, e.g. the contact between an insurance broker and his customers. Another possibility with interesting prospects is the use as a sales tool for direct marketing. WebRTC could be used for the visual presentation of products or operating instructions, for advising customers on how to use a product, or even for an extended collection of customer profile data. Technically it would be possible to take snapshots of a customer during a session and link them with the provider's customer database - of course only if the client agrees to that.
WebRTC provides the basis for founding and operating autonomous communities that need an audiovisual component as an extension of their previously text-based communication. The provider of a service may become a self-managed operator. Interactive language courses or other online training courses are conceivable. WebRTC has the potential to create completely new application domains. The key to that is the direct combination of the video component with the business logic and data of a supplier.
With regard to new applications, this is the most interesting point compared with previous technologies, which offer only very limited possibilities for realizing one's own ideas and solutions.
It is possible to create completely independent solutions for embedding the dialogue function into one's own portals. This starts with applying the corporate design of a Web presentation and continues with the combination of video data with the business data of the solution. The combination of live video with a variety of additional data generates added value. Real-time measurement values, sensor data, stock tickers and bank data can be exchanged directly between the participants of a WebRTC session and visualized with the broad range of layout tools of a WebRTC-capable browser.
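As a minimal sketch of how such real-time values might be prepared for exchange between participants, the following encodes and decodes one measurement as a JSON string, which is a form that a WebRTC data channel can send. The `SensorReading` shape and both function names are assumptions made for illustration; they are not part of WebRTC.

```typescript
// Hypothetical message shape for one real-time measurement value.
interface SensorReading {
  sensorId: string;
  value: number;
  unit: string;
  timestampMs: number;
}

// Serialize a reading to a string that could be passed to a data channel.
function encodeReading(reading: SensorReading): string {
  return JSON.stringify(reading);
}

// Parse and validate a received string; returns null for malformed input.
function decodeReading(raw: string): SensorReading | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const r = parsed as Record<string, unknown>;
  if (
    typeof r.sensorId === "string" &&
    typeof r.value === "number" &&
    typeof r.unit === "string" &&
    typeof r.timestampMs === "number"
  ) {
    return {
      sensorId: r.sensorId,
      value: r.value,
      unit: r.unit,
      timestampMs: r.timestampMs,
    };
  }
  return null;
}
```

In a browser, the encoded string would typically be handed to the `send()` method of an `RTCDataChannel`, and `decodeReading` would be applied to the data of each incoming message before the value is visualized.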
This customizability makes it possible to reproduce typical business processes in the Web browser. An example is the simulation of a waiting room in online consultations. As in real life, the online waiting room has a state such as "opened" or "closed" and a list of clients who are waiting for a meeting with their consultant, for example a lawyer. The administration of the information needed to control the setup of consultant-client dialogues can be adapted completely to the requirements of the individual service provider and its business model. Parameters of these consultations such as duration, meeting minutes, or start and end time can be recorded automatically and combined with other business data. Compared with inflexible solutions like Skype or FaceTime, this is a huge gain in flexibility and interactivity.
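The waiting room described above could be modelled roughly as follows. This is only a sketch of one possible data structure; the class, its methods and the record fields are hypothetical names chosen for illustration, and a real solution would be driven by the provider's own business model.

```typescript
// The two waiting-room states mentioned in the text.
type RoomState = "opened" | "closed";

interface WaitingClient {
  id: string;
  name: string;
}

// Parameters of one finished consultation, ready to be combined with other business data.
interface ConsultationRecord {
  clientId: string;
  startMs: number;
  endMs: number;
  durationMs: number;
}

class WaitingRoom {
  private state: RoomState = "closed";
  private queue: WaitingClient[] = [];

  open(): void {
    this.state = "opened";
  }

  close(): void {
    this.state = "closed";
  }

  // A client can only enter while the room is opened.
  enter(client: WaitingClient): boolean {
    if (this.state !== "opened") {
      return false;
    }
    this.queue.push(client);
    return true;
  }

  // The consultant calls the client who has waited longest (first in, first out).
  callNext(): WaitingClient | undefined {
    return this.queue.shift();
  }

  waitingCount(): number {
    return this.queue.length;
  }
}

// Derive the duration from start and end time of a consultation.
function recordConsultation(clientId: string, startMs: number, endMs: number): ConsultationRecord {
  return { clientId, startMs, endMs, durationMs: endMs - startMs };
}
```

The first-in, first-out order is just one design choice; a provider could equally sort the list by appointment time or priority.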
Based on the WebRTC standards and integration interfaces, it is possible to establish autonomous communication networks. Communication can be restricted to certain user groups. In addition, operation and presentation can be individualized per user. If required, each WebRTC service can build up its own Internet infrastructure and become a provider with its own resources. This offers the highest possible degree of protection for communication data and the flexibility to realize one's own solutions.
WebRTC media and data-channel traffic (video, audio, real-time and business data) is always encrypted with standardized, publicly reviewed methods (DTLS and SRTP). Signaling data can be protected in the same way by carrying it over TLS-secured transports such as HTTPS or secure WebSockets. This allows WebRTC to be used for highly confidential applications. This property in particular is the precondition for getting WebRTC solutions certified for sensitive scenarios such as doctor/patient or lawyer/client communication. Other communication techniques use proprietary technologies and are not transparent about how confidential data is transmitted. This makes their use in the fields discussed questionable at the very least.
WebRTC is a so-called peer-to-peer technology. The data is exchanged directly between the browsers of the meeting partners. A centralized server for the exchange of the real-time data is not necessary. Hence there is no bottleneck caused by the limited performance of a server. The number of participants in a WebRTC solution is not limited by centralized resources.
Peer-to-peer communication is not new, as Skype shows. Nevertheless, WebRTC offers advantages over proprietary approaches even here. For example, it uses a broad range of standardized technologies from the VoIP sector. This simplifies seamless communication between WebRTC browser applications and VoIP devices. Conversations between a browser and video phones, softphones or even classical PSTN (Public Switched Telephone Network) subscribers are possible.
Some components of a WebRTC solution are intentionally left undefined. The yellow blocks in the image show them. This increases the flexibility for possible application scenarios.
A large part of the WebRTC building blocks is encapsulated by the browser. The Web application developer has no direct influence on audio/video compression or on the streaming transport. Topics like NAT (Network Address Translation) traversal and encryption are likewise part of the browser implementation. Application developers can concentrate on the workflows of their Web application. Their biggest challenges are filling the "standard gap" of signaling and the permanent struggle with browser-dependent incompatibilities. At least the latter is reduced as the standard matures.
The WebRTC standard does not define the methods for audio/video compression. It only defines how the participants of a WebRTC session inform each other about the methods they support. For that purpose SDP (Session Description Protocol) is used, which is also a component of VoIP. Although the standard does not prescribe specific codecs, the browser manufacturers create de facto standards with their implementations. The WebRTC application developer has no freedom in this respect, because only the methods supported by the browsers can be used.
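To illustrate how a session description announces the supported methods, the following sketch extracts the codec entries from the `a=rtpmap` attribute lines of an SDP text. The function and type names are hypothetical, and the parser covers only this one attribute, not SDP as a whole.

```typescript
// One codec announcement taken from an "a=rtpmap" line.
interface CodecEntry {
  payloadType: number;
  name: string;
  clockRate: number;
}

// Collect all codecs announced in an SDP text.
// Line format: a=rtpmap:<payload type> <encoding name>/<clock rate>[/<parameters>]
function listCodecs(sdp: string): CodecEntry[] {
  const codecs: CodecEntry[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:(\d+) ([^\/\s]+)\/(\d+)/.exec(line.trim());
    if (match) {
      codecs.push({
        payloadType: Number(match[1]),
        name: match[2] ?? "",
        clockRate: Number(match[3]),
      });
    }
  }
  return codecs;
}
```

Applied to the SDP text of an offer, the function returns one entry per announced codec, for example VP8 with a 90000 Hz clock rate for video.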
Currently the VP8 video codec is the most widespread because Google, which drives the WebRTC standardization process, uses it. One reason for Google's decision against the H.264 codec is licensing problems. In homogeneous WebRTC applications the limitation to VP8 is no problem. Problems arise if bridges to other applications or networks like VoIP are necessary. The codec is one reason why direct communication between WebRTC applications and VoIP devices is problematic. VoIP mostly uses H.264 or even the older H.263 standard. To bridge the gap, a centralized transcoding gateway would be necessary. But that would cancel out a large share of the advantages of WebRTC and its peer-to-peer approach.
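The codec problem can be reduced to a simple check: two endpoints can talk directly only if their codec lists overlap. The sketch below illustrates that reasoning; the function names are hypothetical and the codec lists in the usage note are example values, not a statement about any particular device.

```typescript
// Codecs supported by both endpoints, compared case-insensitively.
function commonCodecs(offered: string[], supported: string[]): string[] {
  return offered.filter((codec) =>
    supported.some((other) => other.toUpperCase() === codec.toUpperCase())
  );
}

// Without a shared codec, a transcoding gateway between the endpoints would be needed.
function needsTranscoding(offered: string[], supported: string[]): boolean {
  return commonCodecs(offered, supported).length === 0;
}
```

For a browser that offers only VP8 and a VoIP device that supports only H.264 and H.263, `needsTranscoding(["VP8"], ["H264", "H263"])` is true, which is the situation in which a gateway becomes necessary.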
As with everything around WebRTC, the current development is quite dynamic. For example, Mozilla has recently announced support for H.264 alongside VP8 in Firefox ("Support of H.264 with Version 33"). The latest Microsoft statements also suggest that Internet Explorer will eventually support WebRTC, but only with H.264, which would prevent cross-browser communication between Google Chrome and Microsoft Internet Explorer. There is still a lot of politics to get through before the standards reach calmer waters.
Except in special situations (e.g. peer-to-peer communication between two browsers that reside in an intranet where the IP addresses of the participants are well known), WebRTC requires so-called signaling.
The task of signaling is to provide the peers of a WebRTC communication with the information that is necessary to establish the actual peer-to-peer connection for the payload data. To create a peer-to-peer connection, two WebRTC clients need information about:
Because the IP address range is limited, IP addresses for Internet participants are normally assigned dynamically by their providers. There is no guarantee that an Internet subscriber gets the same IP address from the provider on successive connections. Even worse is the widespread use of so-called NAT (Network Address Translation) technologies in the Internet access routers of participants. The router still has a public IP address from the provider. But all participants behind the router reside in a decoupled intranet which only knows private IP addresses. It requires additional effort, tools and infrastructure to connect two Internet participants that reside behind different NAT routers directly, as is necessary for peer-to-peer communication.
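Whether an address belongs to such a private intranet can be checked against the private IPv4 ranges reserved by RFC 1918 (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16). The following is a minimal sketch with hypothetical function names; it handles IPv4 only.

```typescript
type IPv4 = [number, number, number, number];

// Parse dotted-decimal notation; returns null if the text is not a valid IPv4 address.
function parseIPv4(address: string): IPv4 | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(address);
  if (!m) {
    return null;
  }
  const octets: IPv4 = [Number(m[1]), Number(m[2]), Number(m[3]), Number(m[4])];
  return octets.every((o) => o <= 255) ? octets : null;
}

// True for addresses in the RFC 1918 private ranges, which are not reachable from the Internet.
function isPrivateIPv4(address: string): boolean {
  const octets = parseIPv4(address);
  if (octets === null) {
    return false;
  }
  const [a, b] = octets;
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}
```

A client that only knows an address such as 192.168.1.20 cannot hand it to a peer outside its own network, which is why the additional infrastructure is needed.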
To establish peer-to-peer connections nevertheless, it is necessary to exchange signaling data via a centralized signaling server. This server has a well-known IP address and is therefore directly accessible to all WebRTC clients. It allows the clients to exchange the relevant data as a precondition for establishing the actual peer-to-peer call. After that, the signaling server is normally no longer necessary unless it provides additional centralized functionality such as HTTP Web server functions.
Despite the peer to peer character of WebRTC a centralized server infrastructure is necessary for the exchange of connection data. The standard just defines the "What" - the information which must be exchanged - but not the "How" - the transport channel.
There are different technologies and protocols which may be used to transport the WebRTC signaling data. One of the most frequently used solutions at present is a proprietary transport based on the new HTML5 WebSockets. Alternatives are SIP over WebSockets or AJAX/XHR-based approaches - for Web servers which do not support WebSockets, like Apache or Internet Information Server.
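Since the standard leaves the transport open, the core of a signaling server can be sketched independently of WebSockets or XHR: it only has to forward messages between registered clients. The message types and the `SignalingRelay` class below are assumptions for illustration, not a defined protocol; the delivery callback stands in for whatever transport is chosen.

```typescript
// Hypothetical signaling messages: session descriptions and connection candidates.
type SignalingMessage =
  | { type: "offer"; from: string; to: string; sdp: string }
  | { type: "answer"; from: string; to: string; sdp: string }
  | { type: "candidate"; from: string; to: string; candidate: string };

// Transport-specific delivery, e.g. a function that writes to the client's socket.
type Deliver = (message: SignalingMessage) => void;

class SignalingRelay {
  private clients: { [id: string]: Deliver } = {};

  register(id: string, deliver: Deliver): void {
    this.clients[id] = deliver;
  }

  unregister(id: string): void {
    delete this.clients[id];
  }

  // Forward a message to its addressee; returns false if the addressee is not connected.
  relay(message: SignalingMessage): boolean {
    const deliver = this.clients[message.to];
    if (!deliver) {
      return false;
    }
    deliver(message);
    return true;
  }
}
```

With a WebSocket transport, the `Deliver` callback of each client would typically serialize the message with `JSON.stringify` and send it over that client's socket.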
As explained, today's Internet architecture poses a number of challenges for WebRTC which obstruct direct peer-to-peer communication between two clients or even make it impossible. NAT is one of the most serious architectural problems of the IPv4-based Internet. Although it is an ingenious way of dealing with the address space limitation of IPv4, it is actually just a "workaround". In the long term, IPv6 will solve the problem at its root. But until then, the following infrastructure is required to get WebRTC running reasonably. These technologies have evolved as part of the VoIP infrastructure.
STUN (Session Traversal Utilities for NAT) is used to identify the address information that can be used to establish direct connections. This requires hosting an additional public Internet server as part of a WebRTC solution. There are NAT types for which even the STUN approach fails (symmetric NAT). In those rare cases (statistics quote figures between 10 and 15%) the last fallback is a TURN (Traversal Using Relays around NAT) server. TURN uses a relayed transport which routes the payload data between the clients through the TURN server. Of course that has negative consequences for scalability (limited performance of centralized resources) and for the latency of the real-time data transmission. The browser decides which type of transport (relayed or peer-to-peer) is used. For that it uses a trial-and-error method (ICE - Interactive Connectivity Establishment).
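The following sketch shows how the STUN and TURN servers might be passed to the browser and how the outcome of ICE can be read from a candidate line. The `iceServers` layout with `urls`, `username` and `credential` follows the configuration object accepted by `RTCPeerConnection`; the helper names and the example.org host names in the usage note are placeholders, not real servers.

```typescript
// Local mirror of the ICE server entries accepted by RTCPeerConnection.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

interface IceConfig {
  iceServers: IceServer[];
}

// Build a configuration with a STUN server and an optional TURN fallback.
function buildIceConfig(
  stunUrl: string,
  turn?: { url: string; username: string; credential: string }
): IceConfig {
  const iceServers: IceServer[] = [{ urls: [stunUrl] }];
  if (turn) {
    iceServers.push({ urls: [turn.url], username: turn.username, credential: turn.credential });
  }
  return { iceServers };
}

// Candidate types: host (local address), srflx (found via STUN),
// prflx (discovered during connectivity checks), relay (via TURN).
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Read the "typ" field of an ICE candidate line; null if it is missing.
function candidateType(candidate: string): CandidateType | null {
  const m = / typ (host|srflx|prflx|relay)(?: |$)/.exec(candidate);
  return m ? (m[1] as CandidateType) : null;
}

// True if the payload data would be routed through the TURN server.
function isRelayed(candidate: string): boolean {
  return candidateType(candidate) === "relay";
}
```

A call such as `buildIceConfig("stun:stun.example.org:3478", { url: "turn:turn.example.org:3478", username: "user", credential: "secret" })` yields an object that could be passed to the `RTCPeerConnection` constructor in a browser.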
These explanations show that the infrastructure requirements of a WebRTC community are relatively high if one wants to do more than just experiment with the technology. In addition to the Web server for the Web application, a signaling server, a STUN server and a TURN server are necessary. A complete WebRTC infrastructure is shown in the following picture:
It is possible to combine STUN and TURN in one server or to run them separately, depending on scalability requirements. If a bridge into the VoIP network is required, additional infrastructure components such as media and protocol transcoding gateways must be added to the list. Because of the relatively high complexity of a self-hosted WebRTC application, a growing number of providers now supply that infrastructure and offer WebRTC under SaaS/IaaS (Software / Infrastructure as a Service) business models.
WebRTC works best with the newest versions of Google Chrome, Mozilla Firefox and Opera. Given the current pace of development, one should always use the newest browser versions available. With Google Chrome it is possible to establish WebRTC connections to mobile Android devices. Microsoft Internet Explorer does not yet support WebRTC. At least under Windows that is not a serious obstacle to the introduction of WebRTC because there are sufficient alternatives. WebRTC connectivity to iOS-based devices is currently very limited. There are providers like TokBox which offer native iOS apps with WebRTC support. A further, brand-new alternative may be Ericsson's OpenWebRTC initiative and the associated browser "Bowser". Apple's Safari browser does not support WebRTC.
For a first test of whether one's own platform is fit for WebRTC, one can use the Google reference application:
It uses the WebRTC signaling and NAT-traversal infrastructure of Google.
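Independently of a reference application, a Web application can check for itself whether the browser exposes the WebRTC interfaces. The sketch below assumes the unprefixed names `RTCPeerConnection` and `navigator.mediaDevices.getUserMedia`; browser versions that only offer vendor-prefixed variants would need additional checks. The function takes the global object as a parameter (the `BrowserLike` type is a hypothetical helper) so that it can also be exercised outside a browser.

```typescript
// Minimal view of the browser globals that the check looks at.
interface BrowserLike {
  RTCPeerConnection?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

// True if both the peer connection and the media capture interface are present.
function supportsWebRTC(env: BrowserLike): boolean {
  const hasPeerConnection = typeof env.RTCPeerConnection === "function";
  const hasGetUserMedia = typeof env.navigator?.mediaDevices?.getUserMedia === "function";
  return hasPeerConnection && hasGetUserMedia;
}
```

In a Web page the function would be called with the global `window` object, and the application could show a hint to switch browsers if the result is false.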
Cross-browser communication, for example between Chrome and Firefox, works in principle. But the browsers behave differently in some details, which may get in the way of high-quality conversations. In communication between the newest versions of Chrome and Firefox the author experienced severe quality problems which do not occur in homogeneous Chrome-to-Chrome sessions. The disturbances increased over time: the latency grows, and audio dropouts and jerky video may occur. To avoid disappointment when starting out with this fascinating technology, one should still avoid cross-browser sessions if possible. As the standard advances, the situation will hopefully improve.
Despite the incomplete standardization process and the political disputes among the big players, WebRTC has gained the momentum needed to become a new mass medium - 4 years after the project started. It offers excellent customization possibilities which will lead to many new use cases over time. Big players that were missing so far, like Microsoft, seem to be jumping on the bandwagon, which will drive progress further. Even the blocking of H.264 as an alternative codec seems to be showing its first cracks. But even without that the potential is very high.
The number of WebRTC infrastructure and framework providers is growing. Although adaptations to changes in the standardization process are still necessary, a large number of Web applications with integrated WebRTC functionality already exist. The investment risks that always come with the introduction of a new technology are decreasing. Beyond the usual phase of hype and marketing euphoria, the first added value of this technology in Web applications is starting to show.
As shown, the potential application scenarios of WebRTC cover a broad range. The combination of media and other real-time data with classical Web applications and their workflows will create new potential for customer retention, acquisition and sales processes. The old dream of saving costs and gaining efficiency by replacing physical presence with a remote live media presence is advanced considerably by WebRTC.
Now is the right moment to get on board with this new technology. Our expertise can help you with planning, workflow analysis, infrastructure installation and operation, and software integration.