September 18th, 2015
In October last year, we announced our intention to implement ORTC in Microsoft Edge with a focus on video and audio communications, and we have been hard at work on our implementation since then. Today, we are excited to announce that a preview version of our ORTC API implementation is available in the latest Windows Insider Preview release.
ORTC support in Microsoft Edge is the result of a close collaboration between Microsoft’s Operating Systems Group and Skype teams. Together we bring over 20 years of web platform experience and over 12 years of expertise in building one of the largest and most reliable real-time communications services for consumers and business users. Our goal is to enable developers around the world to build experiences that include the ability to talk to Skype users and other WebRTC-compatible communication services.
We look forward to the web community building more user scenarios enabled by ORTC. In support of that, we would like to share with you what we have provided in this preview version of our ORTC implementation, and the basic steps in building simple 1:1 audio and video communications.
What we are providing
The diagram below is part of the Overview section in the ORTC API spec. It provides a very high-level summary of the relationships between ORTC objects and a useful illustration of the code flow, from captured media stream tracks as input to the RtpSender objects, all the way to the media stream tracks that come out of the RtpReceiver objects and can be rendered in video/audio tags. We recommend using the diagram as a reference when you start learning the ORTC API.
Our initial ORTC implementation includes the following components:
- ORTC API Support. Our primary focus right now is audio/video communications. We have implemented the following objects: IceGatherer, IceTransport, DtlsTransport, RtpSender, RtpReceiver, as well as the RTCStats interfaces that are not shown directly in the diagram.
- RTP/RTCP multiplexing is supported and is required for use with DtlsTransport. A/V multiplexing is also supported.
- STUN/TURN/ICE support. We support STUN (RFC 5389), TURN (RFC 5766) as well as ICE (RFC 5245). Within ICE, regular nomination is supported, with aggressive nomination partially supported (as a receiver). DTLS-SRTP (RFC 5764) is supported, based on DTLS 1.0 (RFC 4347).
- Codec support. For audio codecs, we support G.711, G.722, Opus and SILK. We also support Comfort Noise (CN) and DTMF according to the RTCWEB audio requirements. For video we currently support the H.264UC codec used by Skype services, supporting advanced features such as simulcast, scalable video coding and forward error correction. We’re working toward enabling interoperable video with H.264.
While the preview implementation may still have bugs, we believe it is ready to support typical scenarios, and are eager to get early hands-on feedback from developers in the Windows Insider Program.
If you are familiar with WebRTC 1.0 implementations, and are interested in the evolution of object support within WebRTC 1.0 and ORTC, we recommend the following presentation by Google, Microsoft, and Hookflash from the 2014 IIT RTC Conference: “ORTC API Update.”
App level code flow for 1:1 communications
Now let’s discuss how to move from the ORTC overview diagram to coding a simple 1:1 audio/video communication scenario. For this scenario, you will need two Windows 10 machines acting as the two peers, and a web server acting as the signaling channel through which the peers exchange the information they need to establish a connection.
The steps below are scoped to operations taken by one peer. Both peers need to go through similar steps in order to set up the 1:1 communications. To make better sense out of the code snippets below, we suggest that you refer to our Microsoft Edge Test Drive Demo as a reference.
Our example uses audio-video and RTP/RTCP multiplexing, so you will see only a single ICE and DTLS transport, used to transport RTP and RTCP packets for both audio and video.
Step #1. Create a MediaStream object (via the Media Capture API) with one audio track and one video track
You can learn more about the Media Capture API in our recent post, Announcing media capture functionality in Microsoft Edge.
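A minimal sketch of this capture step, assuming the promise-based `navigator.mediaDevices.getUserMedia` call from the Media Capture API. The helper name `captureLocalMedia`, the 640x360 resolution, and the `mediaStreamLocal` variable are illustrative assumptions rather than values taken from the demo.

```javascript
// Hypothetical helper: request one audio track and one video track.
// `mediaDevices` is expected to be navigator.mediaDevices in the browser;
// it is passed in so the function can also be exercised with a stand-in.
function captureLocalMedia(mediaDevices) {
  var constraints = {
    audio: true,
    video: { width: 640, height: 360, facingMode: "user" } // assumed values
  };
  // Resolves with a MediaStream, or rejects if capture is denied or fails.
  return mediaDevices.getUserMedia(constraints);
}

// Assumed usage in the browser:
// captureLocalMedia(navigator.mediaDevices)
//   .then(function (stream) { mediaStreamLocal = stream; })
//   .catch(function (err) { console.error("Capture failed: " + err.name); });
```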
Step #2. Create the ICE gatherer, and enable local ICE candidates to be signaled to remote peer.
To help protect user privacy, we introduced an option that allows a user to control whether local host IP addresses can be exposed by the IceGatherer objects. We will provide an interface option in Microsoft Edge to toggle this setting.
In our Test Drive Demo we have set up a TURN server. The TURN server has very limited throughput, so we are restricting its use to our demo page.
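The gatherer setup for this step could be sketched as follows. The names `createIceGatherer`, `iceGathr`, and `signaller` (an application object with a `signalMessage` method) are assumptions made for illustration, and the `api` argument stands in for `window` so that the constructor can be swapped out.

```javascript
// Hypothetical helper: create the ICE gatherer and forward every local
// candidate to the remote peer over the signaling channel.
function createIceGatherer(api, iceServers, signaller) {
  var iceOptions = {
    gatherPolicy: "all",   // gather host, reflexive, and relay candidates
    iceServers: iceServers // array of STUN/TURN server entries
  };
  var iceGathr = new api.RTCIceGatherer(iceOptions);

  iceGathr.onlocalcandidate = function (evt) {
    // `signalMessage` is an assumed method of the application's signaller.
    signaller.signalMessage({ candidate: evt.candidate });
  };
  return iceGathr;
}

// Assumed usage in Microsoft Edge:
// var iceGathr = createIceGatherer(window, myIceServers, mySignaller);
```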
Step #3. Create the ICE transport for audio and video, and prepare to handle remote ICE candidates on the ICE transport
Another option here is to accumulate all the remote ICE candidates into an array remoteCandidates and call iceTr.setRemoteCandidates(remoteCandidates) to add all the remote candidates together.
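Both ways of handling remote candidates can be sketched in one helper. The signaller callbacks `onRemoteCandidate` and `onRemoteCandidatesDone`, the `batch` flag, and the helper name are hypothetical; only `addRemoteCandidate` and `setRemoteCandidates` are ORTC methods.

```javascript
// Hypothetical helper: create the ICE transport and feed it remote candidates,
// either one at a time as they arrive or all together at the end.
function createIceTransport(api, signaller, batch) {
  var iceTr = new api.RTCIceTransport();
  var remoteCandidates = [];

  signaller.onRemoteCandidate = function (remote) {
    if (batch) {
      remoteCandidates.push(remote.candidate);    // collect for later
    } else {
      iceTr.addRemoteCandidate(remote.candidate); // add as each one arrives
    }
  };

  // Assumed callback: fired once the remote peer has signaled all candidates.
  signaller.onRemoteCandidatesDone = function () {
    if (batch) {
      iceTr.setRemoteCandidates(remoteCandidates);
    }
  };
  return iceTr;
}
```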
Step #4. Create the DTLS transport
var dtlsTr = new RTCDtlsTransport(iceTr);
Step #5. Create the sender and receiver objects
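A sketch of this step, assuming the `RTCRtpSender(track, transport)` and `RTCRtpReceiver(transport, kind)` constructor shapes of the preview. The helper name and the returned object layout are illustrative; `api` stands in for `window`.

```javascript
// Hypothetical helper: one sender and one receiver per media kind, all
// sharing the single DTLS transport created in Step #4.
function createSendersAndReceivers(api, mediaStreamLocal, dtlsTr) {
  var audioTrack = mediaStreamLocal.getAudioTracks()[0];
  var videoTrack = mediaStreamLocal.getVideoTracks()[0];

  return {
    audioSender: new api.RTCRtpSender(audioTrack, dtlsTr),
    videoSender: new api.RTCRtpSender(videoTrack, dtlsTr),
    audioReceiver: new api.RTCRtpReceiver(dtlsTr, "audio"),
    videoReceiver: new api.RTCRtpReceiver(dtlsTr, "video")
  };
}
```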
Step #6. Retrieve the receiver and sender capabilities
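Capabilities are read through the static `getCapabilities(kind)` method on the sender and receiver interfaces. The sketch below gathers all four into one object; the helper name and the property names are assumptions chosen for illustration.

```javascript
// Hypothetical helper: collect send and receive capabilities for both kinds.
// `api` stands in for `window` in Microsoft Edge.
function getLocalCapabilities(api) {
  return {
    recvAudioCaps: api.RTCRtpReceiver.getCapabilities("audio"),
    recvVideoCaps: api.RTCRtpReceiver.getCapabilities("video"),
    sendAudioCaps: api.RTCRtpSender.getCapabilities("audio"),
    sendVideoCaps: api.RTCRtpSender.getCapabilities("video")
  };
}
```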
Step #7. Exchange the ICE/DTLS parameters and the send/receive capabilities with the remote peer.
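One way to package this exchange is a single signaling message, sketched below. The message layout, the helper name, and the `signaller.signalMessage` method are assumptions; `getLocalParameters()` is the ORTC method on the gatherer and on the DTLS transport. The `caps` argument is an object holding the four capability objects retrieved in Step #6.

```javascript
// Hypothetical helper: send local ICE/DTLS parameters and capabilities
// to the remote peer in one message.
function signalLocalParams(signaller, iceGathr, dtlsTr, caps) {
  signaller.signalMessage({
    ice: iceGathr.getLocalParameters(),
    dtls: dtlsTr.getLocalParameters(),
    recvAudioCaps: caps.recvAudioCaps,
    recvVideoCaps: caps.recvVideoCaps,
    sendAudioCaps: caps.sendAudioCaps,
    sendVideoCaps: caps.sendVideoCaps
  });
}
```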
Step #8. Get remote params, start the ICE and DTLS transports, and set the audio and video send and receive parameters.
Here is a skeleton of the helper functions. You can find more details in our Test Drive Demo.
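A simplified sketch of Step #8 and its helpers, under stated assumptions. The names `myCapsToSendParams`, `myCapsToRecvParams`, `applyRemoteParams`, the `call` and `ssrcs` objects, and the remote message layout are hypothetical. A real RTCRtpParameters object carries more fields (for example RTCP settings and header extensions) that this sketch omits, and the demo's helpers may differ.

```javascript
// Keep the codecs that both sides list (matched by name, ignoring case, and
// by clock rate) and build the parameters for a single encoding.
function myCapsToSendParams(sendCaps, remoteRecvCaps, ssrc) {
  var codecs = [];
  sendCaps.codecs.forEach(function (local) {
    var match = remoteRecvCaps.codecs.filter(function (remote) {
      return remote.name.toLowerCase() === local.name.toLowerCase() &&
             remote.clockRate === local.clockRate;
    })[0];
    if (match) {
      codecs.push({
        name: local.name,
        clockRate: local.clockRate,
        payloadType: match.preferredPayloadType // the receiver's preference
      });
    }
  });
  // This preview requires 'ssrc' and 'active' on each encoding.
  return { codecs: codecs, encodings: [{ ssrc: ssrc, active: true }] };
}

// Receive parameters are the mirror image: the remote side is the sender.
function myCapsToRecvParams(recvCaps, remoteSendCaps, ssrc) {
  return myCapsToSendParams(remoteSendCaps, recvCaps, ssrc);
}

// `call` bundles the objects from the earlier steps, `remote` is the message
// received from the other peer, and `ssrcs` holds SSRC values that both
// peers know in advance (an assumption of this sketch).
function applyRemoteParams(call, remote, ssrcs) {
  call.iceTr.start(call.iceGathr, remote.ice, call.iceRole);
  call.dtlsTr.start(remote.dtls);

  call.audioSender.send(
    myCapsToSendParams(call.caps.sendAudioCaps, remote.recvAudioCaps, ssrcs.audioSend));
  call.videoSender.send(
    myCapsToSendParams(call.caps.sendVideoCaps, remote.recvVideoCaps, ssrcs.videoSend));
  call.audioReceiver.receive(
    myCapsToRecvParams(call.caps.recvAudioCaps, remote.sendAudioCaps, ssrcs.audioRecv));
  call.videoReceiver.receive(
    myCapsToRecvParams(call.caps.recvVideoCaps, remote.sendVideoCaps, ssrcs.videoRecv));
}
```

Here `call.iceRole` would be "controlling" on one peer and "controlled" on the other; how the peers agree on roles is left to the application's signaling.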
Step #9. Render and play the remote media stream tracks through a video tag
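Rendering could look like the sketch below: the receivers' tracks are added to a new MediaStream, which is then attached to a video element. The helper name and the element id `myRtcVideoTag` in the usage comment are hypothetical; `api` stands in for `window`.

```javascript
// Hypothetical helper: combine the remote audio and video tracks into one
// MediaStream and play it through the given video element.
function renderRemoteMedia(api, videoRenderer, audioReceiver, videoReceiver) {
  var mediaStreamRemote = new api.MediaStream();
  mediaStreamRemote.addTrack(audioReceiver.track);
  mediaStreamRemote.addTrack(videoReceiver.track);

  videoRenderer.srcObject = mediaStreamRemote;
  videoRenderer.play();
  return mediaStreamRemote;
}

// Assumed usage in the browser:
// renderRemoteMedia(window, document.getElementById("myRtcVideoTag"),
//                   audioReceiver, videoReceiver);
```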
This describes the major steps in the code flow. Please refer to our Test Drive Demo for further details, including additional steps such as setting up the video preview and reporting error messages.
Once you are familiar with setting up 1:1 calls, it should be relatively straightforward to set up small group calls using a mesh topology, where each peer has a 1:1 connection with the rest of the group. Since parallel forking is not supported in our platform, this should be handled via 1:1 signaling, so that independent IceGatherer and DtlsTransport objects are used for each connection.
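As a rough illustration of the mesh approach, the sketch below keeps one independent set of transport objects per remote peer. The helper name, the peer identifiers, and the shape of the returned map are assumptions; signaling, senders, and receivers would be set up per connection as in the 1:1 flow.

```javascript
// Hypothetical helper: build an independent gatherer, ICE transport, and DTLS
// transport for every remote peer, keyed by an application-level peer id.
function createMeshConnections(api, peerIds, iceServers) {
  var connections = {};
  peerIds.forEach(function (peerId) {
    var iceGathr = new api.RTCIceGatherer({
      gatherPolicy: "all",
      iceServers: iceServers
    });
    var iceTr = new api.RTCIceTransport();
    var dtlsTr = new api.RTCDtlsTransport(iceTr);
    connections[peerId] = { iceGathr: iceGathr, iceTr: iceTr, dtlsTr: dtlsTr };
  });
  return connections;
}
```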
More ORTC implementation details in Microsoft Edge
There are some limitations in our implementation which we would like to call out:
- We don’t support RTCIceTransportController. Our implementation handles ICE freezing/unfreezing on a per-transport basis, so that ordering all the IceTransports is not necessary. We believe that this approach should interoperate with existing implementations.
- RtpListener is not yet supported. This means that SSRCs need to be specified in advance within the RtpReceiver.
- Forking is not yet supported in any of IceTransport, IceGatherer, or DtlsTransport. The solution to DtlsTransport forking is still under discussion in the ORTC CG.
- RTP/RTCP non-mux with DtlsTransport is not yet supported. When using DtlsTransport your application will need to support RTP/RTCP mux.
- In RTCRtpEncodingParameters, we currently ignore most of the encoding quality controls. However, we do require setting of the ‘active’ and ‘ssrc’ attributes.
- The icecandidatepairchanged event is not yet supported. You should be able to extract the candidate pair information through the getNominatedCandidatePair method.
- We don’t yet support any of the DataChannel functionality defined in the ORTC spec.
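For the missing icecandidatepairchanged event, an application can read the nominated pair on demand instead. The sketch below assumes that `getNominatedCandidatePair()` returns a falsy value until a pair has been nominated; the helper name is hypothetical.

```javascript
// Hypothetical helper: read the currently nominated candidate pair from an
// ICE transport, since no change event is available in this preview.
function describeNominatedPair(iceTr) {
  var pair = iceTr.getNominatedCandidatePair();
  if (!pair) {
    return null; // assumed: nothing has been nominated yet
  }
  return { local: pair.local, remote: pair.remote };
}
```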
While our implementation is still early, we’re eager to get your feedback as we work to deliver the full feature set in Microsoft Edge in the coming months. Our goal is to build an implementation that is interoperable across the web today as well as with the real-time communications industry in the long term.
Towards that goal, in the very near future, the Skype team will leverage the ORTC APIs in Microsoft Edge to enable rich, full-fidelity communication and collaboration experiences, starting with voice and video in the Skype and Skype for Business Web clients. The team is also investing in standards-based WebRTC protocol interoperability, to ensure Skype experiences can run across major desktop, mobile and browser platforms. If you are a developer who wants to integrate Skype and Skype for Business experiences into your applications, the Skype Web SDK will also leverage ORTC and WebRTC APIs so that your applications benefit from the same cross-platform and cross-browser support. For more information, please check out Enabling Seamless Communication Experiences for the Web with Skype, Skype for Business, and Microsoft Edge.
Furthermore, several members of the ORTC CG have been working closely with us as early technology adopters. We plan to work together to contribute to the evolution of WebRTC technology toward “WebRTC Next Version (NV).” We expect that early ORTC adopters will be able to share their perspectives soon. Stay tuned!
– Shijun Sun, Principal Program Manager, Microsoft Edge
– Hao Yan, Principal Program Manager, Skype
– Bernard Aboba, Architect, Skype