Dolby Voice

Dolby Voice on the Dolby.io Communications APIs Platform

Dolby Voice is an award-winning audio communications technology. With Dolby Voice, Dolby has applied its expertise in sight and sound signal processing and compression technologies to provide improvements in voice quality and clarity that make virtual meetings more natural and productive. Starting with the SDK 3.0 release, Dolby Voice is available for Communications APIs customers.

The benefits of Dolby Voice are:

  • Advanced audio processing features such as:
    • Dynamic audio leveling
    • Advanced spatial audio
    • Noise and echo reduction
  • Optimized bandwidth utilization
  • Advanced network resilience to help maintain good audio quality in challenging network conditions

Additionally, Dolby Voice enables the use of the Dolby Voice Codec (DVC), which offers the following benefits:

  • Improved audio processing
  • Improved audio quality
  • Increased capacity in audio-only conferences, from 50 users to 250 users

The following table presents the codec support for each SDK in the Dolby Voice mode:

Client Platform | Supported codec
Web SDK 3.5 and later | DVC*
Web SDK 3.4 and earlier | Opus
Desktop SDK | DVC
Android SDK 3.0 and later | DVC
iOS SDK 3.0 and later | DVC
React Native SDK | DVC
C++ SDK 1.0 | Opus
C++ SDK 2.0 | DVC**

* Supported only on Chrome and Edge on desktop operating systems. On other browsers and mobile operating systems, the SDK uses Opus.

** Supported only on Apple macOS and Microsoft Windows. On Linux, the SDK uses Opus.
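The codec table and its two footnotes can be read as a lookup. The following is a minimal sketch of that lookup, not part of any SDK: the type names, field names, and the function dolbyVoiceCodec are hypothetical, and the mobile SDKs are assumed to be version 3.0 or later.

```typescript
// Hypothetical types describing a client; none of these names come from the SDK.
type VoiceCodec = "DVC" | "Opus";

type ClientInfo =
  | { sdk: "web"; version: [number, number]; browser: "chrome" | "edge" | "other"; desktopOs: boolean }
  | { sdk: "desktop" | "android" | "ios" | "react-native" }
  | { sdk: "cpp"; majorVersion: number; os: "macos" | "windows" | "linux" };

// Returns the codec the table lists for a client in a Dolby Voice conference.
function dolbyVoiceCodec(client: ClientInfo): VoiceCodec {
  switch (client.sdk) {
    case "web": {
      const [major, minor] = client.version;
      const atLeast35 = major > 3 || (major === 3 && minor >= 5);
      const supportedBrowser = client.browser === "chrome" || client.browser === "edge";
      // Footnote *: DVC only on Chrome and Edge on desktop operating systems.
      return atLeast35 && supportedBrowser && client.desktopOs ? "DVC" : "Opus";
    }
    case "cpp":
      // Footnote **: DVC only on macOS and Windows; C++ SDK 1.0 uses Opus.
      return client.majorVersion >= 2 && client.os !== "linux" ? "DVC" : "Opus";
    default:
      // Desktop, Android, iOS, and React Native SDKs (assumed 3.0 or later).
      return "DVC";
  }
}
```

For example, a Web SDK 3.5 client in Chrome on a desktop operating system maps to DVC, while the same SDK in another browser maps to Opus.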

For more information on SDK support and the deprecation schedule for each SDK version, see the SDK Support article.

By default, SDK 3.3 and later create Dolby Voice conferences. To create a non-Dolby Voice conference, set the dolbyVoice parameter to false when creating the conference.
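As a rough illustration of opting out of Dolby Voice, the sketch below builds the options for a new conference. The interfaces are local stand-ins and the helper buildConferenceOptions is hypothetical. Nesting dolbyVoice under a params object is an assumption about the options shape, so check the reference for your SDK before relying on it.

```typescript
// Local stand-in types; the nesting under `params` is an assumption, not confirmed here.
interface ConferenceParams {
  dolbyVoice: boolean;
}

interface ConferenceOptions {
  alias: string;
  params: ConferenceParams;
}

// Hypothetical helper: Dolby Voice stays enabled unless the caller opts out.
function buildConferenceOptions(alias: string, dolbyVoice: boolean = true): ConferenceOptions {
  return { alias, params: { dolbyVoice } };
}

// Options for a non-Dolby Voice conference that SDK 2.x clients could also join.
const legacyOptions = buildConferenceOptions("weekly-sync", false);

// In an application, these options would be passed to the SDK's conference-creation
// call, for example (assumed Web SDK call): VoxeetSDK.conference.create(legacyOptions)
```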

Dolby Voice vs. Non-Dolby Voice modes

With Dolby Voice, the Communications APIs platform introduces a new way of managing audio streams between clients and the platform. The server mixes all received audio streams and transmits only one audio stream to each participant. For backward compatibility, the platform continues to support SDK 2.x clients with unmixed audio streams. Customers using SDK 3.0 and later can still create conferences in the traditional (non-Dolby Voice) audio processing mode. SDK 2.x clients can participate in conferences created using SDK 3.0 and later only when Dolby Voice is disabled. The following diagram outlines the differences in communication between Dolby Voice and non-Dolby Voice mode:

This graphic illustrates the difference between a non-Dolby Voice and Dolby Voice conference

The following table shows the difference in audio stream transmission between non-Dolby Voice mode and Dolby Voice mode:

Client Platform | Direction | Non-Dolby Voice conference | Dolby Voice conference
Web SDK 3.5 and later | Uplink | Single stereo Opus stream* | Single mono Dolby Voice Codec (DVC) stream**
Web SDK 3.5 and later | Downlink | Multiple stereo Opus streams* | Single multi-channel Dolby Voice Codec (DVC) stream**
Web SDK 3.4 and earlier, C++ SDK 1.0 | Uplink | Single stereo Opus stream* | Single mono Opus stream
Web SDK 3.4 and earlier, C++ SDK 1.0 | Downlink | Multiple stereo Opus streams* | Single stereo Opus stream
Desktop SDK | Uplink | - | Single mono Dolby Voice Codec (DVC) stream
Desktop SDK | Downlink | - | Single multi-channel Dolby Voice Codec (DVC) stream
Android SDK, iOS SDK, React Native SDK | Uplink | Single stereo Opus stream | Single mono Dolby Voice Codec (DVC) stream
Android SDK, iOS SDK, React Native SDK | Downlink | Multiple stereo Opus streams | Single multi-channel Dolby Voice Codec (DVC) stream
C++ SDK 2.0 and later | Uplink | Single stereo Opus stream | Single mono Dolby Voice Codec (DVC) stream***
C++ SDK 2.0 and later | Downlink | Multiple stereo Opus streams | Single multi-channel Dolby Voice Codec (DVC) stream***

* Stereo by default. The stream can be updated to mono using JoinOptions.
** Supported only on Chrome and Edge on desktop operating systems. On other browsers and mobile operating systems, the SDK uses a single mono Opus stream for uplink transmission and a single stereo Opus stream for downlink transmission.
*** Supported only on Apple macOS and Microsoft Windows. On Linux, the SDK uses a single mono Opus stream for uplink transmission and a single stereo Opus stream for downlink transmission.
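The practical effect of server-side mixing is on the number of downlink audio streams each client handles. The sketch below models this under one assumption that the table does not state explicitly: in a non-Dolby Voice conference, a client receives one unmixed stream per remote participant that sends audio. The names ConferenceMode and downlinkAudioStreams are hypothetical.

```typescript
type ConferenceMode = "dolby-voice" | "non-dolby-voice";

// Number of downlink audio streams one client receives.
// Assumption: unmixed mode delivers one stream per remote participant sending audio.
function downlinkAudioStreams(mode: ConferenceMode, remoteSenders: number): number {
  if (remoteSenders < 0 || Math.floor(remoteSenders) !== remoteSenders) {
    throw new RangeError("remoteSenders must be a non-negative integer");
  }
  // Dolby Voice: the server mixes everything into a single stream per participant.
  return mode === "dolby-voice" ? 1 : remoteSenders;
}
```

Under this model, the downlink count in a Dolby Voice conference stays at one however many participants are speaking, while the unmixed count grows with the number of senders.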

Dolby Voice currently does not support spatial capture on any of the client platforms.

Dolby Voice audio processing

By default, the Dolby Voice audio processing algorithm is enabled for Dolby Voice conferences. Dolby Voice is optimized for voice communication and may degrade non-voice audio, such as music. The Web SDK provides an API to disable audio processing if you have background audio or music that needs to be passed through to the conference.

The audioProcessing API includes the AudioProcessingOptions and AudioProcessingSenderOptions, which allow participants to enable and disable audio processing.
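A small sketch of choosing the audio processing setting by content type follows. The two interfaces reuse the type names mentioned above, but their fields (send and audioProcessing) are assumptions about the shape of those types, and audioProcessingFor is a hypothetical helper.

```typescript
// Assumed shapes for the option types named in the text; verify the fields in the SDK reference.
interface AudioProcessingSenderOptions {
  audioProcessing?: boolean;
}

interface AudioProcessingOptions {
  send?: AudioProcessingSenderOptions;
}

type AudioContent = "voice" | "music";

// Hypothetical helper: keep processing for speech, bypass it for music or background audio.
function audioProcessingFor(content: AudioContent): AudioProcessingOptions {
  return { send: { audioProcessing: content === "voice" } };
}

// Options a participant would pass to the audioProcessing API before sharing music.
const musicOptions = audioProcessingFor("music");
```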

APIs not supported with Dolby Voice conference

Because Dolby Voice transmits audio streams differently, the mute API is no longer supported for remote participants when the client connects to a Dolby Voice conference. In SDK 3.2 and later releases, muting all participants is supported in all conferences via the stopAudio API. This API allows conference participants to stop receiving specific audio streams from the server.

Additionally, in Web SDK 3.4 and prior releases, audioLevel is not supported for remote participants when the client connects to a Dolby Voice conference.

The local participant can no longer call the deprecated APIs: an Unsupported exception is raised when they are called on a remote participant in a Dolby Voice conference. If your application relies on any of the functionality above, you can still upgrade to SDK 3.0 and create a non-Dolby Voice conference to use these APIs.
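The rules in this section can be condensed into a small decision helper. This is a sketch with hypothetical names (RemoteAudioControl, remoteAudioControl). Preferring stopAudio whenever it is available is a design choice made for the example, not a documented recommendation.

```typescript
type RemoteAudioControl = "mute" | "stopAudio" | "unsupported";

// Picks a way to silence a remote participant's audio for the local client.
function remoteAudioControl(dolbyVoice: boolean, sdkVersion: [number, number]): RemoteAudioControl {
  const [major, minor] = sdkVersion;
  // stopAudio works in all conferences starting with SDK 3.2.
  const hasStopAudio = major > 3 || (major === 3 && minor >= 2);
  if (hasStopAudio) {
    return "stopAudio";
  }
  // Before 3.2, mute on remote participants works only outside Dolby Voice conferences.
  return dolbyVoice ? "unsupported" : "mute";
}
```

An application on SDK 3.0 or 3.1 that needs remote muting would therefore fall back to creating a non-Dolby Voice conference.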

Webhook events

The Recording.Audio.Available webhook event is available for conferences enabled with Dolby Voice. The platform sends the Recording.Audio.Available event when the conference recording in MP3 format is available for download at the specified URL. The Recording.MP4.Available webhook event continues to work as before for conferences not enabled with Dolby Voice.

The splits element within the new Recording.Audio.Available webhook event includes additional metadata, such as:

  • startTime: The time when the split recording started, in milliseconds since epoch.
  • duration: The duration of the split recording.
  • size: The size of the split recording.

Note: Split recording is only supported for Dolby Voice-enabled conferences.
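A webhook consumer might summarize the splits metadata as sketched below. Only the three field names come from the list above. The surrounding payload shape, the units of duration and size, and the names RecordingSplit, SplitSummary, and summarizeSplits are assumptions or hypothetical.

```typescript
// Fields as listed above; units of duration and size are not specified in this document.
interface RecordingSplit {
  startTime: number; // milliseconds since epoch
  duration: number;
  size: number;
}

interface SplitSummary {
  count: number;
  totalDuration: number;
  totalSize: number;
  firstStart: Date | null;
}

// Aggregates the splits of one Recording.Audio.Available event.
function summarizeSplits(splits: RecordingSplit[]): SplitSummary {
  let totalDuration = 0;
  let totalSize = 0;
  let earliest: number | null = null;
  for (const split of splits) {
    totalDuration += split.duration;
    totalSize += split.size;
    if (earliest === null || split.startTime < earliest) {
      earliest = split.startTime;
    }
  }
  return {
    count: splits.length,
    totalDuration,
    totalSize,
    // startTime is in epoch milliseconds, which is what the Date constructor expects.
    firstStart: earliest === null ? null : new Date(earliest),
  };
}
```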

Quality indicator events

For SDK 2.4, the client generates qualityIndicators events for both audio and video. The server no longer generates quality indicator events because of the load they introduced on the platform.

For SDK 3.0 and later, in conferences enabled with Dolby Voice, the server distributes an audio MOS score collected from the Dolby Voice Conferencing Server for audio participants. The client maps the MOS score to the quality indicator. Opus clients that connect to a Dolby Voice conference no longer have an audio quality score, because the audio is mixed and the server does not generate a MOS score for such clients. Opus clients continue to receive the participant's video quality score if video is enabled.
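The availability rules for SDK 3.0 and later in a Dolby Voice conference can be summarized as a sketch. The names QualityScores and availableQualityScores are hypothetical. Treating the video score for DVC clients the same as for Opus clients is an assumption, because the text above only describes the Opus case.

```typescript
type ClientAudioCodec = "DVC" | "Opus";

interface QualityScores {
  audio: boolean;
  video: boolean;
}

// Which quality scores a client in a Dolby Voice conference can expect to receive.
function availableQualityScores(codec: ClientAudioCodec, videoEnabled: boolean): QualityScores {
  return {
    // The server generates an audio MOS score only for Dolby Voice (DVC) clients.
    audio: codec === "DVC",
    // Assumption: the video score depends only on whether video is enabled.
    video: videoEnabled,
  };
}
```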

Stream events

In Dolby Voice conferences, each conference participant receives only one mixed audio stream from the server. To keep track of the state of the remote participant's audio, the SDK provides a fake audio stream. This fake stream can be observed in the media stream returned in the streamAdded, streamUpdated, and streamRemoved events. There are situations in which the developer should ignore specific stream events that contain this fake stream. Our sample applications have been updated to show the proper way to handle these events. Please refer to this sample for more details.
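This document does not say how to recognize the placeholder stream. The sketch below assumes that it carries no video tracks, so a stream event handler attaches a video element only when the stream has at least one video track. StreamLike and hasRenderableVideo are hypothetical names. In a browser, a MediaStream would satisfy the StreamLike interface.

```typescript
// Minimal structural type; a browser MediaStream has a compatible getVideoTracks method.
interface StreamLike {
  getVideoTracks(): unknown[];
}

// Returns true only when a stream event carries video worth rendering.
// Assumption: the audio-state placeholder stream has no video tracks.
function hasRenderableVideo(stream: StreamLike | null | undefined): boolean {
  return !!stream && stream.getVideoTracks().length > 0;
}
```

A streamAdded or streamUpdated handler could call hasRenderableVideo first and skip creating a video node when it returns false.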

Limitations

After the SDK 3.0 integration:

  • iOS SDK is no longer delivered with bitcode enabled.
  • Web SDK no longer offers audio Mean Opinion Scores (MOS) for clients connected to Dolby Voice conferences.

After the SDK 3.4 integration on iOS SDK:

  • A system-level issue in iOS 14 through iOS 15.3 causes output audio to be rendered in mono for both Dolby Voice and non-Dolby Voice conferences. Spatial conferences are down-mixed and rendered in mono audio on the affected device only.
  • Starting with iOS 15.4, stereo output audio is available in spatial audio-enabled Dolby Voice conferences and in listen-only mode.

SDK distribution

The Android SDK 3.0 and later uses a Voxeet AWS S3 repository for storing the files. Download the Android SDK 3.0 and later from GitHub. Previous versions of the Android SDK are still accessible through Bintray.

