Dolby Voice is an award-winning audio communications technology. With Dolby Voice, Dolby has applied its expertise in sight and sound signal processing and compression technologies to provide improvements in voice quality and clarity that make virtual meetings more natural and productive. Starting with the SDK 3.0 release, Dolby Voice is available for Communications APIs customers.
The benefits of Dolby Voice are:
- Advanced audio processing features such as:
  - Dynamic audio leveling
  - Advanced spatial audio
  - Noise and echo reduction
- Optimized bandwidth utilization
- Advanced network resilience to help maintain good audio quality in challenging network conditions
Additionally, Dolby Voice allows using the Dolby Voice Codec (DVC), which offers the following benefits:
- Improved audio processing
- Improved audio quality
- Increased conference capacity from 50 users to 250 users in audio-only conferences.
The following table presents the codec support for each SDK in the Dolby Voice mode:
Web SDK 3.5 and later
Web SDK 3.4 and earlier
Android SDK 3.0 and later
iOS SDK 3.0 and later
React Native SDK
C++ SDK 1.0
C++ SDK 2.0
* Supported only on Chrome and Edge on desktop operating systems. On other browsers and mobile operating systems, the SDK uses Opus.
** Supported only on Apple macOS and Microsoft Windows. On Linux, the SDK uses Opus.
For more information on SDK support and the deprecation schedule for each SDK version, see the SDK Support article.
By default, SDK 3.3 and later creates Dolby Voice conferences. To create a non-Dolby Voice conference, set the dolbyVoice parameter to false when creating a new conference.
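As a minimal sketch, the options for such a conference might be assembled as follows. The helper `buildConferenceOptions`, the alias value, and the reduced `ConferenceOptions` shape are illustrative assumptions. The commented-out `VoxeetSDK.conference.create` call shows where the object would typically be passed in a Web SDK application and should be checked against the SDK reference.

```typescript
// Reduced options shape, limited to the fields used in this sketch.
// Assumption: it mirrors the conference creation options of the Web SDK.
interface ConferenceOptions {
  alias: string;
  params: { dolbyVoice: boolean };
}

// Hypothetical helper. The default of true reflects that SDK 3.3 and later
// create Dolby Voice conferences unless told otherwise.
function buildConferenceOptions(alias: string, dolbyVoice: boolean = true): ConferenceOptions {
  return { alias, params: { dolbyVoice } };
}

// Options for a non-Dolby Voice conference.
const legacyOptions = buildConferenceOptions("team-standup", false);

// In an application, the object would then be handed to the SDK, for example:
// const conference = await VoxeetSDK.conference.create(legacyOptions);
console.log(JSON.stringify(legacyOptions));
```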
With Dolby Voice, the Communications APIs platform introduces a new way of managing audio streams between clients and the platform. The server mixes all received audio streams and transmits only one audio stream to each participant. For backward compatibility, the platform continues to support SDK 2.x clients with unmixed audio streams. Customers using SDK 3.0 and later can also continue to create conferences in the traditional audio processing mode. SDK 2.x clients can participate in conferences created using SDK 3.0 and later only when Dolby Voice is disabled. The following diagram outlines the differences in communication between Dolby Voice and non-Dolby Voice mode:
The following table shows the differences in audio stream transmission between non-Dolby Voice mode and Dolby Voice mode:
|Client Platform|Direction|Non-Dolby Voice conference|Dolby Voice conference|
|---|---|---|---|
|Web SDK 3.5 and later|Uplink|Single stereo Opus stream*|Single mono Dolby Voice Codec (DVC) stream**|
|Web SDK 3.5 and later|Downlink|Multiple stereo Opus streams*|Single multi-channel Dolby Voice Codec (DVC) stream**|
|Web SDK 3.4 and earlier, C++ SDK 1.0|Uplink|Single stereo Opus stream*|Single mono Opus stream|
|Web SDK 3.4 and earlier, C++ SDK 1.0|Downlink|Multiple stereo Opus streams*|Single stereo Opus stream|
|Desktop SDK|Uplink|-|Single mono Dolby Voice Codec (DVC) stream|
|Desktop SDK|Downlink|-|Single multi-channel Dolby Voice Codec (DVC) stream|
|React Native SDK|Uplink|Single stereo Opus stream|Single mono Dolby Voice Codec (DVC) stream|
|React Native SDK|Downlink|Multiple stereo Opus streams|Single multi-channel Dolby Voice Codec (DVC) stream|
|C++ SDK 2.0 and later|Uplink|Single stereo Opus stream|Single mono Dolby Voice Codec (DVC) stream***|
|C++ SDK 2.0 and later|Downlink|Multiple stereo Opus streams|Single multi-channel Dolby Voice Codec (DVC) stream***|
* Stereo by default. The stream can be updated to mono using JoinOptions.
** Supported only on Chrome and Edge on desktop operating systems. On other browsers and mobile operating systems, the SDK uses a single mono Opus stream for uplink transmission and a single stereo Opus stream for downlink transmission.
*** Supported only on Apple macOS and Microsoft Windows. On Linux, the SDK uses a single mono Opus stream for uplink transmission and a single stereo Opus stream for downlink transmission.
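The platform restrictions in the footnotes can be expressed as a small lookup. The sketch below only restates the rules given in this article for the Web and C++ SDKs in a Dolby Voice conference. The function names, the `WebClient` shape, and the lowercase browser and operating system labels are hypothetical and not part of any SDK.

```typescript
type Codec = "DVC" | "Opus";

// Hypothetical description of a Web SDK client.
interface WebClient {
  sdkMajor: number;
  sdkMinor: number;
  browser: string;   // lowercase label, for example "chrome", "edge", "firefox"
  desktop: boolean;  // false for mobile operating systems
}

// Web SDK 3.5 and later use DVC only on Chrome and Edge on desktop.
// Earlier Web SDK versions and all other browsers fall back to Opus.
function webDolbyVoiceCodec(client: WebClient): Codec {
  const atLeast35 =
    client.sdkMajor > 3 || (client.sdkMajor === 3 && client.sdkMinor >= 5);
  const dvcBrowser = client.browser === "chrome" || client.browser === "edge";
  return atLeast35 && dvcBrowser && client.desktop ? "DVC" : "Opus";
}

// C++ SDK 2.0 and later use DVC on macOS and Windows; Linux and C++ SDK 1.0 use Opus.
function cppDolbyVoiceCodec(sdkMajor: number, os: "macos" | "windows" | "linux"): Codec {
  return sdkMajor >= 2 && os !== "linux" ? "DVC" : "Opus";
}

console.log(webDolbyVoiceCodec({ sdkMajor: 3, sdkMinor: 5, browser: "chrome", desktop: true }));
console.log(cppDolbyVoiceCodec(2, "linux"));
```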
Dolby Voice currently does not support spatial capture on any of the client platforms.
By default, the Dolby Voice audio processing algorithm is enabled for Dolby Voice conferences. Dolby Voice is optimized for voice communication and may have degraded behavior with non-voice audio, such as music. The SDK provides a Web API to disable audio processing in the event that you have background audio or music that needs to be passed through to the conference.
Because Dolby Voice transmits audio streams differently, the mute API is no longer supported for remote participants when the client connects to a Dolby Voice conference. In SDK 3.2 and later, muting all participants is supported in all conferences via the stopAudio API, which lets conference participants stop receiving specific audio streams from the server.
Additionally, in Web SDK 3.4 and prior releases, audioLevel is not supported for remote participants when the client connects to a Dolby Voice conference.
The local participant can no longer call the deprecated APIs. An Unsupported exception is raised when the APIs are called on a remote participant in a Dolby Voice conference. If your application relies on any of these features, you can still upgrade to SDK 3.0 and create a non-Dolby Voice conference to use the APIs.
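As a rough illustration of the stopAudio approach, the sketch below stops receiving audio from every remote participant. It assumes the conference service exposes a `stopAudio(participant)` call that returns a promise. The `AudioControl` interface, the `stopRemoteAudio` helper, and the stub conference are hypothetical stand-ins so that the example runs without the SDK.

```typescript
interface Participant {
  id: string;
}

// Assumption: reduced to the single call used here; stands in for the
// SDK's conference service.
interface AudioControl {
  stopAudio(participant: Participant): Promise<void>;
}

// Hypothetical helper: requests that the server stop sending audio for
// every participant except the local one.
function stopRemoteAudio(
  conference: AudioControl,
  participants: Participant[],
  localId: string
): Promise<void[]> {
  const remote = participants.filter((p) => p.id !== localId);
  return Promise.all(remote.map((p) => conference.stopAudio(p)));
}

// Stub that records which participants were affected.
const stopped: string[] = [];
const stubConference: AudioControl = {
  stopAudio: (p) => {
    stopped.push(p.id);
    return Promise.resolve();
  },
};

stopRemoteAudio(stubConference, [{ id: "local" }, { id: "alice" }, { id: "bob" }], "local");
```

In an application, the stub would be replaced by the SDK's conference service and the participant list by the conference's current participants.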
The Recording.Audio.Available webhook event is available for conferences enabled with Dolby Voice. The event is sent when the conference recording in MP3 format is available for download at the specified URL. The Recording.MP4.Available webhook event continues to work as before for conferences not enabled with Dolby Voice.
The splits element within the new Recording.Audio.Available webhook event includes additional metadata, such as:
- startTime: The time when the split recording started, in milliseconds since epoch.
- duration: The duration of the split recording.
- size: The size of the split recording.
Note: Split recording is only supported for Dolby Voice-enabled conferences.
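A webhook consumer might read the splits metadata along these lines. This is a sketch under assumptions: only the three field names come from the list above, while the surrounding payload shape, the `splitStartDates` helper, and the sample values are invented for illustration. The units of duration and size are not stated here, so the sketch leaves them as raw numbers.

```typescript
// Field names follow the metadata listed above.
interface RecordingSplit {
  startTime: number; // milliseconds since epoch
  duration: number;  // unit not specified in this article
  size: number;      // unit not specified in this article
}

// Hypothetical, reduced view of the Recording.Audio.Available payload.
interface RecordingAudioAvailable {
  splits?: RecordingSplit[];
}

// Converts each split's startTime into a Date. Returns an empty list
// when the event carries no splits.
function splitStartDates(event: RecordingAudioAvailable): Date[] {
  return (event.splits ?? []).map((split) => new Date(split.startTime));
}

// Illustrative sample values only.
const sampleEvent: RecordingAudioAvailable = {
  splits: [{ startTime: 1650000000000, duration: 1200, size: 48000 }],
};

console.log(splitStartDates(sampleEvent).map((d) => d.toISOString()));
```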
For SDK 2.4, the client generates qualityIndicators events for both audio and video. The server no longer generates quality indicator events because of the load they placed on the platform.
For SDK 3.0 and later, in conferences enabled with Dolby Voice, the server distributes an audio Mean Opinion Score (MOS) collected from the Dolby Voice Conferencing Server for audio participants. The client maps the MOS to the quality indicator. Because the audio is mixed and the server does not generate a MOS for Opus clients connected to a Dolby Voice conference, these clients no longer have an audio quality score. Opus clients continue to receive the participant's video quality score if video is enabled.
In Dolby Voice conferences, each conference participant receives only one mixed audio stream from the server. To keep track of the state of the remote participant's audio, the SDK provides a fake audio stream. This fake stream can be observed in the media stream returned in the streamAdded, streamUpdated, and streamRemoved events. There are situations in which the developer should ignore specific stream events that contain this fake stream. Our sample applications have been updated to show the proper way to handle these events. Please refer to this sample for more details.
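One possible way to ignore such events is to check whether the stream carries any video tracks before rendering it. This check is an assumption made for illustration and is not taken from this article; the referenced sample remains the authoritative guide. `StreamLike`, `hasRenderableVideo`, and `onStreamEvent` are hypothetical names, and the stub streams stand in for the media streams delivered with the events.

```typescript
// Minimal view of a media stream: only the method this sketch needs.
interface StreamLike {
  getVideoTracks(): unknown[];
}

// Assumption: a stream without video tracks has nothing to render and
// can be skipped by video-rendering code.
function hasRenderableVideo(stream: StreamLike | null | undefined): boolean {
  return !!stream && stream.getVideoTracks().length > 0;
}

// Hypothetical handler body for streamAdded and streamUpdated events.
// Returns true when the stream was passed on for rendering.
function onStreamEvent(
  participantId: string,
  stream: StreamLike,
  render: (id: string) => void
): boolean {
  if (!hasRenderableVideo(stream)) return false;
  render(participantId);
  return true;
}

// Stub streams: one audio-only placeholder and one with a video track.
const rendered: string[] = [];
const placeholderStream: StreamLike = { getVideoTracks: () => [] };
const cameraStream: StreamLike = { getVideoTracks: () => [{}] };

onStreamEvent("alice", placeholderStream, (id) => rendered.push(id));
onStreamEvent("bob", cameraStream, (id) => rendered.push(id));
```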
After the SDK 3.0 integration:
- iOS SDK is no longer delivered with bitcode enabled.
- Web SDK no longer offers audio Mean Opinion Scores (MOS) for clients connected to Dolby Voice conferences.
After the SDK 3.4 integration on iOS SDK:
- A system-level issue in iOS 14 through iOS 15.3 results in mono output audio on the iOS platform for both Dolby Voice and non-Dolby Voice conferences. Spatial conferences are down-mixed and rendered in mono audio on the affected devices only.
- Since iOS 15.4, stereo output audio is available in Dolby Voice conferences with spatial audio enabled and in listen-only mode.
The Android SDK 3.0 and later uses a Voxeet AWS S3 repository for storing the files. Download the Android SDK 3.0 and later from GitHub. Previous versions of the Android SDK are still accessible through Bintray.