Dolby Voice

Dolby Voice on the Dolby.io Communications APIs Platform

Dolby Voice is an award-winning audio communications technology. With Dolby Voice, Dolby has applied its expertise in sight and sound signal processing and compression technologies to provide improvements in voice quality and clarity that make virtual meetings more natural and productive. Starting with the SDK 3.0 release, Dolby Voice is available for Communications APIs customers.

This guide describes the major features that are new in version 3.0 of the Dolby.io Communications Client SDKs, explains the limitations, and shows how to migrate your applications to SDK 3.0 with Dolby Voice integration. The upgrade to SDK 3.0 is designed to be backward compatible for all conference participant types, regardless of platform.

The benefits of Dolby Voice are:

  • Advanced audio processing features such as:
    • Dynamic audio leveling
    • Advanced spatial audio
    • Noise and echo reduction
  • Optimized bandwidth utilization
  • Advanced network resilience to help maintain good audio quality in challenging network conditions

For more information on SDK support and the deprecation schedule for SDK 2.x, see the SDK Support article.

Note: This migration guide does not provide a comprehensive list of all the changes, but outlines the most important changes in this release. For more information, see the release notes.

Dolby Voice vs. Non-Dolby Voice modes

With Dolby Voice, the Communications APIs platform introduces a new way of managing audio streams between clients and the platform. The server now mixes all received audio streams and transmits only one audio stream to each participant. The platform continues to support SDK 2.x clients with unmixed audio streams for backward compatibility. Customers using SDK 3.0 also have the flexibility to continue to create the conference in the traditional audio processing mode. SDK 2.x clients can participate in conferences created using SDK 3.0 only when Dolby Voice is disabled. The following diagram outlines the differences in communication between Dolby Voice and non-Dolby Voice mode:

Figure: The difference between a non-Dolby Voice conference and a Dolby Voice conference.

The following table shows the audio codec difference between non-Dolby Voice mode and Dolby Voice mode:

Client platform             | Direction | Non-Dolby Voice conference    | Dolby Voice conference
----------------------------|-----------|-------------------------------|--------------------------------------------------
Web                         | Uplink    | Single stereo Opus stream*    | Single mono Opus stream
Web                         | Downlink  | Multiple stereo Opus streams* | Single stereo Opus stream
Native Desktop              | Uplink    | -                             | Single mono Dolby Voice Codec (DVC) stream
Native Desktop              | Downlink  | -                             | Single multi-channel Dolby Voice Codec (DVC) stream
Mobile native (iOS/Android) | Uplink    | Single stereo Opus stream     | Single mono Dolby Voice Codec (DVC) stream
Mobile native (iOS/Android) | Downlink  | Multiple stereo Opus streams  | Single multi-channel Dolby Voice Codec (DVC) stream

*Stereo by default. The stream can be updated to mono using JoinOptions.

Dolby Voice currently does not support spatial capture on any of the client platforms.
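
The compatibility rule described in this section (SDK 2.x clients can join a conference created with SDK 3.0 only when Dolby Voice is disabled) can be expressed as a small check. This is a minimal sketch: the function name and its parameters are hypothetical and are not part of the SDK, and the check ignores platform-specific rows of the table, such as the Native Desktop SDK.

```javascript
// Hypothetical helper, not part of the SDK: applies the compatibility rule
// from this section to a client's major SDK version.
function canJoinConference(sdkMajorVersion, conferenceUsesDolbyVoice) {
  if (sdkMajorVersion >= 3) {
    // SDK 3.0 clients can use both Dolby Voice and non-Dolby Voice conferences.
    return true;
  }
  // SDK 2.x clients can participate only when Dolby Voice is disabled.
  return !conferenceUsesDolbyVoice;
}
```

An application that serves a mix of SDK 2.x and SDK 3.0 clients could use a check like this before deciding whether to create a conference with Dolby Voice enabled.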

Creating a Dolby Voice conference

By default, with SDK 3.0, the create method creates a non-Dolby Voice conference and there are minimal upgrade requirements for developers, other than updating the SDK. Use the new dolbyVoice conference parameter to create Dolby Voice conferences.

/*
  ConferenceParameters model now contains a new dolbyVoice field that 
  indicates whether the application wishes to create a conference with 
  Dolby Voice enabled. By default the field is set to false.
*/

await VoxeetSDK.conference.create({
 alias: alias,
 params: {
  dolbyVoice: true,
 },
});

// VTConferenceOptions model now contains a new dolbyVoice
// field that indicates whether the application wishes to create 
// a conference with Dolby Voice enabled. By default the field is 
// set to false.

let options = VTConferenceOptions()
options.params.dolbyVoice = true

VoxeetSDK.shared.conference.create(options: options, success: { conference in }, fail: { error in })

/*
  ParamsHolder model now contains a new setDolbyVoice
  method with a flag that indicates whether the application
  wishes to create a conference with Dolby Voice enabled. By default
  the flag is set to false.
*/

ParamsHolder paramsHolder = new ParamsHolder()
 .setDolbyVoice(true);
ConferenceCreateOptions conferenceCreateOptions = new ConferenceCreateOptions.Builder()
 .setConferenceAlias(conference_alias)
 .setParamsHolder(paramsHolder)
 .build();

VoxeetSDK.conference().create(conferenceCreateOptions)
 .then(conference -> {
   // manage the success here
 }).error(error -> {
   // manage the error here
 });
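
Because dolbyVoice defaults to false, it can help to build the create options in one place so that the choice of mode is explicit. The following is a minimal sketch for the Web SDK: buildCreateOptions is a hypothetical helper, and only the alias and params.dolbyVoice fields shown in the sample above are assumed.

```javascript
// Hypothetical helper: builds the options object passed to
// VoxeetSDK.conference.create, with Dolby Voice off unless requested.
function buildCreateOptions(alias, dolbyVoice = false) {
  if (typeof alias !== "string" || alias.length === 0) {
    throw new TypeError("alias must be a non-empty string");
  }
  return {
    alias: alias,
    params: {
      // Mirrors the SDK default: false creates a non-Dolby Voice conference.
      dolbyVoice: Boolean(dolbyVoice),
    },
  };
}

// Usage with the Web SDK (not executed in this sketch):
// await VoxeetSDK.conference.create(buildCreateOptions("my-alias", true));
```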

Dolby Voice audio processing

By default, the Dolby Voice audio processing algorithm is enabled for Dolby Voice conferences. Dolby Voice is optimized for voice communication and may have degraded behavior with non-voice audio, such as music. SDK 3.0 provides a Web API to disable audio processing in the event that you have background audio or music that needs to be passed through to the conference.

The audioProcessing API uses the AudioProcessingOptions and AudioProcessingSenderOptions models, which allow participants to enable and disable audio processing.
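
As a rough illustration, disabling audio processing for the local participant might look like the sketch below. It rests on assumptions that should be checked against the Web SDK reference: that audioProcessing is a method on the conference service taking a participant and an options object, and that AudioProcessingOptions nests an AudioProcessingSenderOptions under a send field with an audioProcessing boolean. The two helper functions are hypothetical.

```javascript
// Assumed shape: AudioProcessingOptions { send: AudioProcessingSenderOptions },
// where AudioProcessingSenderOptions is { audioProcessing: boolean }.
function buildAudioProcessingOptions(enabled) {
  return { send: { audioProcessing: Boolean(enabled) } };
}

// Hypothetical wrapper: conferenceApi stands in for VoxeetSDK.conference.
// The audioProcessing(participant, options) signature is an assumption.
function setLocalAudioProcessing(conferenceApi, localParticipant, enabled) {
  return conferenceApi.audioProcessing(
    localParticipant,
    buildAudioProcessingOptions(enabled)
  );
}

// Usage with the Web SDK (not executed in this sketch), for example before
// sharing music:
// await setLocalAudioProcessing(VoxeetSDK.conference, VoxeetSDK.session.participant, false);
```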

APIs not supported with Dolby Voice conference

Because audio streams are transmitted differently in Dolby Voice, the isMuted, mute, and audioLevel APIs are no longer supported for remote participants when the client connects to a Dolby Voice conference. The following tables list the support of these APIs for local and remote participants in Dolby Voice and non-Dolby Voice conferences:

Table: non-Dolby Voice conferences
API Web SDK Android and iOS SDK
Local participant Remote participants Local participant Remote participants
isMuted - -
mute
audioLevel

Table: Dolby Voice conferences
API Web SDK Desktop SDK Android and iOS SDK
Local participant Remote participants Local participant Remote participants Local participant Remote participants
isMuted - - -
mute -* -* -*
audioLevel -

*If you wish to mute remote participants in Dolby Voice conferences, we recommend using the stopAudio API. This API allows the conference participants to stop receiving specific audio streams from the server.

Note: Starting with the SDK 3.2 release, the startAudio and stopAudio methods are fully supported in Dolby Voice conferences.

A local participant can no longer call these APIs on remote participants: an Unsupported exception is raised when the APIs are called on a remote participant in a Dolby Voice conference. If your application relies on one of the above functionalities, you can still upgrade to SDK 3.0, but create a non-Dolby Voice conference to use the APIs.
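
An application that supports both conference modes can route its "silence this participant" action according to the mode. The sketch below is illustrative only: silenceRemoteParticipant is a hypothetical helper, conferenceApi stands in for VoxeetSDK.conference, and the mute(participant, isMuted) and stopAudio(participant) signatures are assumptions to verify against the SDK reference. Note that stopAudio only stops the local client from receiving the stream, so the effect differs from mute.

```javascript
// Hypothetical helper: picks the API that is supported for remote
// participants in the current conference mode.
function silenceRemoteParticipant(conferenceApi, participant, isDolbyVoice) {
  if (isDolbyVoice) {
    // mute is unsupported for remote participants in Dolby Voice conferences;
    // stopAudio stops receiving this participant's audio from the server.
    return conferenceApi.stopAudio(participant);
  }
  // Non-Dolby Voice conference: mute remains available.
  return conferenceApi.mute(participant, true);
}
```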

New webhook events

SDK 3.0 introduces a new Recording.Audio.Available webhook event for conferences enabled with Dolby Voice. The platform sends the Recording.Audio.Available event when the conference recording in MP3 format is available for download at the specified URL. The Recording.MP4.Available webhook event continues to work as before for conferences not enabled with Dolby Voice.

The splits element within the new Recording.Audio.Available webhook event includes additional metadata, such as:

  • startTime: The time when the split recording started, in milliseconds since epoch.
  • duration: The duration of the split recording.
  • size: The size of the split recording.

Note: Split recording is only supported for Dolby Voice-enabled conferences.
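
A webhook receiver could read the splits metadata along the following lines. This is a sketch under assumptions: the eventType field name and the overall payload layout are not specified in this guide and are hypothetical, as is the summarizeAudioRecording function. Only the splits element and its startTime, duration, and size fields come from the description above. The units of duration and size are not stated here, so the values are passed through unchanged.

```javascript
// Hypothetical handler for a parsed webhook body. Returns null for other
// events, or one summary object per split recording.
function summarizeAudioRecording(event) {
  // "eventType" is an assumed field name; check the webhook reference.
  if (!event || event.eventType !== "Recording.Audio.Available") {
    return null;
  }
  const splits = Array.isArray(event.splits) ? event.splits : [];
  return splits.map((split) => ({
    // startTime is in milliseconds since epoch, so Date accepts it directly.
    startedAt: new Date(split.startTime).toISOString(),
    duration: split.duration,
    size: split.size,
  }));
}
```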

Changes to quality indicator events

In SDK 2.4, the client generates qualityIndicators events for both audio and video. The server no longer generates quality indicator events because of the load they placed on the platform.

In SDK 3.0, for conferences enabled with Dolby Voice, the server distributes an audio MOS score collected from the Dolby Voice Conferencing Server for audio participants, and the client maps the MOS score to the quality indicator. For Opus clients connecting to a Dolby Voice conference, the audio is mixed and the server does not generate a MOS score, so these clients no longer have an audio quality score. Opus clients continue to receive the participant's video quality score if video is enabled.
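
This guide does not document how the client maps the MOS score to the quality indicator. Purely to illustrate the idea, the sketch below rounds a MOS value to the nearest integer on a 1 to 5 scale and returns null when no score is available, as is the case for Opus clients in a Dolby Voice conference. The function name, the 1 to 5 range, and the rounding rule are all assumptions and may not match the SDK's behavior.

```javascript
// Illustrative only: the SDK's real mapping is not described in this guide.
function mosToQualityIndicator(mos) {
  if (typeof mos !== "number" || Number.isNaN(mos)) {
    // No audio score, for example an Opus client in a Dolby Voice conference.
    return null;
  }
  // Round to the nearest integer and clamp to the assumed 1..5 range.
  return Math.min(5, Math.max(1, Math.round(mos)));
}
```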

Changes to stream events

In Dolby Voice conferences, each conference participant receives only one mixed audio stream from the server. To keep track of the state of the remote participant's audio, SDK 3.0 introduces a fake audio stream. This fake stream can be observed in the media stream returned in the streamAdded, streamUpdated, and streamRemoved events. There are situations in which the developer should ignore specific stream events that contain this fake stream. Our sample applications have been updated to show the proper way to handle these events. Please refer to this sample for more details.
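
This guide does not say how to recognize the fake audio stream, so the referenced sample remains the authoritative source. One plausible approach, shown here as an assumption rather than the SDK's prescribed method, is to update the video UI only when the MediaStream delivered with the event carries at least one video track. getVideoTracks is the standard MediaStream method; shouldRenderStream is a hypothetical helper.

```javascript
// Hypothetical filter for streamAdded / streamUpdated handlers: returns true
// only when the stream has video to display. An audio-only placeholder
// stream has no video tracks and is skipped.
function shouldRenderStream(stream) {
  return (
    Boolean(stream) &&
    typeof stream.getVideoTracks === "function" &&
    stream.getVideoTracks().length > 0
  );
}

// Usage with the Web SDK (not executed in this sketch):
// VoxeetSDK.conference.on("streamAdded", (participant, stream) => {
//   if (shouldRenderStream(stream)) { /* attach stream to a video element */ }
// });
```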

Limitations

After the SDK 3.0 integration:

  • iOS SDK is no longer delivered with bitcode enabled.
  • Web SDK no longer offers audio Mean Opinion Scores (MOS) for clients connected to Dolby Voice conferences.

Changes to SDK distribution

The Android SDK 3.0 uses a new Voxeet AWS S3 repository for storing the files. To download the Android SDK 3.0, use the https://github.com/voxeet/voxeet-sdk-android link. The previous versions of the Android SDK are still accessible through bintray.

