Dolby Voice

Dolby Voice coming to the Interactivity API Platform

Dolby Voice is an award-winning audio communications technology. With Dolby Voice, Dolby has applied its expertise in sight and sound signal processing and compression technologies to provide improvements in voice quality and clarity that make virtual meetings more natural and productive. Starting with the SDK 3.0 release, Dolby Voice is available for Interactivity API customers.

This guide describes the major features that are new in version 3.0 of the Dolby Interactivity APIs Client SDK, and provides guidance on limitations and on migrating your applications to SDK 3.0 with Dolby Voice integration. The upgrade to SDK 3.0 is designed to be backward compatible for all conference participant types, regardless of platform.

The benefits of Dolby Voice are:

  • Advanced audio processing features such as:

    • Dynamic audio leveling
    • Advanced spatial audio
    • Noise and echo reduction
  • Optimized bandwidth utilization
  • Advanced network resilience to help maintain good audio quality in challenging network conditions

For more information on SDK support and the deprecation schedule for SDK 2.x, see the SDK support article.

Note: This migration guide does not provide a comprehensive list of all the changes; it outlines the most important changes in this release. For more information, see the release notes.

Dolby Voice vs. Non-Dolby Voice modes

With Dolby Voice, the Interactivity API platform introduces a new way of managing audio streams between clients and the platform. The server now mixes all received audio streams and transmits only one audio stream to each participant. The platform continues to support SDK 2.x clients with unmixed audio streams for backward compatibility. Customers using SDK 3.0 also have the flexibility to continue to create the conference in the traditional audio processing mode. SDK 2.x clients can participate in conferences created using SDK 3.0 only when Dolby Voice is disabled. The following diagram outlines the differences in communication between Dolby Voice and non-Dolby Voice mode:

Figure: Audio streams map

The following table shows the audio codec difference between non-Dolby Voice mode and Dolby Voice mode:

Client platform | Direction | Customer applications using SDK 2.x or SDK 3.0 connecting to a non-Dolby Voice conference | Customer applications using SDK 3.0 connecting to a Dolby Voice conference
Web | Uplink | One mono Opus stream | Single mono Opus stream
Web | Downlink | Multiple mono Opus streams | Single stereo Opus stream
Mobile native (iOS/Android) | Uplink | One mono Opus stream | Single mono Dolby Voice Codec (DVC) stream
Mobile native (iOS/Android) | Downlink | Multiple mono Opus streams | Single multi-channel Dolby Voice Codec (DVC) stream

Dolby Voice currently does not support spatial capture on any of the client platforms.
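
The codec table and the compatibility rule in this section can be condensed into a small lookup. The following is a minimal sketch rather than part of the SDK: the names STREAM_LAYOUT, describeStreams, and canJoin are hypothetical, and the stream descriptions are copied from the table above.

```javascript
// Hypothetical lookup built from the codec table above; not an SDK API.
const STREAM_LAYOUT = {
  web: {
    nonDolbyVoice: { uplink: "one mono Opus stream", downlink: "multiple mono Opus streams" },
    dolbyVoice: { uplink: "single mono Opus stream", downlink: "single stereo Opus stream" },
  },
  mobile: {
    nonDolbyVoice: { uplink: "one mono Opus stream", downlink: "multiple mono Opus streams" },
    dolbyVoice: {
      uplink: "single mono Dolby Voice Codec (DVC) stream",
      downlink: "single multi-channel Dolby Voice Codec (DVC) stream",
    },
  },
};

// Returns the uplink/downlink description for a client platform ("web" or "mobile").
function describeStreams(platform, dolbyVoice) {
  const layout = STREAM_LAYOUT[platform];
  if (!layout) {
    throw new Error(`Unknown platform: ${platform}`);
  }
  return dolbyVoice ? layout.dolbyVoice : layout.nonDolbyVoice;
}

// SDK 2.x clients can join a conference only when Dolby Voice is disabled;
// SDK 3.0 clients can join either kind of conference.
function canJoin(sdkMajorVersion, dolbyVoice) {
  return sdkMajorVersion >= 3 || !dolbyVoice;
}

console.log(describeStreams("mobile", true).downlink);
console.log(canJoin(2, true));
```

A check like canJoin could be used before enabling Dolby Voice for a conference that SDK 2.x participants still need to join.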

Creating a Dolby Voice conference

By default, with SDK 3.0, the create method creates a non-Dolby Voice conference and there are minimal upgrade requirements for developers, other than updating the SDK. Use the new dolbyVoice conference parameter to create Dolby Voice conferences.

  • JavaScript
  • Swift
  • Java

In JavaScript, the ConferenceParameters model now contains a new dolbyVoice field that indicates whether the application wants to create a conference with Dolby Voice enabled. By default, the field is set to false.

VoxeetSDK.conference.create({
  alias: alias,
  params: {
    dolbyVoice: true,
  },
})
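
To keep the Dolby Voice decision in one place, an application might wrap the create call. The following is a minimal sketch under stated assumptions: buildCreateOptions and createConference are hypothetical helper names, the SDK object is passed in so that the snippet runs without the real library, and the wrapper assumes that create returns a promise, which the example above does not show.

```javascript
// Hypothetical helper: builds the options object passed to conference.create.
// dolbyVoice defaults to false, matching the SDK 3.0 default described above.
function buildCreateOptions(alias, dolbyVoice = false) {
  return {
    alias: alias,
    params: { dolbyVoice: Boolean(dolbyVoice) },
  };
}

// Hypothetical wrapper; `sdk` is expected to expose conference.create
// in the same way VoxeetSDK does in the example above.
async function createConference(sdk, alias, dolbyVoice) {
  return sdk.conference.create(buildCreateOptions(alias, dolbyVoice));
}

// Stand-in for VoxeetSDK so the sketch can run outside a browser.
const fakeSdk = {
  conference: {
    create: async (options) => ({ alias: options.alias, params: options.params }),
  },
};

createConference(fakeSdk, "weekly-sync", true).then((conference) => {
  console.log(conference.params.dolbyVoice);
});
```

In an application, fakeSdk would be replaced by the real VoxeetSDK object.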

In Swift, the VTConferenceOptions model now contains a new dolbyVoice field that indicates whether the application wants to create a conference with Dolby Voice enabled. By default, the field is set to false.

let options = VTConferenceOptions()
options.params.dolbyVoice = true

VoxeetSDK.shared.conference.create(options: options, success: { conference in }, fail: { error in })

In Java, ConferenceCreateOptions.Builder now provides a new addParam method that accepts a dolbyVoice parameter indicating whether the application wants to create a conference with Dolby Voice enabled. By default, the parameter is set to false.

VoxeetSDK.conference().create(
  new ConferenceCreateOptions.Builder()
    .setConferenceAlias(conference_alias)
    .addParam("dolbyVoice", true).build()
).then(createConferenceresult -> {
  //manage the success here
}).error(error -> {
  //manage the error here
});

Dolby Voice audio processing

By default, the Dolby Voice audio processing algorithm is enabled for Dolby Voice conferences. Dolby Voice is optimized for voice communication and may have degraded behavior with non-voice audio, such as music. SDK 3.0 provides a Web API to disable audio processing in the event that you have background audio or music that needs to be passed through to the conference.

The audioProcessing API includes the AudioProcessingOptions and AudioProcessingSenderOptions models, which allow participants to enable and disable audio processing.
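
As a rough illustration of how the two option models might nest, the sketch below builds an options object and passes it to a stub. Both the shape { send: { audioProcessing: <boolean> } } and the conference.audioProcessing(participant, options) call are assumptions that this guide does not confirm; check the SDK reference before relying on them. The helper names are hypothetical.

```javascript
// Assumed shape: AudioProcessingOptions wraps AudioProcessingSenderOptions
// under a `send` key. This nesting is an assumption, not taken from the guide.
function buildAudioProcessingOptions(enabled) {
  return { send: { audioProcessing: Boolean(enabled) } };
}

// Hypothetical wrapper; assumes the SDK exposes
// conference.audioProcessing(participant, options).
function setAudioProcessing(sdk, participant, enabled) {
  return sdk.conference.audioProcessing(participant, buildAudioProcessingOptions(enabled));
}

// Stub SDK that records the last call so the sketch runs without the real library.
const stubSdk = {
  lastCall: null,
  conference: {
    audioProcessing(participant, options) {
      stubSdk.lastCall = { participant, options };
      return Promise.resolve();
    },
  },
};

// Disable processing before passing music through to the conference.
setAudioProcessing(stubSdk, { id: "local" }, false);
console.log(JSON.stringify(stubSdk.lastCall.options));
```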

APIs not supported with Dolby Voice conferences

Because Dolby Voice transmits audio streams differently, the startAudio, stopAudio, isMuted, mute, and audioLevel APIs are no longer supported for remote participants when the client connects to a Dolby Voice conference. The following tables list the support for these APIs for local and remote participants in Dolby Voice and non-Dolby Voice conferences:

Table: non-Dolby Voice conferences
API Web SDK Android and iOS SDK
Local participant Remote participants Local participant Remote participants
startAudio
stopAudio
isMuted - -
mute
audioLevel

Table: Dolby Voice conferences
API Web SDK Android and iOS SDK
Local participant Remote participants Local participant Remote participants
startAudio - - -
stopAudio - - -
isMuted - -
mute - -
audioLevel -

A local participant can no longer call the deprecated APIs. An Unsupported exception is raised when the APIs are called on a remote participant in a Dolby Voice conference. If your application relies on one of the above functionalities, you can still upgrade to SDK 3.0, but create a non-Dolby Voice conference to use the APIs.
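
The decision described above can be sketched as a small check that runs before the conference is created. This is only an illustration: canUseDolbyVoice is a hypothetical helper, and it encodes only the rule stated in this section (the five APIs are unsupported for remote participants in Dolby Voice conferences), not the per-platform details of the tables.

```javascript
// APIs that this guide lists as unsupported for remote participants
// in Dolby Voice conferences.
const REMOTE_APIS_UNSUPPORTED_IN_DOLBY_VOICE = new Set([
  "startAudio",
  "stopAudio",
  "isMuted",
  "mute",
  "audioLevel",
]);

// Hypothetical helper: given the API names that the application calls on
// remote participants, report whether a Dolby Voice conference is a safe choice.
function canUseDolbyVoice(remoteApisUsed) {
  return !remoteApisUsed.some((api) => REMOTE_APIS_UNSUPPORTED_IN_DOLBY_VOICE.has(api));
}

console.log(canUseDolbyVoice(["mute"]));
console.log(canUseDolbyVoice([]));
```

The result could feed the dolbyVoice parameter at conference creation, so that applications relying on these APIs keep creating non-Dolby Voice conferences.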

New webhook events

SDK 3.0 introduces a new Recording.Audio.Available webhook event for conferences enabled with Dolby Voice. The event is sent when the conference recording in MP3 format is available for download at the specified URL. The Recording.MP4.Available webhook event continues to work as before for conferences not enabled with Dolby Voice.

The splits element within the new Recording.Audio.Available webhook event includes additional metadata, such as:

  • startTime: The time when the split recording started, in milliseconds since epoch.
  • duration: The duration of the split recording.
  • size: The size of the split recording.
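
A webhook consumer might turn the splits metadata listed above into a more readable form. The sketch below is illustrative only: summarizeSplits is a hypothetical helper, it assumes that splits is an array of objects carrying the three fields listed above, and the sample values are made up.

```javascript
// Hypothetical helper: turns each entry of `splits` into a readable summary.
// Assumes `splits` is an array of objects with startTime, duration, and size.
function summarizeSplits(splits) {
  return splits.map((split) => ({
    // startTime is in milliseconds since epoch, so Date accepts it directly.
    startedAt: new Date(split.startTime).toISOString(),
    // The guide does not state units for duration or size; pass them through.
    duration: split.duration,
    size: split.size,
  }));
}

// Made-up sample values for illustration only.
const sample = [{ startTime: 0, duration: 1000, size: 2048 }];
console.log(summarizeSplits(sample)[0].startedAt);
```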

Changes to quality indicator events

For SDK 2.4, the client generates qualityIndicators events for both audio and video. The server no longer generates quality indicator events because of the load they placed on the platform.

For SDK 3.0, in conferences enabled with Dolby Voice, the server distributes an audio Mean Opinion Score (MOS) collected from the Dolby Voice Conferencing Server for audio participants, and the client maps the MOS score to the quality indicator. For Opus clients connected to a Dolby Voice conference, the audio is mixed and the server does not generate a MOS score, so these clients no longer have an audio quality score. Opus clients continue to receive the participant's video quality score if video is enabled.

Changes to stream events

In Dolby Voice conferences, each conference participant receives only one mixed audio stream from the server. To keep backward compatibility with customers' implementations, SDK 3.0 introduces a fake audio track for audio transmission. The fake audio track is included in the streamAdded and streamRemoved events. SDK 3.0 takes the audio stream information from the participantAdded and participantUpdated events.

Limitations

After the SDK 3.0 integration:

  • iOS SDK is no longer delivered with bitcode enabled.
  • Web SDK no longer offers audio Mean Opinion Scores (MOS) for clients connected to Dolby Voice conferences.

Changes to SDK distribution

The Android SDK 3.0 uses a new Voxeet AWS S3 repository for storing the files. To download the Android SDK 3.0, use the https://android-sdk.voxeet.com/ link. Previous versions of the Android SDK are still accessible through Bintray.

Troubleshooting Dolby Voice integration

For troubleshooting audio issues with Dolby Voice, the Dolby Voice Client provides an API to generate a state dump. The state dump is a collection of files containing information about the state of the Dolby Voice Client library, including client log information, event log information, and log messages. State dump logs can be provided to Dolby Support for troubleshooting purposes.