Dolby.io Media Processing APIs were created to make your audio better, at scale. To deliver great audio, you need the tools and workflow automation to address your growing content libraries and meet consumer expectations. Whether your content is noisy, isn’t the right volume, or just doesn’t “feel right”, we’re here to help. Our APIs analyze your audio, figure out how to optimally enhance it, and apply just the right amount of processing to give you a professional, natural sound.
Do you need to:
- make your audio sound better without being an audio expert?
- make listening easier by removing background noise and boosting speech?
- improve speech quality by balancing tone and removing harshness?
- conform your media to meet broadcasting platform recommendations and standards?
These sorts of requirements can be achieved with the Media Enhance API. If you are building an application where users expect high quality sonic clarity, this API is a valuable tool to use in web and mobile applications at cloud-scale.
See the Quick Start to Enhancing Media to get started.
Do you need to know:
- how to measure for overall audio quality?
- if there are specific problems such as silent channels?
- general media information such as framerate and bitrate?
- the percentage of speech or music?
Those sorts of questions and more can be answered with the Media Diagnose API. If you want a quick summary of your media, including insight into potential problems, then try the Diagnose API.
See the Quick Start to Diagnosing Media to get started.
Do you need to know:
- what type of media you have?
- whether media platforms will accept your media?
- whether there is clipping or other audio noise artifacts?
- how much of the media is speech, music, or silence?
These sorts of questions and more can be answered with the Media Analyze API. If you are building an application that accepts user-generated content or just have a large collection of media to understand at cloud-scale, this API is a valuable tool to have.
See the Quick Start to Analyzing Media to get started.
Do you need to know:
- the number of talkers in your media and when they are talking?
- loudness of each talker so that you can loudness correct if needed?
- quality score of each talker to identify if a talker's setup has a systemic problem?
- useful talker metrics like talk-listen-ratio?
These sorts of questions can be answered by the Speech Analytics API, a specialized API for media with dominant speech content and a valuable tool for getting insights about the speech in your media. The Media Analyze API may be used to identify whether your media has dominant speech before calling the Speech Analytics API.
See the Quick Start to Analyzing Speech to get started.
The Media Processing APIs typically follow a similar pattern.
- Use your API keys to authenticate
- Ensure your media is shared to be readable and writable by our services
- Make an asynchronous API call
Let's look at these steps.
To use the Media Processing APIs, you need to authorize your application's requests. There are two approaches for authentication:
- API Key Authentication
- OAuth Bearer Token Authentication
See the Authentication guide for additional detail on these methods.
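As a rough illustration of the two approaches, the sketch below builds request headers for each one. The helper name `auth_headers` is hypothetical, and the header names (`x-api-key` and `Authorization: Bearer ...`) are assumptions for illustration; check the Authentication guide for the exact names the services expect.

```python
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None,
                 bearer_token: Optional[str] = None) -> Dict[str, str]:
    """Build HTTP headers for one of the two authentication approaches.

    The header names used here are assumptions, not confirmed by this page.
    """
    if bearer_token:
        # OAuth bearer token authentication
        return {"Authorization": f"Bearer {bearer_token}"}
    if api_key:
        # API key authentication
        return {"x-api-key": api_key}
    raise ValueError("provide either api_key or bearer_token")


# Placeholder credentials, for illustration only
key_headers = auth_headers(api_key="YOUR_API_KEY")
token_headers = auth_headers(bearer_token="YOUR_ACCESS_TOKEN")
```

Whichever approach you choose, the resulting headers would be sent with every request to a media endpoint.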
To process your media, we need to be able to read and write it. There are a number of ways to achieve this:
- Use your existing cloud storage such as AWS or GCP
- Use our temporary cloud storage
See the Media Input and Output guide for examples of how to do this.
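To make the idea concrete, here is a minimal sketch of a job request body that names one readable location and one writable location. The `input` and `output` field names, the helper `build_job_body`, and the example.com URLs are all assumptions for illustration; the Media Input and Output guide describes the formats that are actually supported.

```python
import json


def build_job_body(input_url: str, output_url: str) -> str:
    """Serialize a job request naming where to read and where to write media.

    The "input" and "output" field names are assumptions for illustration.
    """
    if not input_url or not output_url:
        raise ValueError("both input_url and output_url are required")
    return json.dumps({"input": input_url, "output": output_url})


body = build_job_body(
    "https://example.com/presigned/read/original.wav",   # placeholder readable URL
    "https://example.com/presigned/write/enhanced.wav",  # placeholder writable URL
)
```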
To process media, you first start a processing job and then wait for that job to complete. There are two approaches to handling this within your applications: polling for status, or receiving a notification.
The polling approach has a few steps:
- POST to a media endpoint to start processing
- GET to the same endpoint to check progress
- Repeat step 2 until the job is complete
With polling, GET requests are repeated while waiting on the returned status. The expected status values include:
- Success - this status indicates the result is ready
- Running - your media is being processed, check back again soon
- Pending - your media is waiting for an available resource to run it
- Failed - there was a problem and you'll see an error with some additional notes about what the cause might be
You can repeat the GET request as frequently as desired to check on the status of the job and inspect the progress value.
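The polling loop described above can be sketched as follows. This is a minimal illustration, not a definitive client: `wait_for_job` is a hypothetical helper, the `status`, `progress`, and `error` response keys are assumed names, and `get_status` stands in for whatever function performs the GET request in your application.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(get_status: Callable[[], Dict[str, Any]],
                 interval_seconds: float = 5.0,
                 timeout_seconds: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call get_status() repeatedly until the job reports Success or Failed."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = get_status()  # stands in for a GET to the media endpoint
        status = result.get("status")
        if status == "Success":
            return result
        if status == "Failed":
            raise RuntimeError(f"job failed: {result.get('error')}")
        if status not in ("Pending", "Running"):
            raise ValueError(f"unexpected status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_seconds)  # wait before checking again


# Simulated responses in place of real GET requests
responses = iter([
    {"status": "Pending", "progress": 0},
    {"status": "Running", "progress": 40},
    {"status": "Success", "progress": 100},
])
final = wait_for_job(lambda: next(responses),
                     interval_seconds=0,
                     sleep=lambda seconds: None)
print(final["status"])  # prints "Success"
```

Passing the status function and the sleep function as parameters keeps the loop independent of any particular HTTP library and makes it easy to exercise without a network connection.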
As an alternative to polling, you can receive a notification when a job is complete. This can be specified at the time of submission as a one-time callback or registered with the platform as a webhook to be fired for every job.
The Webhooks and Callbacks platform guide provides additional details about how to set up and receive these notifications.
The more advanced algorithms used in some cases need some time to work their magic.
For example, with the Media Enhance API, processing may take more or less time than the duration of the media itself, depending on the media's length:
- a 60 second input file may take 80 seconds to complete
- a 5 minute input file may take 3 minutes and 30 seconds to complete
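One way to read these two figures is as a ratio of processing time to media duration. The short sketch below works through that arithmetic; the helper name `realtime_factor` is hypothetical, and the numbers are only the two examples listed above, not a guarantee of performance.

```python
def realtime_factor(processing_seconds: float, media_seconds: float) -> float:
    """Return processing time divided by media duration (below 1.0 is faster than real time)."""
    if media_seconds <= 0:
        raise ValueError("media_seconds must be positive")
    return processing_seconds / media_seconds


# (processing seconds, media seconds) taken from the two examples above
examples = {
    "60 second file": (80, 60),
    "5 minute file": (210, 300),
}
for name, (processing, media) in examples.items():
    print(f"{name}: {realtime_factor(processing, media):.2f}x")
# prints "60 second file: 1.33x" then "5 minute file: 0.70x"
```

A rough ratio like this can help when choosing a polling interval or timeout for media of a given length.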
Our researchers are always working to make processing algorithms more efficient and decrease processing time overall.
If all goes well you will get a status of Success.
Each Media Processing API may return results in a different way. Some examples:
You should check the API Reference for any services you intend to call to understand how results are returned.