Guide to Using the Media Analyze API

Media Analyze API

The Media Analyze API takes your media and delivers insights into the audio quality as a time series of processed regions of your media.

Key features:

✓ General media info
✓ Clipping sections
✓ Loudness (time-series)
✓ Bandwidth
✓ Signal-to-noise ratio
✓ Content classification (time-series)
✓ Musical key, instrument, and genre identification

Why use Media Analyze API?

Use the Media Analyze API when you need to determine:

  • what type of media is in a collection
  • whether media platforms will accept your media
  • whether there is clipping or other audio noise artifacts
  • how much of the media is speech, music, or silence

Samples

Usage

See the Analyze API reference for more detailed explanations of these values.

Media info

The media_info section gives you details about the container and codec. See the Media File Formats guide for more explanation of these values.

        "media_info": {
            "container": {
                "kind": "mp4",
                "duration": 10801.645,
                "bitrate": 79674,
                "size": 107575636
            },
            "audio": {
                "codec": "aac",
                "channels": 2,
                "channel_order": "L R",
                "sample_rate": 44100,
                "duration": 10801.621223993765,
                "bitrate": 78286
            }
        }
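
As a rough illustration of how these fields relate, the sketch below cross-checks the sample values. The helper name summarize_media_info is hypothetical, and the units (size in bytes, duration in seconds, bitrate in bits per second) are assumptions that this guide does not state; check the Analyze API reference before relying on them.

```python
# Sample values copied from the media_info block above.
media_info = {
    "container": {"kind": "mp4", "duration": 10801.645,
                  "bitrate": 79674, "size": 107575636},
    "audio": {"codec": "aac", "channels": 2, "channel_order": "L R",
              "sample_rate": 44100, "duration": 10801.621223993765,
              "bitrate": 78286},
}


def summarize_media_info(info):
    """Hypothetical helper: derive a few sanity-check figures from media_info."""
    container, audio = info["container"], info["audio"]
    return {
        "format": f'{container["kind"]}/{audio["codec"]}',
        # Assumption: size is in bytes and duration is in seconds.
        "estimated_bitrate": container["size"] * 8 / container["duration"],
        # Difference between the container and audio stream durations.
        "duration_gap": abs(container["duration"] - audio["duration"]),
    }


summary = summarize_media_info(media_info)
print(summary)
```

For the sample values, the estimated bitrate lands very close to the reported container bitrate, which is consistent with the unit assumptions above.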

Clipping

The clipping section alerts you to any clipping in the file. See the Clipping audio guide for more explanation on how to interpret these results.

"clipping": {
    "num_sections": 0,
    "sections": []
}
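
A minimal sketch of checking this block in code follows. The sample above contains no clipped sections, so the fields of an individual section are not shown here and the helper (a hypothetical name) only inspects the count and the list length.

```python
# Sample values copied from the clipping block above.
clipping = {"num_sections": 0, "sections": []}


def has_clipping(block):
    """Hypothetical helper: True if the clipping block reports any section."""
    return block["num_sections"] > 0 or len(block["sections"]) > 0


print(has_clipping(clipping))
```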

Loudness

The loudness section gives you details about the loudness of the media. See the Loudness audio guide for more explanation on how to interpret these results.

"loudness": {
    "measured": -15.27,
    "range": 4.31,
    "gating_mode": "speech",
    "sample_peak": -0.0,
    "true_peak": 0.07,
    "time_series": [
        [
            0.0,
            -120.0,
            -4.23,
            -4.22
        ],
        [
            1.0,
            -120.0,
            -8.06,
            -7.95
        ],
        ...
    ]
}
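
The time_series rows can be sliced by column, as in the sketch below. It assumes the first value in each row is a timestamp in seconds, which the sample suggests but this guide does not confirm; the meaning of the remaining three values is not described here, so the code only indexes them by position. The names series_column and peak_exceeds, and the -1.0 ceiling, are hypothetical.

```python
# Sample values copied from the loudness block above (first two rows only).
loudness = {
    "measured": -15.27,
    "range": 4.31,
    "gating_mode": "speech",
    "sample_peak": -0.0,
    "true_peak": 0.07,
    "time_series": [
        [0.0, -120.0, -4.23, -4.22],
        [1.0, -120.0, -8.06, -7.95],
    ],
}


def series_column(time_series, index):
    """Pair the first value of each row (assumed timestamp) with one column."""
    return [(row[0], row[index]) for row in time_series]


def peak_exceeds(block, ceiling=-1.0):
    """Hypothetical check: is the reported true peak above a chosen ceiling?"""
    return block["true_peak"] > ceiling


last_column = series_column(loudness["time_series"], 3)
print(last_column)
print(peak_exceeds(loudness))
```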

Bandwidth and noise

See the Noise audio guide for more explanation on how to interpret these results.

            "bandwidth": 11197,
            "noise": {
                "snr_average": 82.42,
                "level_average": -101.87
            },
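
One way to put the bandwidth figure in context is to compare it with the Nyquist frequency, which is half the sample rate reported in media_info. The sketch below assumes bandwidth is expressed in Hz, which this guide does not state, and bandwidth_fraction is a hypothetical helper name.

```python
def bandwidth_fraction(bandwidth_hz, sample_rate_hz):
    """Return the bandwidth as a fraction of the Nyquist frequency."""
    nyquist_hz = sample_rate_hz / 2
    return bandwidth_hz / nyquist_hz


# Sample values from the bandwidth and media_info blocks above.
fraction = bandwidth_fraction(11197, 44100)
print(f"{fraction:.1%} of the Nyquist frequency")
```

Under that assumption, the sample file uses roughly half of the frequency range its sample rate allows.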

Content classification

The speech and silence blocks give context about the media file and the type of media it contains.

            "speech": {
                "percentage": 94.0,
                "num_sections": 149,
                "sections": [
                    {
                        "section_id": "sp_1",
                        "start": 0.0,
                        "duration": 150.19
                    },
                    {
                        "section_id": "sp_2",
                        "start": 157.74,
                        "duration": 126.29
                    },
                    {
                        "section_id": "sp_3",
                        "start": 286.04,
                        "duration": 61.65
                    },
                    ...
                ]
            },
            "silence": {
                "percentage": 1.64,
                "num_sections": 56,
                "sections": [
                    {
                        "section_id": "si_1",
                        "start": 734.92,
                        "duration": 2.1,
                        "channels": [
                            "ch_0",
                            "ch_1"
                        ]
                    },
                    {
                        "section_id": "si_2",
                        "start": 813.98,
                        "duration": 2.12,
                        "channels": [
                            "ch_0",
                            "ch_1"
                        ]
                    },
                    ...
                ]
            }
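
Because each section carries a start and a duration, the gaps between consecutive sections can be computed directly. The sketch below uses the three speech sections shown above and assumes both fields are in seconds; gaps_between is a hypothetical helper name.

```python
# First three speech sections copied from the sample above.
speech_sections = [
    {"section_id": "sp_1", "start": 0.0, "duration": 150.19},
    {"section_id": "sp_2", "start": 157.74, "duration": 126.29},
    {"section_id": "sp_3", "start": 286.04, "duration": 61.65},
]


def gaps_between(sections):
    """Return (gap_start, gap_length) pairs between consecutive sections."""
    ordered = sorted(sections, key=lambda s: s["start"])
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        prev_end = prev["start"] + prev["duration"]
        gap = nxt["start"] - prev_end
        if gap > 0:
            # Round to two decimals to match the precision of the sample.
            gaps.append((round(prev_end, 2), round(gap, 2)))
    return gaps


gaps = gaps_between(speech_sections)
print(gaps)
```

For these three sections, the gaps come out at roughly 7.55 and 2.01 seconds.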

Music

The music section identifies the sections that contain music and, for each one, reports the detected key, genre, and instruments, each with a confidence score.

            "music": {
                "percentage": 34.79,
                "num_sections": 35,
                "sections": [
                    {
                        "section_id": "mu_1",
                        "start": 0.0,
                        "duration": 13.44,
                        "loudness": -16.56,
                        "bpm": 222.22,
                        "key": [
                            [
                                "Ab major",
                                0.72
                            ]
                        ],
                        "genre": [
                            [
                                "hip-hop",
                                0.17
                            ],
                            [
                                "rock",
                                0.15
                            ],
                            [
                                "punk",
                                0.13
                            ]
                        ],
                        "instrument": [
                            [
                                "vocals",
                                0.17
                            ],
                            [
                                "guitar",
                                0.2
                            ],
                            [
                                "drums",
                                0.05
                            ],
                            [
                                "piano",
                                0.04
                            ]
                        ]
                    },
                    ...
                ]
            }
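
The key, genre, and instrument entries are lists of [label, score] pairs, and in the sample the instrument list is not ordered by score. A minimal sketch for picking the highest-scoring label follows; top_label is a hypothetical helper name, and it assumes the second element of each pair is the confidence score.

```python
# Label lists copied from the first music section in the sample above.
music_section = {
    "section_id": "mu_1",
    "key": [["Ab major", 0.72]],
    "genre": [["hip-hop", 0.17], ["rock", 0.15], ["punk", 0.13]],
    "instrument": [["vocals", 0.17], ["guitar", 0.2],
                   ["drums", 0.05], ["piano", 0.04]],
}


def top_label(pairs):
    """Return the (label, score) pair with the highest score, or None."""
    if not pairs:
        return None
    label, score = max(pairs, key=lambda pair: pair[1])
    return label, score


for field in ("key", "genre", "instrument"):
    print(field, top_label(music_section[field]))
```

The scores in the sample are low for genre and instrument, so it may be worth applying a minimum score of your own choosing before acting on a label.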

For more on the Analyze API, see our blog, which has articles detailing how the API can help you analyze your audio and video content.