Exploring Media Formats

This tutorial explores background concepts and useful tips for making the best use of the Dolby Media Processing APIs in relation to media formats.

Introduction to Media Formats

Whether your application deals with archive media or user-generated content, you may encounter a range of media formats. Digital media usually takes the form of a large binary file in a wrapper called a container. There is a wide range of containers, and of standards for how audio and video are encoded.

Let's start by looking at your own media.

Analyze API

The Analyze API can be used to get information about the media format including details on the container, audio, and video components.

This is a partial example of the data returned for a sample media file:

"result": {
    "media_info": {
        "audio": {
            "bitrate": 256001,
            "channel_order": "L R",
            "channels": 2,
            "codec": "aac",
            "duration": 97.685,
            "sample_rate": 48000
        },
        "container": {
            "bitrate": 1651700,
            "duration": 97.685,
            "kind": "mp4",
            "size": 20168287
        },
        "video": {
            "bitrate": 1391583,
            "codec": "h264",
            "duration": 97.685,
            "frame_rate": 25,
            "height": 720,
            "width": 1280
        }
    }
}
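
As a minimal sketch of how an application might read this structure, the Python below parses a trimmed copy of the sample and builds a one-line summary for each component. The `summarize` helper and the `RESPONSE_TEXT` constant are hypothetical names used for illustration, not part of the API, and the code assumes the response has already been retrieved as JSON text wrapped in an outer object.

```python
import json

# Trimmed copy of the sample result, wrapped in an outer object (an assumption).
RESPONSE_TEXT = """
{
  "result": {
    "media_info": {
      "audio": {"codec": "aac", "channels": 2, "sample_rate": 48000, "duration": 97.685},
      "container": {"kind": "mp4", "size": 20168287, "duration": 97.685},
      "video": {"codec": "h264", "width": 1280, "height": 720, "frame_rate": 25}
    }
  }
}
"""


def summarize(response_text):
    """Return short human-readable lines describing each media component."""
    media_info = json.loads(response_text)["result"]["media_info"]
    lines = []
    container = media_info.get("container")
    if container:
        lines.append(f"container: {container['kind']}, {container['size']} bytes")
    audio = media_info.get("audio")
    if audio:
        lines.append(f"audio: {audio['codec']}, {audio['channels']} ch, {audio['sample_rate']} Hz")
    # Audio-only files may have no video section, so treat it as optional.
    video = media_info.get("video")
    if video:
        lines.append(f"video: {video['codec']}, {video['width']}x{video['height']}")
    return lines


for line in summarize(RESPONSE_TEXT):
    print(line)
```

Using `.get()` for each component keeps the sketch tolerant of files that lack one of the sections.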

Let's see what some of that data means and what you might do with it.

Container Information

The container information includes details about the container type. This is useful for a few reasons. The kind of container can help identify how metadata such as artist and track name is stored within the file. It can also help determine whether a piece of media can be played, or ingested as part of a media workflow or transcoding service.

The following information about the container is determined:

  • kind: a description of the type of media container
  • duration: the media duration (in seconds)
  • bitrate: media bitrate (in bits per second)
  • size: media size (in bytes)
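
These container fields are related to each other: size (in bits) divided by duration should come out close to the reported bitrate. The sketch below, using the sample values from this tutorial, checks that relationship and also tests the container kind against an allowlist. The `ACCEPTED_KINDS` set, the 5% tolerance, and both function names are assumptions made for illustration; substitute whatever your own workflow accepts.

```python
# Container values copied from the sample result in this tutorial.
container = {"bitrate": 1651700, "duration": 97.685, "kind": "mp4", "size": 20168287}

# Hypothetical allowlist: replace with the containers your own workflow accepts.
ACCEPTED_KINDS = {"mp4", "mov", "wav"}


def estimated_bitrate(info):
    """Estimate the overall bitrate (bits per second) from size and duration."""
    return info["size"] * 8 / info["duration"]


def container_problems(info, tolerance=0.05):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if info["kind"] not in ACCEPTED_KINDS:
        problems.append(f"unsupported container kind: {info['kind']}")
    estimate = estimated_bitrate(info)
    # Flag files whose reported bitrate disagrees with size / duration.
    if abs(estimate - info["bitrate"]) > tolerance * info["bitrate"]:
        problems.append("reported bitrate does not match size and duration")
    return problems


print(round(estimated_bitrate(container)))
print(container_problems(container))
```

For the sample values the estimate is about 1.65 Mbit/s, which agrees with the reported container bitrate, so no problems are listed.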

Audio Information

The audio information includes details about the audio stream. As with the container type, the audio type is important for understanding whether a file is compatible with any downstream processing your application might require, such as transcoding.

You can also use these details as an indicator of audio quality, for example when heavy data compression has been used. The size and duration are also useful to show in a user interface, helping end users understand whether media has been truncated or otherwise falls short of the creator's intent.

If the durations of the audio and video differ significantly, this can also indicate a problem. The number of channels tells you whether channels are missing or whether multichannel audio is available.

The following information about the audio stream is determined:

  • duration: audio duration (in seconds)
  • codec: identifier for the compression codec
  • sample_rate: audio sample rate (in Hz), generally 44100 or 48000
  • bitrate: audio bitrate (in bits per second)
  • channels: the number of channels, generally mono (1) or stereo (2)
  • channel_order: a comma-separated list of channel identifiers if there are more than two channels
  • bit_depth: generally 16, 24, or 32-bit
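
The checks described in this section can be sketched as a small function that turns the audio fields into warnings. Everything about the thresholds here is an assumption chosen for illustration (two expected channels, a 0.5 second duration drift, a 96 kbit/s floor), and `audio_warnings` is a hypothetical helper, not part of the API.

```python
# Values copied from the sample result in this tutorial.
audio = {"bitrate": 256001, "channels": 2, "codec": "aac",
         "duration": 97.685, "sample_rate": 48000}
video = {"codec": "h264", "duration": 97.685}


def audio_warnings(audio, video=None, expected_channels=2,
                   max_drift=0.5, min_bitrate=96000):
    """Return warnings about the audio stream; thresholds are illustrative only."""
    warnings = []
    if audio["channels"] < expected_channels:
        warnings.append(f"expected {expected_channels} channels, found {audio['channels']}")
    if audio["bitrate"] < min_bitrate:
        warnings.append("low bitrate suggests heavy compression")
    # Video is optional because audio-only files have no video stream.
    if video is not None and abs(audio["duration"] - video["duration"]) > max_drift:
        warnings.append("audio and video durations differ")
    return warnings


print(audio_warnings(audio, video))
```

Returning a list of messages rather than a single pass/fail flag makes it easy to surface each issue separately in a user interface.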

Video Information

The video information includes details about the video stream. As with the audio and container types, the video type is important for understanding whether a file is compatible with downstream processing, transcoding, and delivery. For audio-only files, the video information may not be returned.

The video details can provide hints about the quality of the video. For example, heavy data compression can cause distortion and other undesirable effects. Presenting the duration and file size in a user interface gives a good indication of whether delivery was successful.

The following information about the video stream is determined:

  • codec: identifies the compression codec
  • frame_rate: the number of frames per second (as a floating-point number)
  • height: picture height (in pixels)
  • width: picture width (in pixels)
  • duration: video duration (in seconds)
  • bitrate: video bitrate (in bits per second)
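
One rough way to turn these fields into a compression hint is bits per pixel: the video bitrate divided by width × height × frame rate. This is a common rule of thumb rather than anything the API reports, and the helper names below are hypothetical. The sketch uses the sample values from this tutorial and returns nothing for files without video information.

```python
# Video values copied from the sample result in this tutorial.
video = {"bitrate": 1391583, "codec": "h264", "duration": 97.685,
         "frame_rate": 25, "height": 720, "width": 1280}


def bits_per_pixel(info):
    """Average encoded bits spent on each pixel of each frame."""
    pixels_per_second = info["width"] * info["height"] * info["frame_rate"]
    return info["bitrate"] / pixels_per_second


def describe_video(info):
    """Return a summary string, or None when no video information is present."""
    if not info:
        return None
    return (f"{info['width']}x{info['height']} at {info['frame_rate']} fps, "
            f"{bits_per_pixel(info):.3f} bits per pixel")


print(describe_video(video))
```

Lower values generally mean more aggressive compression for a given codec, but what counts as acceptable depends on the codec and the content, so any threshold would need tuning against your own media.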