Recording Audio on Android with Examples
Megan Ren

Knowing how to record audio effectively from a phone is valuable for mobile developers, and it is essential for apps that use services like Dolby.io to process media. The wide range of hardware in Android devices can make it difficult to develop applications that need to capture and play back audio. However, Android provides several media frameworks that abstract the audio recording process, and others have developed external libraries that make it possible to incorporate high-performance audio into an application. This article explains how to use the MediaRecorder, MediaPlayer, AudioRecord, and AudioTrack frameworks in Java, touches briefly on other options for audio capture and playback, and reviews their respective pros and cons.

Note: You can find a sample application that uses these classes in the dolbyio-samples/blog-android-audio-recording-examples repository.

What is MediaRecorder?

MediaRecorder is Android’s high-level framework for capturing audio and/or video. It records to a file directly, which can then be played back using MediaPlayer (covered later in this post). An application specifies several parameters, namely the encoding and the file location, and MediaRecorder handles the rest. While relatively simple to configure, MediaRecorder offers minimal customizability, and is best for simple use cases when audio is not central to the functionality of the app. 

How to use MediaRecorder

The following steps and code samples demonstrate how to use MediaRecorder to record audio to a file in an application’s internal storage. If you need to record video as well, see the official Android guide on the Camera API.

This example, as well as the following examples for other audio classes, will follow a general outline with steps as defined below: 

  1. Declare permissions (only needed once for an application)
  2. Instantiate/configure the object
  3. Attempt to start recording/playback

As with all applications that need access to an audio input, declare the permission in the AndroidManifest.xml file. On API level 23 and above, RECORD_AUDIO must also be granted at run time, so the application should request it from the user, for example the first time the user interacts with the recording feature.

<uses-permission android:name="android.permission.RECORD_AUDIO" />

Next, instantiate MediaRecorder and set the necessary properties. Most commonly, this consists of setting an audio source, an output format, a file path, and an encoder. The order in which these calls are made matters; note that the file path and audio encoders can only be specified after an output format is set. Failure to pay attention to this order may result in IllegalStateExceptions being thrown. 

After the above properties are configured, the app should attempt to prepare the MediaRecorder object. If preparation is successful, the MediaRecorder is ready to use and can be started. Once started, the recording can be stopped, or paused and subsequently resumed (pausing and resuming require API level 24 or higher). The state diagram in the official MediaRecorder documentation depicts how the different method calls transition the MediaRecorder between states.

Below is an example of how we might use MediaRecorder in a simple application. The startRecording() method can be called inside the onClickListener of a button, and takes as a parameter the location where the recording should be saved – for internal application storage, this can be obtained by calling getFilesDir().getPath() and appending a file name and format.

Note: Many audio sources apply at least minimal processing to the raw stream by default. To record only the unprocessed signal, use the UNPROCESSED source, or VOICE_RECOGNITION for older devices that don’t support the unprocessed property.

protected MediaRecorder recorder;
 
private void startRecording(String fileName) {
        // initialize and configure MediaRecorder; the order of these calls matters
        recorder = new MediaRecorder();
        recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
        recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP); // must be set before the output file and encoder
        recorder.setOutputFile(fileName);
        recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);

        try {
            recorder.prepare();
        }
        catch (IOException | IllegalStateException e) {
            // handle error, then free the recorder so an unprepared instance is never started
            recorder.release();
            recorder = null;
            return;
        }

        recorder.start();
}

When the recording is finished, the application should release resources back to the operating system as soon as possible. The code below demonstrates how to properly end a recording. 

private void stopRecording() {
        // stop recording and free up resources
        recorder.stop();
        recorder.release();
 
        recorder = null;
}

The final recording can be found at the storage location specified by the file path. To play back this recording, we can use MediaPlayer. 

What is MediaPlayer?

MediaPlayer is MediaRecorder’s counterpart on Android. Given a URI, URL, or reference to a file, it plays audio or video with both minimal setup and minimal customizability. Once initialized, MediaPlayer can be started, paused, and stopped, providing straightforward playback. For a complete list of media formats supported by the Android platform, see the official documentation.

How to use MediaPlayer

An instance of MediaPlayer can be initialized in one of two ways: by instantiating it with the corresponding constructor and configuring the object, or by calling a convenience method create() that takes a data source. If choosing the latter, be aware that the create() method prepares the media synchronously, which may cause the UI thread to freeze. The examples in this section will demonstrate the first method. 

As with MediaRecorder, MediaPlayer has many possible configurations and states that should be managed carefully to avoid errors. Refer to the state diagram and the rest of the official MediaPlayer documentation.

After instantiating the MediaPlayer object, the application can set any desired audio attributes, such as specifying the media usage and content. A data source must then be set, which can be a raw resource directly from your application, a path to an audio file in internal storage, or a URL of media to be streamed over the internet. Only after setting a data source can the MediaPlayer be prepared. For local files, this is acceptable to do synchronously, but for streaming purposes, the MediaPlayer should be prepared asynchronously and an OnPreparedListener should be set. The code below continues where our MediaRecorder example left off, showing how to play an audio file saved in an app’s internal storage. 

protected MediaPlayer player;
 
 
private void startPlaying(String filePath) {
        player = new MediaPlayer();
        try {
            player.setDataSource(filePath); // pass reference to file to be played
            player.setAudioAttributes(new AudioAttributes.Builder().setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                                                                   .setUsage(AudioAttributes.USAGE_MEDIA)
                                                                   .build()); // optional step
            player.prepare(); // may take a while depending on the media, consider using .prepareAsync() for streaming
        }
        catch (IOException | IllegalArgumentException e) { // thrown for invalid or inaccessible resources
            // handle error, then free the player so an unprepared instance is never started
            player.release();
            player = null;
            return;
        }

        player.start();
}

As with MediaRecorder, it’s good practice to release the resources MediaPlayer uses once finished. When a user action, like a button press, ends playback, stop the player as shown below. Otherwise, set an OnCompletionListener for MediaPlayer to release resources once the player reaches the end of the media source.

private void stopPlaying() {
        player.stop();
        player.release(); // free up resources
 
        player = null;
}

What is AudioRecord?

AudioRecord removes a layer of abstraction between the application and a device’s audio hardware, recording uncompressed audio with no way to write directly to a file. These APIs are the lowest level audio framework for Android that can still be used in the Java and Kotlin layer. While MediaRecorder performs its data writing operations inside a black box, AudioRecord requires an application to periodically read the newest audio data from the AudioRecord object’s internal buffer. While this lower-level framework creates more complexity, it also allows applications to build more advanced audio functionality. 

How to use AudioRecord

After the application declares and obtains permission to record audio, an AudioRecord object can be initialized by passing several parameters into the constructor. (Alternatively, AudioRecord.Builder can be used, but the process is essentially the same for both.) The constructor takes flags indicating the audio source, the sample rate in Hertz, whether the channel configuration is stereo or mono, the audio encoding format, and the size of the internal buffer for audio data. There are a few settings that are guaranteed to be supported on all Android devices; refer to the documentation to maximize compatibility.

Note: Make use of the getMinBufferSize() method when creating an AudioRecord object to ensure the internal buffer is sufficiently large given the device hardware. Also be sure to check that initialization didn’t fail silently before continuing. Both of these are demonstrated in the code sample below. Also note that we plan to write byte arrays, so we use 8-bit encoding, but be careful: this encoding is not necessarily supported across devices. Use short arrays and 16-bit encoding to be fully sure of compatibility.

static final int AUDIO_SOURCE = MediaRecorder.AudioSource.MIC; // for raw audio, use MediaRecorder.AudioSource.UNPROCESSED, see note in MediaRecorder section
static final int SAMPLE_RATE = 44100;
static final int CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO;
static final int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_8BIT;
static final int BUFFER_SIZE_RECORDING = AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT);
 
protected AudioRecord audioRecord;
 
private void startRecording() {
 
        audioRecord = new AudioRecord(AUDIO_SOURCE, SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT, BUFFER_SIZE_RECORDING);
 
        if (audioRecord.getState() != AudioRecord.STATE_INITIALIZED) { // check for proper initialization
            Log.e(TAG, "error initializing AudioRecord");
            audioRecord.release();
            audioRecord = null;
            return;
        }
 
        audioRecord.startRecording();
         
}
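The size returned by getMinBufferSize() is expressed in bytes, so how much audio it holds depends on the sample rate, channel count, and encoding. The helper below is a minimal sketch of that arithmetic in plain Java; the class name PcmMath and the 4096-byte figure in the usage note are hypothetical, since the real minimum buffer size varies by device.

```java
/** Hypothetical helper for reasoning about PCM buffer sizes. */
final class PcmMath {

    /**
     * Returns how many milliseconds of uncompressed PCM audio fit in a buffer.
     *
     * @param bufferSizeBytes size of the buffer in bytes
     * @param sampleRate      samples per second per channel, e.g. 44100
     * @param channels        1 for mono, 2 for stereo
     * @param bytesPerSample  1 for 8-bit PCM, 2 for 16-bit PCM
     */
    static double bufferDurationMillis(int bufferSizeBytes, int sampleRate,
                                       int channels, int bytesPerSample) {
        int bytesPerSecond = sampleRate * channels * bytesPerSample;
        return 1000.0 * bufferSizeBytes / bytesPerSecond;
    }
}
```

For example, PcmMath.bufferDurationMillis(4096, 44100, 1, 1) is roughly 92.9, meaning a 4096-byte buffer of 8-bit mono audio at 44.1 kHz fills in under a tenth of a second. Switching to 16-bit samples halves that duration for the same buffer size.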

To actually obtain the audio data, a separate thread should be dedicated to polling the AudioRecord object to avoid freezing the app’s UI. Create a new Thread with a custom implementation of Runnable. Inside the run() method, we continuously read data from AudioRecord into a buffer until an external event (e.g., the press of a button) indicates we should stop. In the code below, after the AudioRecord object has started recording, the writeAudioData() method is called inside the run() method of the thread we just made. While this post won’t go into detail on how to use threading, you can read through this Android guide to learn more. 

We take the raw audio bytes and write them to a file. AudioRecord uses PCM encoding, so that’s the kind of data the file will hold. The read() method fills the passed-in array with up to the number of bytes requested in the third parameter, and returns the number of bytes successfully read, or a negative error code on failure. The thread will block until enough samples have been captured to deliver the requested number, meaning that whatever operations are done with the filled buffer should be completed before the next batch of samples is ready. Here, we perform a file write for the sake of simplicity, but to ensure that no data is missed, we would ideally handle this writing in a separate thread with an independent buffer.
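One way to move the file write off the recording thread is to hand each chunk to a queue that a second thread drains. The class below is a sketch of that idea in plain Java, not part of the sample application: AsyncChunkWriter and its methods are hypothetical names, and it accepts any OutputStream so it can be tried without an Android device.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical helper: passes audio chunks to a background thread that writes them out. */
final class AsyncChunkWriter {
    // Sentinel object that tells the writer thread to stop; compared by identity.
    private static final byte[] END_OF_STREAM = new byte[0];

    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private final Thread writerThread;

    AsyncChunkWriter(OutputStream out) {
        writerThread = new Thread(() -> {
            try {
                byte[] chunk;
                while ((chunk = queue.take()) != END_OF_STREAM) {
                    out.write(chunk);
                }
                out.flush();
            } catch (IOException e) {
                // a real app would report the failure back to the recording code
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "audio-file-writer");
        writerThread.start();
    }

    /** Copies the first length bytes so the caller can reuse its buffer immediately. */
    void submit(byte[] buffer, int length) {
        if (length > 0) {
            queue.add(Arrays.copyOf(buffer, length));
        }
    }

    /** Signals the end of the recording and waits until every queued chunk is written. */
    void finish() {
        queue.add(END_OF_STREAM);
        try {
            writerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With something like this in place, the recording loop would call submit(data, read) after each read() and finish() once recording stops, then close the stream itself. The queue here is unbounded for simplicity, so memory use grows if storage is slower than capture; a bounded queue is a reasonable alternative when that is a concern.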

As with all media capture frameworks, don’t forget to release the resources used by AudioRecord when finished. 

private void writeAudioData(String fileName) { // to be called in a Runnable for a Thread created after call to startRecording()
 
        byte[] data = new byte[BUFFER_SIZE_RECORDING/2]; // read in chunks smaller than AudioRecord's internal buffer

        FileOutputStream outputStream;

        try {
            outputStream = new FileOutputStream(fileName); //fileName is path to a file, where audio data should be written
        } catch (FileNotFoundException e) {
            // handle error; without an output file there is nowhere to write, so clean up and return
            audioRecord.stop();
            audioRecord.release();
            audioRecord = null;
            return;
        }

        // continueRecording is toggled by a button press on the main (UI) thread, so declare it volatile
        while (continueRecording) {
            int read = audioRecord.read(data, 0, data.length);
            if (read < 0) { // negative values are error codes, not byte counts
                Log.e(TAG, "error reading from AudioRecord: " + read);
                break;
            }
            try {
                outputStream.write(data, 0, read);
            }
            catch (IOException e) {
                Log.d(TAG, "exception while writing to file");
                e.printStackTrace();
            }
        }
 
        try {
            outputStream.flush();
            outputStream.close();
        }
        catch (IOException e) {
            Log.d(TAG, "exception while closing output stream " + e.toString());
            e.printStackTrace();
        }
 
        // Clean up
        audioRecord.stop();
        audioRecord.release();
        audioRecord = null;
 
    }
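If you follow the earlier note and record with ENCODING_PCM_16BIT, the samples are read into a short array rather than a byte array, so they have to be packed into bytes before they can be written to a file. The sketch below shows one way to do that in plain Java. The class name PcmBytes is hypothetical, and little-endian byte order is an assumption made here because it is the order WAV files use.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for converting 16-bit PCM samples into bytes. */
final class PcmBytes {

    /** Packs the first count samples into little-endian bytes, two bytes per sample. */
    static byte[] fromShorts(short[] samples, int count) {
        ByteBuffer buffer = ByteBuffer.allocate(count * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < count; i++) {
            buffer.putShort(samples[i]); // low byte first, then high byte
        }
        return buffer.array();
    }
}
```

The count parameter mirrors the value returned by read(), so only the samples that were actually captured are converted.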

What is AudioTrack?

Just like MediaRecorder and MediaPlayer, AudioRecord and AudioTrack can be used in tandem. The flow of data is pretty much the opposite of that for AudioRecord – PCM data from a file or other source is periodically pushed by the application to the AudioTrack object, which sends it to the device’s hardware to be consumed and played. AudioTrack can be used to either stream audio continuously or play short sounds that fit in memory (for example, sound effects in a mobile game). 

How to use AudioTrack

Construct an instance of AudioTrack by passing the constructor parameters to configure the object, similar to AudioRecord. For API levels less than 21, the constructor takes specifications like the sample rate, channel configuration type, and optionally a session ID to control which AudioEffects are applied to specific instances of AudioTrack or other media players. For newer API levels, however, this constructor is deprecated, and an application should use either AudioTrack.Builder (API 23) or an AudioTrack constructor that takes AudioAttributes and AudioFormat objects (API 21).

The example below uses the constructor method to initialize AudioTrack. While the static variables may look identical to the ones declared for AudioRecord, note the differences in the channel configuration (CHANNEL_OUT_MONO) and buffer size (AudioTrack.getMinBufferSize()) flags that indicate these parameters are used for output, not input. 

Note: The same disclaimer above applies – we’re using 8-bit encoding at the risk of losing support on some Android devices, just to simplify the data writing process. One last thing to note in the constructor of AudioTrack: the media being played is from a file, too large to fit in memory, so the streaming mode is more appropriate for this use case. 

static final int SAMPLE_RATE = 44100;
static final int CHANNEL_CONFIG = AudioFormat.CHANNEL_OUT_MONO;
static final int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_8BIT;
static final int BUFFER_SIZE_PLAYING = AudioTrack.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT);
 
protected AudioTrack audioTrack;
 
private void startPlaying() {
 
    AudioAttributes audioAttributes = new AudioAttributes.Builder()
                                    .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH) // defines the type of content being played
                                    .setUsage(AudioAttributes.USAGE_MEDIA) // defines the purpose of why audio is being played in the app
                                    .build();
 
    AudioFormat audioFormat = new AudioFormat.Builder()
                            .setEncoding(AudioFormat.ENCODING_PCM_8BIT) // we plan on reading byte arrays of data, so use the corresponding encoding
                            .setSampleRate(SAMPLE_RATE)
                            .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
                            .build();
 
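    audioTrack = new AudioTrack(audioAttributes, audioFormat, BUFFER_SIZE_PLAYING, AudioTrack.MODE_STREAM, AudioManager.AUDIO_SESSION_ID_GENERATE);

    if (audioTrack.getState() != AudioTrack.STATE_INITIALIZED) { // check for proper initialization
        Log.e(TAG, "error initializing AudioTrack");
        return;
    }

    audioTrack.play(); // in streaming mode, audio is played as data is written to the track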
 
}

To push data to the AudioTrack object to be played, we can follow the same pattern as we did for AudioRecord. Inside a dedicated thread, override the run method and make a call to readAudioData(), which does the bulk of the work. In it, we open a file input stream to read bytes of data from a recording, and write that data to the AudioTrack object. The write method is overloaded; see the documentation for other ways to give data to AudioTrack. Lastly, clean up memory and resources after playback has finished. 

private void readAudioData(String fileName) { // fileName is the path to the file where the audio data is located
 
        byte[] data = new byte[BUFFER_SIZE_PLAYING/2]; // small buffer size to not overflow AudioTrack's internal buffer
 
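        FileInputStream fileInputStream;

        try {
            fileInputStream = new FileInputStream(new File(fileName));
        }
        catch (IOException e) {
            // handle exception; without an input file there is nothing to play, so clean up and return
            audioTrack.release();
            audioTrack = null;
            return;
        }

        try {
            int bytesRead;
            while ((bytesRead = fileInputStream.read(data)) != -1) { // run until file ends
                audioTrack.write(data, 0, bytesRead); // write only the bytes that were actually read
            }
        }
        catch (IOException e) {
            // handle exception
        }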
 
        try {
            fileInputStream.close();
        }
        catch (IOException e) {
            // handle exception
        }
 
        audioTrack.stop();
        audioTrack.release();
        audioTrack = null;
}

Other options

While MediaRecorder and AudioRecord are the only built-in ways to record audio, they are by no means the only ones available to Android developers. Widely used libraries include ExoPlayer as an alternative to MediaPlayer and several C++ libraries for high performance audio.

ExoPlayer is an open source library for media playback, maintained by Google but not distributed as part of the Android SDK. Its structure is easily extendable and has features that are especially useful for streaming media over the internet. An instance of ExoPlayer takes custom MediaSource objects that can be built to correspond to the type and properties of the media, allowing an app to create custom configurations and maximize quality. To explore more capabilities of ExoPlayer, refer to the official developer guides. If your audio flow includes recording using AudioRecord and playing back using ExoPlayer, keep in mind that ExoPlayer doesn’t support raw PCM files that lack a container header (an easy fix is to add a WAV header to the raw file).
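As a rough illustration of the WAV header fix, the sketch below builds the standard 44-byte RIFF/WAVE header for uncompressed PCM in plain Java. The class name WavHeader is hypothetical, and the code assumes the total length of the PCM data is known up front, which is the case once a recording has finished.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds a canonical 44-byte WAV header for PCM audio. */
final class WavHeader {
    static final int HEADER_SIZE = 44;

    static byte[] build(int sampleRate, int channels, int bitsPerSample, int pcmDataLength) {
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second of audio

        // All multi-byte fields in a WAV header are little-endian.
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        header.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        header.putInt(36 + pcmDataLength);             // size of everything after this field
        header.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        header.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        header.putInt(16);                             // size of the fmt chunk for PCM
        header.putShort((short) 1);                    // audio format 1 = uncompressed PCM
        header.putShort((short) channels);
        header.putInt(sampleRate);
        header.putInt(byteRate);
        header.putShort((short) blockAlign);
        header.putShort((short) bitsPerSample);
        header.put("data".getBytes(StandardCharsets.US_ASCII));
        header.putInt(pcmDataLength);                  // number of PCM bytes that follow
        return header.array();
    }
}
```

Writing these 44 bytes to a new file, followed by the raw bytes captured with AudioRecord, yields a file that WAV-aware players can open. For the settings used in this article, the call would be WavHeader.build(44100, 1, 8, lengthOfRawFile), where lengthOfRawFile stands for the size of the recorded file in bytes.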

For applications where low latency and/or high performance is vital to audio features, you could consider using libraries written in C or C++ and incorporating them in your app using the Android NDK toolset. OpenSL ES is an API that operates at a lower level, standardizing audio functionality access across platforms and allowing applications to use hardware acceleration. OpenSL ES is ideal for multimedia creation apps (synthesizers, DJ apps, etc.), mobile games, and similar applications. The Android NDK comes with its own OS-specific implementation of OpenSL ES. Also on the C side, AAudio is a relatively new API released by Google with similar use cases to OpenSL ES, designed to be fast and minimalist. Applications read or write data to AAudio streams, which are connected to pieces of audio hardware.

If compatibility across API levels is important for your application, Oboe is a C++ wrapper that switches between the OpenSL ES and AAudio APIs to give the best performance for a specific Android device’s hardware. Google encourages developers to consider using Oboe for real-time audio applications to take advantage of AAudio’s features while maintaining backwards compatibility. 

What should I use?

After learning about the many options for audio recording and playback, the natural question is what to use for your specific application. Picking between frameworks and APIs always comes with tradeoffs, and the decision between audio libraries on Android is no different.

For applications where performance isn’t a priority, or audio makes up a small component of the functionality, MediaRecorder and MediaPlayer might be an ideal combination to capture and play back audio without writing much complex code. Keep in mind that only the most common audio formats are supported, and the application won’t have access to the audio data as it’s being recorded.

If you’re looking to perform some audio processing or otherwise need real-time audio, consider using AudioRecord and AudioTrack. The process of reading and writing data is more involved than using MediaRecorder, and any compression or transcoding can’t be done using the AudioRecord APIs. Another option is using AudioRecord to capture audio, then integrating ExoPlayer into the app for more extendable playback features.

Lastly, C and C++ libraries facilitate the development of high performance audio applications, but require more domain-specific knowledge to use (as well as knowledge of a C-based language). Out of these libraries, Oboe is a good option: it is well-maintained and has an active developer community.

Here’s a table to see at a quick glance which option might be appropriate for your application.

Option | MediaRecorder / MediaPlayer | AudioRecord / AudioTrack | ExoPlayer | OpenSL ES | AAudio
Language | Java/Kotlin | Java/Kotlin | Java/Kotlin | C | C
Other criteria compared: What works out of the box, High Performance, Access to Recording Audio Buffer, Video, Backwards-compatible

Ultimately, the best audio library to use will vary drastically depending on the scope and features of your application. 

This article has outlined how to use several of the most common Android frameworks for audio, but plenty of other guides and resources exist online to help you learn more. Best of luck on your journey to create great audio experiences within your Android application! 

Attribution: Portions of this page are reproduced from work created and shared by the Android Open Source Project and used according to terms described in the Creative Commons 3.0 Attribution License.
