Media APIs

Recording Audio on iOS with Examples


Review of media frameworks available for recording audio on iOS.

There are a few different sets of iOS libraries that provide audio recording functionality. It can be confusing to a beginner which API to use. This article briefly shows the purpose of each of these APIs and their pros and cons.

  • AVAudioRecorder
  • AVCapture
  • AVAudioEngine
  • AudioToolbox


AVAudioRecorder is the simplest audio recording API in iOS. You basically tell iOS the path where you want to record to and any audio format settings, such as stereo/mono and AAC/WAV. You then can start and stop the recording at the time when you want.

The disadvantage of it is, it doesn’t provide you any audio sample buffer when it does the recording. That means you can’t analyze or process the audio before it writes to the target file. In addition, it also doesn’t support video recording.

AVAudioRecorder Example

This example startRecording(), needs to decide where to save the audio, and configure the recording settings before starting recording.

We also need to add a way to finishRecording().  We stop the recording then nullify the recording instance.

let filename = URL(string: "out.m4a")!
let settings = [AVEncoderBitRatePerChannelKey: 96000]
func startRecording()
    // This needs to decide where to save the audio,   
    configure the recording settings, then start
    do {
        // See Apple API reference for details of settings 
        audioRecorder = try AVAudioRecorder(url: filename, settings: settings)
        audioRecorder.delegate = self
    } catch {
func finishRecording() {
    audioRecorder = nil

We assign self to be the delegate of the audio recorder.  This means you need to conform your class to the AVAudioRecorderDelegate protocol to handle callbacks from AVAudioRecorder.

func audioRecorderDidFinishRecording(_ recorder: AVAudioRecorder, successfully flag: Bool) {
    // Show in UI the audio recording has stopped
    audioRecordingStopped(successfully: flag)
func audioRecorderEncodeErrorDidOccur(AVAudioRecorder, error: Error?) {
    // Show in UI, error happened in audio recording
    audioReocrding(with error: error)


AVCapture consists of AVCaptureDeviceAVCaptureDeviceInputAVCaptureSession, and AVCaptureOutput. It provides a simple degree of input, output node graph like processing architecture. You’re able to intercept the sample buffer and do certain analysis and processing over it. It also provides the ability to do video capture.

However, it is still not designed for any serious signal analysis and processing. It is missing some desirable configurations.

AVCaptureSession Example

We need to prepare an AVCaptureSession for the audio input device. This example only shows audio capture.

self.captureSession = AVCaptureSession()
let queue = DispatchQueue(label: "AudioSessionQueue", attributes: [])
let captureDevice = AVCaptureDevice.default(for:
var audioInput : AVCaptureDeviceInput? = nil
do {
    try captureDevice?.lockForConfiguration()
    audioInput = try AVCaptureDeviceInput(device: captureDevice!)
} catch {
    // Handle capture devices could not be set

We can then add input to the session.

// Add audio input
if captureSession.canAddInput(audioInput!) {
} else {
    // throw inputsAreInvalid

We also need to prepare session output.  In this example, we use AVCaptureAudioFileOutput, so the output is written to a file. We could also use  AVCaptureAudioDataOutput, if we wanted to intercept the input audio sample buffer. The intercepted data can be used for any processing needs.

audioOutput = AVCaptureAudioFileOutput()
if captureSession.canAddOutput(audioOutput!) {
} else {
    // throw outputIsInvalid

With the capture session running, we can call functions to start and stop recording.

We assign our class as the recording delegate. This implements our class to the AVCaptureFileOutputRecordingDelegate protocol to handle the callback from AVCaptureAudioFileOutput.

func startRecording() {
    guard let captureSession = self.captureSession, captureSession.isRunning else {
        // Error, capture session is not yet ready
    let paths = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)
    let fileUrl = paths[0].appendingPathComponent("output.m4a")
    try? FileManager.default.removeItem(at: fileUrl)
    audioOutput!.startRecording(to: fileUrl, outputFileType: .m4a, recordingDelegate: self)
func stopRecording() {
    guard let captureSession = self.captureSession, captureSession.isRunning else {
        // Error, capture session is not yet ready


AVAudioEngine is a wrapper of the Audio Unit, a low-level software audio module in iOS/macOS. It provides a framework to manage the starting of an audio capture/recording or playback. It is highly recommended by Apple.

You dynamically create an audio graph and processing chain. You then use TAP to get a peek of the audio buffer and can analyze them. To do audio processing, you need to add a processing node to the graph. This library can also be used as an AudioUnit App Extension and act as an audio processing plug-in to other audio apps. It is highly configurable and there are some default processing nodes that are ready to be used.

The disadvantage of it is it doesn’t support video capture natively. It also requires more effort to implement and requires a deeper knowledge of audio engineering with buffer management.

AVAudioEngine Example

We need to prepare an AVAudioEngine instance and the corresponding AVAudioFile for output.

var engine = AVAudioEngine()
var file: AVAudioFile?
var player = AVAudioPlayerNode() // Optional
func prepareAudioOutputFile(for outputURL: URL) {
    file = AVAudioFile(forWriting: outputURL, settings: engine.inputNode.inputFormatForBus(0).settings, error: nil)
    engine.connect(player, to: engine.mainMixerNode, format: engine.mainMixerNode.outputFormatForBus(0)) //configure graph

We may startRecording() as the following . In this way, you may process the buffer before writing it to the output.

func startRecording() {
    engine.inputNode.installTapOnBus(0, bufferSize: 1024, format: engine.mainMixerNode.outputFormatForBus(0)) { (buffer, time) -> Void in
        file?.writeFromBuffer(buffer, error: nil)

To stopRecording() when complete you remove the input node.

func stopRecording() {

Audio Toolbox

Audio Toolbox is a low-level, C programming interface in iOS/macOS. With it you can do recording, playback, and stream parsing.

It is highly configurable but can be difficult to use.  It is missing the ability to manage input, output, and processing nodes.  The Audio Unit Processing Graph Services are deprecated. In addition, it doesn’t support native video capture.

It doesn’t seem to be a framework that is promoted by Apple for most use cases.


If you only need to do an audio-only recording and the real-time audio analysis or processing is not necessary, then AVAudioRecorder is your best choice.

If you need some real-time audio capture analysis, but not significant processing, the AVCapture may be the best fit for audio and video.

If you need serious real-time audio capture analysis and processing. the AVAudioEngine is highly extensible and configurable.

Sample Buffer AccessVideoEffort
Audio ToolboxYesNoVery High

Casper Mok

Staff Engineer

Get Started

Drive real-time interactions and engagement with sub-second latency

We are more than just a streaming solutions provider; we are a technology partner helping you build a streaming ecosystem that meets your goals. Get started for free and as you grow, we offer aggressive volume discounts protecting your margins.

Developer Resources

Explore learning paths and helpful resources as you begin development with

Copy link
Powered by Social Snap