
Encoding and Decoding of Audio

Phil Schatzmann edited this page Oct 10, 2024 · 113 revisions

Compressed Audio

Unfortunately, the available memory on Microcontrollers is quite restricted, and we do not get very far by storing an uncompressed WAV file, e.g. in program/flash memory. So I started to look into compressed audio formats. Compression and decompression are done with the help of codecs. Codecs are also important if you need to transmit audio data at a high sampling rate over a line with limited capacity.
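To get a feeling for the numbers, here is a small, library-independent sketch of the arithmetic. The function names, the 4 MiB flash size and the 128 kbit/s compressed bitrate are assumptions chosen for illustration only; they are not part of the Arduino Audio Tools API.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per second produced by uncompressed PCM audio.
inline uint32_t pcmBytesPerSecond(uint32_t sample_rate, uint32_t channels,
                                  uint32_t bits_per_sample) {
  assert(bits_per_sample % 8 == 0);
  return sample_rate * channels * (bits_per_sample / 8);
}

// Whole seconds of audio that fit into capacity_bytes at the given byte rate.
inline uint32_t secondsThatFit(uint32_t capacity_bytes, uint32_t bytes_per_second) {
  assert(bytes_per_second > 0);
  return capacity_bytes / bytes_per_second;
}
```

For 44100 Hz, 2 channels and 16 bits, pcmBytesPerSecond() gives 176400 bytes per second. An assumed 4 MiB of flash therefore holds about 23 seconds of PCM, but about 262 seconds of a stream compressed to an assumed 128 kbit/s (16000 bytes per second).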

On the desktop we can use the FFmpeg project which comes with a rich set of functionality. Unfortunately the situation is much more fragmented for Microcontrollers.

I started to collect the relevant libraries and in order to make things simple to use I also added a simple C++ API on top of the available libraries:

Supported Codecs

Library Class Include Function Format Application
- DecoderL8 AudioCodecs/CodecL8.h Decoding PCM Audio
- EncoderL8 AudioCodecs/CodecL8.h Encoding PCM Audio
- DecoderL16 AudioCodecs/CodecL16.h Decoding PCM Audio
- EncoderL16 AudioCodecs/CodecL16.h Encoding PCM Audio
- DecoderFloat AudioCodecs/CodecFloat.h Decoding PCM Audio
- EncoderFloat AudioCodecs/CodecFloat.h Encoding PCM Audio
- WAVDecoder AudioCodecs/CodecWAV.h Decoding WAV Audio
- WAVEncoder AudioCodecs/CodecWAV.h Encoding WAV Audio
- WavIMADecoder AudioCodecs/CodecWavIMA.h Decoding WAV/IMA Audio
- DecoderBase64 AudioCodecs/CodecBase64.h Decoding Base64
- EncoderBase64 AudioCodecs/CodecBase64.h Encoding Base64
libhelix MP3DecoderHelix AudioCodecs/CodecMP3Helix.h Decoding MP3 Audio
libmad MP3DecoderMAD AudioCodecs/CodecMP3MAD.h Decoding MP3 Audio
minimp3 MP3DecoderMini AudioCodecs/CodecMP3Mini.h Decoding MP3 Audio
liblame MP3EncoderLAME AudioCodecs/CodecMP3LAME.h Encoding MP3 Audio
libhelix AACDecoderHelix AudioCodecs/CodecAACHelix.h Decoding AAC Audio
libfaad AACDecoderFAAD AudioCodecs/CodecAACFAAD.h Decoding AAC Audio
fdk-aac AACDecoderFDK AudioCodecs/CodecAACFDK.h Decoding AAC Audio
fdk-aac AACEncoderFDK AudioCodecs/CodecAACFDK.h Encoding AAC Audio
libflac FLACDecoder AudioCodecs/CodecFLAC.h Decoding FLAC Audio
libflac FLACEncoder AudioCodecs/CodecFLAC.h Encoding FLAC Audio
libvorbis-tremor VorbisDecoder AudioCodecs/CodecVorbis.h Decoding OGG Vorbis Audio
libsbc SBCDecoder AudioCodecs/CodecSBC.h Decoding SBC Audio
libsbc SBCEncoder AudioCodecs/CodecSBC.h Encoding SBC Audio
liblc3 LC3Decoder AudioCodecs/CodecLC3.h Decoding LC3 Audio
liblc3 LC3Encoder AudioCodecs/CodecLC3.h Encoding LC3 Audio
libopenaptx APTXDecoder AudioCodecs/CodecAPTX.h Decoding APTX Audio
libopenaptx APTXEncoder AudioCodecs/CodecAPTX.h Encoding APTX Audio
libopus OpusDecoder AudioCodecs/CodecOpus.h Decoding Opus Audio
libopus OpusEncoder AudioCodecs/CodecOpus.h Encoding Opus Audio
libopus OpusOggDecoder AudioCodecs/CodecOpusOgg.h Decoding Opus Audio
libopus OpusOggEncoder AudioCodecs/CodecOpusOgg.h Encoding Opus Audio
adpcm ADPCMDecoder AudioCodecs/CodecADPCM.h Decoding ADPCM Audio
adpcm ADPCMEncoder AudioCodecs/CodecADPCM.h Encoding ADPCM Audio
adpcm-xq ADPCMDecoderXQ AudioCodecs/CodecADPCM.h Decoding ADPCM Audio
adpcm-xq ADPCMEncoderXQ AudioCodecs/CodecADPCM.h Encoding ADPCM Audio
libgsm GSMDecoder AudioCodecs/CodecGSM.h Decoding GSM Speech
libgsm GSMEncoder AudioCodecs/CodecGSM.h Encoding GSM Speech
libg7xx G711_ALAWDecoder AudioCodecs/CodecG7xx.h Decoding ALAW Speech
libg7xx G711_ALAWEncoder AudioCodecs/CodecG7xx.h Encoding ALAW Speech
libg7xx G711_ULAWDecoder AudioCodecs/CodecG7xx.h Decoding ULAW Speech
libg7xx G711_ULAWEncoder AudioCodecs/CodecG7xx.h Encoding ULAW Speech
libg7xx G721Decoder AudioCodecs/CodecG7xx.h Decoding G.721 Speech
libg7xx G721Encoder AudioCodecs/CodecG7xx.h Encoding G.721 Speech
libg722 G722Decoder AudioCodecs/CodecG722.h Decoding G.722 Speech
libg722 G722Encoder AudioCodecs/CodecG722.h Encoding G.722 Speech
libg7xx G723_24Decoder AudioCodecs/CodecG7xx.h Decoding G.723 Speech
libg7xx G723_24Encoder AudioCodecs/CodecG7xx.h Encoding G.723 Speech
libg7xx G723_40Decoder AudioCodecs/CodecG7xx.h Decoding G.723 Speech
libg7xx G723_40Encoder AudioCodecs/CodecG7xx.h Encoding G.723 Speech
codec2 Codec2Decoder AudioCodecs/CodecCodec2.h Decoding Codec2 Speech
codec2 Codec2Encoder AudioCodecs/CodecCodec2.h Encoding Codec2 Speech

Container

Library Class Include Function Format Application
libopus OggContainerDecoder AudioCodecs/ContainerOgg.h Decoding Ogg Audio
libopus OggContainerEncoder AudioCodecs/ContainerOgg.h Encoding Ogg Audio
- BinaryContainerEncoder AudioCodecs/ContainerBinary.h Encoding - Audio
- BinaryContainerDecoder AudioCodecs/ContainerBinary.h Decoding - Audio
- AVIDecoder AudioCodecs/ContainerAVI.h Decoding AVI Video
tsdemux MTSDecoder AudioCodecs/CodecMTS.h Decoding video/mp2t MPEG-TS MTS Audio Video

Installation

If you want to use a codec, do not forget that you need to install the related library!

Decoding

Most decoders inherit from AudioDecoder. I am also providing an integration into my Arduino Audio Tools where you can use these libraries with the EncodedAudioStream class:

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecMP3Helix.h"
#include "BabyElephantWalk60_mp3.h"

MemoryStream mp3(BabyElephantWalk60_mp3, BabyElephantWalk60_mp3_len); // MP3 data source
I2SStream i2s; // final output of decoded stream
EncodedAudioStream dec(&i2s, new MP3DecoderHelix()); // Decoding stream
StreamCopy copier(dec, mp3); // copy in to out

void setup(){
  Serial.begin(115200);

  i2s.begin();
  dec.begin();
}

void loop(){
  if (mp3) {
    copier.copy();
  } 
}

The sketch above implements the following flow: mp3 MemoryStream -copy-> EncodedAudioStream -> I2SStream. This method is used by most codecs.
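The idea behind this flow is that the decoding stage sits on the output side: whatever is written to it is converted and forwarded to the final output. The following is a minimal, library-independent sketch of that pattern. ByteSink, VectorSink, DecodingSink and copyAll are hypothetical names, and the "codec" is a toy that widens unsigned 8-bit samples to signed 16-bit little-endian samples; it only stands in for a real decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal output interface (stands in for an Arduino output stream).
struct ByteSink {
  virtual ~ByteSink() = default;
  virtual size_t write(const uint8_t* data, size_t len) = 0;
};

// Final output: collects everything that is written (stands in for I2S).
struct VectorSink : ByteSink {
  std::vector<uint8_t> bytes;
  size_t write(const uint8_t* data, size_t len) override {
    bytes.insert(bytes.end(), data, data + len);
    return len;
  }
};

// Decoding stage: converts on write and forwards the result downstream.
// Toy codec: unsigned 8-bit PCM -> signed 16-bit little-endian PCM.
struct DecodingSink : ByteSink {
  explicit DecodingSink(ByteSink& out) : out_(out) {}
  size_t write(const uint8_t* data, size_t len) override {
    for (size_t i = 0; i < len; ++i) {
      int16_t sample = static_cast<int16_t>((static_cast<int>(data[i]) - 128) * 256);
      uint16_t raw = static_cast<uint16_t>(sample);
      uint8_t pcm[2] = {static_cast<uint8_t>(raw & 0xFF),
                        static_cast<uint8_t>(raw >> 8)};
      out_.write(pcm, 2);
    }
    return len;  // number of encoded bytes consumed
  }

 private:
  ByteSink& out_;
};

// Copy loop: pushes the encoded source into the sink in small chunks.
inline size_t copyAll(const std::vector<uint8_t>& source, ByteSink& sink,
                      size_t chunk = 4) {
  assert(chunk > 0);
  size_t done = 0;
  while (done < source.size()) {
    size_t remaining = source.size() - done;
    size_t n = remaining < chunk ? remaining : chunk;
    done += sink.write(source.data() + done, n);
  }
  return done;
}
```

The copy loop never needs to know how many decoded bytes a chunk produces; it only hands over encoded data, which is probably why the output-side arrangement is the simpler one.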

Decoding on the Input Side

You can also decode on the input side, which is less efficient but sometimes more convenient. This implements the following flow: mp3 MemoryStream -> EncodedAudioStream -copy-> I2SStream.

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecMP3Helix.h"
#include "BabyElephantWalk60_mp3.h"

MemoryStream mp3(BabyElephantWalk60_mp3, BabyElephantWalk60_mp3_len); // MP3 data source
EncodedAudioStream dec(&mp3, new MP3DecoderHelix()); // Decoding stream
I2SStream i2s; // final output of decoded stream
StreamCopy copier(i2s, dec); // copy decoded data from dec to i2s

void setup(){
  Serial.begin(115200);

  i2s.begin();
  dec.addNotifyAudioChange(i2s);
  dec.begin();
}

void loop(){
  copier.copy();
}

Streaming Decoding

FLAC (FLACDecoder) and Vorbis (VorbisDecoder) use an alternative method that pulls the data directly from an input stream. In this case a StreamingDecoder is used.

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecVorbis.h"

const char* ssid = "ssid";
const char* pwd = "password";
URLStream url(ssid, pwd);
VorbisDecoder dec;
I2SStream i2s;

void setup() {
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);  

  i2s.begin(i2s.defaultConfig(TX_MODE));

  url.begin("http://marmalade.scenesat.com:8086/bitjam.ogg","application/ogg");

  // setup decoder
  dec.setInputStream(url);
  dec.setOutputStream(i2s);
  dec.begin();
}

void loop() {
  dec.copy();
}

You can transform any AudioDecoder into a StreamingDecoder with the help of a StreamingDecoderAdapter.
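Conceptually, such an adapter only adds the copy step: it reads a chunk from the input and writes it to the push-style decoder, so the sketch just has to call copy() in the loop. The following library-independent sketch shows the idea. PushDecoder, MemorySource and PullAdapter are hypothetical names, the toy decoder passes the data through unchanged, and none of this is the actual StreamingDecoderAdapter implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical push-style decoder: the caller hands encoded bytes to write().
struct PushDecoder {
  std::function<void(const uint8_t*, size_t)> output;  // receives decoded bytes
  size_t write(const uint8_t* data, size_t len) {
    if (output) output(data, len);  // toy codec: pass through unchanged
    return len;
  }
};

// Hypothetical input: read() returns up to len bytes and 0 at the end.
struct MemorySource {
  std::vector<uint8_t> data;
  size_t pos = 0;
  size_t read(uint8_t* buf, size_t len) {
    size_t n = 0;
    while (n < len && pos < data.size()) buf[n++] = data[pos++];
    return n;
  }
};

// Adapter: owns the copy step, turning the push decoder into a pull-driven one.
class PullAdapter {
 public:
  PullAdapter(PushDecoder& dec, MemorySource& in, size_t chunk = 8)
      : dec_(dec), in_(in), buf_(chunk) {}

  // Processes one chunk; returns false once the input is exhausted.
  bool copy() {
    size_t n = in_.read(buf_.data(), buf_.size());
    if (n == 0) return false;
    dec_.write(buf_.data(), n);
    return true;
  }

 private:
  PushDecoder& dec_;
  MemorySource& in_;
  std::vector<uint8_t> buf_;
};
```

With 10 input bytes and a chunk size of 4, copy() succeeds three times (4, 4 and 2 bytes) and then returns false.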

Encoding

The encoding of audio data to a different format is also done with the help of the EncodedAudioStream class. The only difference (to the decoding examples) is that we pass an Encoder as argument.

Here is the related Arduino sketch:

#include "AudioTools.h"
#include "SdFat.h"

#ifndef PIN_CS
#define PIN_CS SS  // assumption: SD chip select pin, adjust for your board
#endif

uint16_t sample_rate = 44100;
uint8_t channels = 2;                                             // The stream will have 2 channels 
NoiseGenerator<int16_t> noise(32000);                             // subclass of SoundGenerator with max amplitude of 32000
GeneratedSoundStream<int16_t> in_stream(noise);                   // Stream generated from the noise generator
SdFat SD;
File audioFile;                                                   // final output stream
WAVEncoder encoder;
EncodedAudioStream out_stream(&audioFile, &encoder);             // encode as wav file
StreamCopy copier(out_stream, in_stream);                                // copies sound to out

void setup(){
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);  

  auto cfg = noise.defaultConfig();
  cfg.sample_rate = sample_rate;
  cfg.channels = channels;
  cfg.bits_per_sample = 16;
  noise.begin(cfg);
  in_stream.begin();

  // we need to provide the audio information to the encoder
  out_stream.begin(cfg);
  // open the output file
  SD.begin(SdSpiConfig(PIN_CS, DEDICATED_SPI, SD_SCK_MHZ(2)));
  audioFile = SD.open("/test/002.wav", O_WRITE | O_CREAT);
}

void loop(){
    copier.copy();  
    // audioFile.flush(); // force write down of data
}

We create an input stream which is based on a sound generator. In the out_stream we indicate the final output stream (a file in this example) and the encoder that is used when the data is written. This gives the following flow: GeneratedSoundStream -copy-> EncodedAudioStream -> File.

Please note the following:

  • You need to make sure that the file content is written to the file by calling audioFile.close() at the end, or by flushing the individual writes.
  • Call out_stream.begin() before you write to a new file. This makes sure that the header is written to the file if necessary (e.g. for WAV files).
  • Before you start to write to a file, delete the file or move to the beginning. Otherwise the content is just appended!
  • This example is using the SdFat library, but you can use any other Arduino file library implementation.
  • MP3 and AAC are quite popular audio formats, but they require a lot of memory and are at the edge of what a Microcontroller can do. I recommend avoiding them and preferring a lean format like ADPCM.
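To make the header note more concrete, here is a minimal sketch of the canonical 44-byte header of an uncompressed PCM WAV file. It is independent of the library: makeWavHeader, putLE and putTag are hypothetical helper names, and how the WAVEncoder class itself fills the size fields when the final length is not yet known is not covered here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append the lowest `bytes` bytes of value in little-endian order.
inline void putLE(std::vector<uint8_t>& out, uint32_t value, int bytes) {
  for (int i = 0; i < bytes; ++i) {
    out.push_back(static_cast<uint8_t>((value >> (8 * i)) & 0xFF));
  }
}

// Append a 4-character chunk tag such as "RIFF".
inline void putTag(std::vector<uint8_t>& out, const char* tag) {
  for (int i = 0; i < 4; ++i) out.push_back(static_cast<uint8_t>(tag[i]));
}

// Build the canonical 44-byte header for uncompressed PCM data.
inline std::vector<uint8_t> makeWavHeader(uint32_t sample_rate, uint16_t channels,
                                          uint16_t bits_per_sample,
                                          uint32_t data_bytes) {
  assert(channels > 0 && bits_per_sample % 8 == 0);
  const uint16_t block_align =
      static_cast<uint16_t>(channels * (bits_per_sample / 8));
  const uint32_t byte_rate = sample_rate * block_align;

  std::vector<uint8_t> h;
  putTag(h, "RIFF");
  putLE(h, 36 + data_bytes, 4);  // size of everything after this field
  putTag(h, "WAVE");
  putTag(h, "fmt ");
  putLE(h, 16, 4);               // size of the PCM format chunk
  putLE(h, 1, 2);                // audio format 1 = PCM
  putLE(h, channels, 2);
  putLE(h, sample_rate, 4);
  putLE(h, byte_rate, 4);
  putLE(h, block_align, 2);
  putLE(h, bits_per_sample, 2);
  putTag(h, "data");
  putLE(h, data_bytes, 4);       // number of audio bytes that follow
  return h;
}
```

The header carries the sample rate, channel count and bits per sample, which is why the audio information has to be provided before the first write. It also has to sit at the very start of the file, so appending a new recording to an existing file leaves the old header describing the wrong data.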
