
Optimize audio pathway #10079

Open
relic-se opened this issue Feb 19, 2025 · 4 comments

Comments

@relic-se

Currently, each object that utilizes the audiosample API is required to support multiple formats and manage common audio tasks independently. In order to reduce flash size, complexity, and the barrier to entry for creating new audiosample objects (namely audio effects), the audio pathway should be streamlined by using a single format and adding shared resources for general audio processing tasks. These optimizations include, but are not limited to, the following lists. Feel free to suggest other areas of improvement.

For all audio objects:

For audio effects objects:

  • Unify the double-buffering and sample processing implementation, including LFO ticking at regular intervals
  • Provide a shared mixing implementation (linear and cross-fade; see the sketch below)
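
As a rough illustration of the shared mixing idea, here is a minimal C sketch of linear mixing and cross-fading over 16-bit signed samples. The function names and the Q15 fixed-point level parameter are hypothetical assumptions, not existing CircuitPython API:

```c
#include <stdint.h>
#include <stddef.h>

// Hypothetical shared mixing helpers. "level" and "t" are Q15 fixed-point
// factors (0..32767 maps to 0.0..1.0); names are illustrative only.

// Linear mix: out = a + b * level, hard-clipped to the int16 range.
void audio_mix_linear(int16_t *out, const int16_t *a, const int16_t *b,
                      size_t n, int16_t level) {
    for (size_t i = 0; i < n; i++) {
        int32_t v = a[i] + (((int32_t)b[i] * level) >> 15);
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        out[i] = (int16_t)v;
    }
}

// Cross-fade: out = a * (1 - t) + b * t; the weights always sum to
// (approximately) 1.0, so no clipping is needed.
void audio_mix_crossfade(int16_t *out, const int16_t *a, const int16_t *b,
                         size_t n, int16_t t) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (int16_t)(((int32_t)a[i] * (32767 - t) +
                            (int32_t)b[i] * t) >> 15);
    }
}
```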

If these updates are carried out, some audio object constructor arguments and properties, such as bits_per_sample and samples_signed, may need to be deprecated, as they will be dictated by the output. sample_rate and channel_count are likely to still be necessary per object.

This issue was inspired by the discussion within #10052 with @jepler and @gamblor21.

@relic-se relic-se changed the title Optimize audio pathway to reduce complexity Optimize audio pathway Feb 19, 2025
@tannewt tannewt added the audio label Feb 19, 2025
@tannewt tannewt added this to the Long term milestone Feb 19, 2025
@gamblor21
Member

Thanks for creating this issue. When I first started on effects, there was discussion about a common utility library (in one of the base audio objects), and now that we see the commonalities, I think it would be a good time to revisit that idea.

I really like the idea of a common internal audio pathway format for CircuitPython (16-bit, signed, stereo would be my vote). I think we have to leave the sample rate variable, but I have never seen code that requires a specific rate (along the lines of if sample_rate == 44100). That said, we will have to put some thought into ensuring older chipsets can still do basic audio, which may be 8-bit, unsigned, mono.
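
To make the backwards-compatibility concern concrete, here is a hedged C sketch of upconverting a legacy 8-bit unsigned mono buffer into the proposed 16-bit signed stereo format. The function name is an illustrative assumption, not existing code:

```c
#include <stdint.h>
#include <stddef.h>

// Sketch: expand 8-bit unsigned mono frames into 16-bit signed stereo.
void audio_upconvert_u8_mono(int16_t *dst, const uint8_t *src, size_t frames) {
    for (size_t i = 0; i < frames; i++) {
        // Re-center around zero, then scale 8 bits up to 16.
        int16_t s = (int16_t)((src[i] - 0x80) << 8);
        dst[2 * i] = s;     // left channel
        dst[2 * i + 1] = s; // right channel (duplicate the mono sample)
    }
}
```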

The other concept I'd like to consider is how we use buffers per audio object. They are not huge, but enough of them could add up in RAM usage, and I'm not sure of the benefit. In audiomixer, the initial reason was to provide a buffer in case the core was busy, so the audio did not stutter. But having a buffer per audio component does not help with that any more than one buffer does. Is this a case for a "buffer" object like mixer at the root of the audio chain (requesting, say, 256-byte chunks from the rest), or something else? I'm not sure.

If that isn't clear, what I'm envisioning is:
audio output <- audio-mixer (has buffer) <- effect1 (no buffer) <- effect2 (no buffer) <- source (buffer if required, e.g. MP3; none for, say, a synth)
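
As a minimal sketch of that "one buffer at the root" layout, the root object could own the only buffer and run each buffer-less effect over it in place. All of the types and names below are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>

// A chain of buffer-less effects, each processing samples in place.
typedef struct effect {
    struct effect *next; // next stage toward the output
    void (*process)(struct effect *self, int16_t *buf, size_t frames);
} effect_t;

// The root (e.g. a mixer) owns the only buffer: fill it from the source,
// then run every effect over that same buffer in place.
void audio_chain_render(effect_t *chain,
                        void (*fill_from_source)(int16_t *, size_t),
                        int16_t *buf, size_t frames) {
    fill_from_source(buf, frames);
    for (effect_t *e = chain; e != NULL; e = e->next) {
        e->process(e, buf, frames);
    }
}
```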

Another potential benefit is that if we standardize on a set buffer size (like synth has), the user can choose what works for them, since a larger buffer takes more time away from core processing while everything stops to run the audio chain.

tl;dr

  • Create an audio utility library, potentially in a main audio object, and start to move common functionality there
  • Pick a standard internal-use format
  • Ensure older chipsets can still do basic audio (wave -> audio object)
  • Determine if a single location for the double-buffer allocation is a good idea to save RAM
  • Try not to break anything!

@relic-se
Author

> I really like the idea of a common internal audio pathway format for CircuitPython (16-bit, signed, stereo would be my vote).

I think that's the only real option here.

> I think we have to leave the sample rate variable, but I have never seen code that requires a specific rate (along the lines of if sample_rate == 44100).

There aren't any specific sample rate checks that I'm aware of, but a lot of the audiodelays effects use the sample rate to calculate timing properties, as sketched below.
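
For example, a shared timing helper along these lines could live in the common utility library; the name is a hypothetical illustration:

```c
#include <stdint.h>

// Hypothetical shared helper: convert a time in milliseconds into a frame
// count at the current sample rate, as delay-style effects need to do.
static inline uint32_t audio_ms_to_frames(uint32_t ms, uint32_t sample_rate) {
    return (uint32_t)(((uint64_t)ms * sample_rate) / 1000);
}
```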

> That said, we will have to put some thought into ensuring older chipsets can still do basic audio, which may be 8-bit, unsigned, mono.

Those ports will have to put some work into converting the signed 16-bit buffers into 8-bit unsigned output. There may be other potential platforms, but I think the ESP32 audioio.AudioOut, with its built-in 8-bit DAC, should be the target for that testing. (Although only RP2350 has these libraries enabled at the moment.)
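
That final downconversion could look something like this minimal C sketch, which reduces the common signed 16-bit samples to unsigned 8-bit for the DAC; the function name is illustrative only:

```c
#include <stdint.h>
#include <stddef.h>

// Sketch: reduce signed 16-bit samples to the unsigned 8-bit range a
// built-in DAC expects.
void audio_downconvert_to_u8(uint8_t *dst, const int16_t *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        // Drop the low 8 bits, then shift from signed to unsigned range.
        dst[i] = (uint8_t)((src[i] >> 8) + 0x80);
    }
}
```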

> If that isn't clear, what I'm envisioning is: audio output <- audio-mixer (has buffer) <- effect1 (no buffer) <- effect2 (no buffer) <- source (buffer if required, e.g. MP3; none for, say, a synth)

The audiosample API currently operates with buffered output in mind, so I'm not sure whether a change like this would require changes to the current ..._get_buffer(...) implementation. Otherwise, I am a fan!

> Create an audio utility library, potentially in a main audio object, and start to move common functionality there

Tbh, I've always thought that these effects would make more sense in a shared audioeffects module regardless of type, especially because it's been difficult to pair certain effects together without adding another module to the core that would otherwise be superfluous. This change might make it easier to share common functionality.

> Try not to break anything!

Let's hope! But a lot of these changes are going to need to be at least 10.x.x. When doing a restructure like this within CPy, is it typical to wait until a major release, or to create the new modules and point the old ones to the new modules with a deprecation notice?

@gamblor21
Member

A couple of other thoughts (not sure if they are great or bad ideas yet):

  • SYNTHIO_MAX_DUR is referenced often with the LFO ticking. Maybe it's worth picking a set processing length (see the sketch after this list).
  • And in that theme, maybe it's then worth moving some of the LFO functionality outside of synthio? I haven't thought too long on that.
  • Also, as the Zephyr work progresses, being able to properly schedule these actions (rather than running them in RUN_BACKGROUND_TASKS) should be considered.
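
A fixed processing length would let the core tick every LFO once per block, regardless of which effect owns it. The sketch below is a hypothetical illustration of that idea; the names and the 256-frame block size are assumptions, not existing CircuitPython code:

```c
#include <stdint.h>
#include <stddef.h>

// Assumed standardized block length, in frames.
#define AUDIO_BLOCK_FRAMES 256

// A minimal linked list of tickable LFOs.
typedef struct lfo {
    struct lfo *next;
    void (*tick)(struct lfo *self, uint32_t frames);
} lfo_t;

static lfo_t *lfo_list;

// Called once per processed block so every LFO advances at the same
// cadence, regardless of which audio object registered it.
void audio_tick_lfos(void) {
    for (lfo_t *l = lfo_list; l != NULL; l = l->next) {
        l->tick(l, AUDIO_BLOCK_FRAMES);
    }
}
```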

@gamblor21
Member

A couple of thoughts on the internal sample data type (these would have to target CP 10):

  • Floating point is likely too much (but maybe not?) for audio samples, but in many cases we use int32 in a module and then compress into int16 for output. Would it make sense to use int32 through the entire internal processing chain?
  • If int32 is used for samples, a step is then required to go to int16 (or int8) for output, and then we could offer options: e.g. hard clipping (sketched after this list), the current synthio mix-down, or other compression algorithms that are out there.
  • By moving the compression to the end, we avoid doing it multiple times during audio processing, saving CPU cycles and avoiding the audio artifacts introduced by multiple compressions.
  • I have not looked yet, but I think a hook into the audio output objects would work (with a default algorithm in place).
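
As one example of such an output-stage option, here is a hedged C sketch of a hard-clip stage, assuming the internal chain accumulates in int32 and only the output object reduces to int16. The function name and hook are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>

// Sketch: end-of-chain hard clip from the int32 working format to int16.
void audio_output_hard_clip(int16_t *dst, const int32_t *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        int32_t v = src[i];
        if (v > INT16_MAX) v = INT16_MAX; // clip positive overflow
        if (v < INT16_MIN) v = INT16_MIN; // clip negative overflow
        dst[i] = (int16_t)v;
    }
}
```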
