
Optimize audio pathway #10079

Open
relic-se opened this issue Feb 19, 2025 · 4 comments

Comments

@relic-se

Currently, each object that utilizes the audiosample API is required to support multiple formats and manage common audio tasks independently. In order to reduce flash size, complexity, and the barrier to entry for creating new audiosample objects (namely audio effects), the audio pathway should be streamlined by using a single format and adding shared resources for general audio processing tasks. These optimizations include, but are not limited to, the following lists. Feel free to suggest other areas of improvement.

For all audio objects:

For audio effects objects:

  • Unify the double-buffering and sample processing implementation, including LFO ticking at regular intervals
  • Provide a shared mixing implementation (linear and cross-fade; see the sketch below)
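
As a rough illustration of the shared mixing idea, here is a minimal C sketch of linear mixing and cross-fading over 16-bit signed samples. The function names and the Q15 fixed-point level parameter are hypothetical assumptions, not existing CircuitPython API:

```c
#include <stdint.h>
#include <stddef.h>

// Hypothetical shared mixing helpers. "level" and "t" are Q15 fixed-point
// factors (0..32767 maps to 0.0..1.0); names are illustrative only.

// Linear mix: out = a + b * level, hard-clipped to the int16 range.
void audio_mix_linear(int16_t *out, const int16_t *a, const int16_t *b,
                      size_t n, int16_t level) {
    for (size_t i = 0; i < n; i++) {
        int32_t v = a[i] + (((int32_t)b[i] * level) >> 15);
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        out[i] = (int16_t)v;
    }
}

// Cross-fade: out = a * (1 - t) + b * t; the weights always sum to
// (approximately) 1.0, so no clipping is needed.
void audio_mix_crossfade(int16_t *out, const int16_t *a, const int16_t *b,
                         size_t n, int16_t t) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (int16_t)(((int32_t)a[i] * (32767 - t) +
                            (int32_t)b[i] * t) >> 15);
    }
}
```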

If these updates are carried out, some audio object constructor arguments and properties, such as bits_per_sample and samples_signed, may need to be deprecated, as they will be dictated by the output. sample_rate and channel_count are likely to still be necessary per object.

This issue was inspired by the discussion within #10052 with @jepler and @gamblor21.

@relic-se relic-se changed the title Optimize audio pathway to reduce complexity Optimize audio pathway Feb 19, 2025
@tannewt tannewt added the audio label Feb 19, 2025
@tannewt tannewt added this to the Long term milestone Feb 19, 2025
@gamblor21
Member

Thanks for creating this issue. When I first started on effects, there was discussion about a common utility library (in one of the base audio objects), and now that we see the commonalities, I think it would be a good time to revisit that idea.

I really like the idea of a common internal audio pathway format for CircuitPython (16-bit, signed, stereo would be my vote). I think we have to leave the sample rate variable, but I have never seen code that requires a specific rate (along the lines of if sample_rate == 44100). That said, we will have to put some thought into ensuring older chipsets can still do basic audio, which may be 8-bit, unsigned, mono.
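
To make the backwards-compatibility concern concrete, here is a hedged C sketch of upconverting a legacy 8-bit unsigned mono buffer into the proposed 16-bit signed stereo format. The function name is an illustrative assumption, not existing code:

```c
#include <stdint.h>
#include <stddef.h>

// Sketch: expand 8-bit unsigned mono frames into 16-bit signed stereo.
void audio_upconvert_u8_mono(int16_t *dst, const uint8_t *src, size_t frames) {
    for (size_t i = 0; i < frames; i++) {
        // Re-center around zero, then scale 8 bits up to 16.
        int16_t s = (int16_t)((src[i] - 0x80) << 8);
        dst[2 * i] = s;     // left channel
        dst[2 * i + 1] = s; // right channel (duplicate the mono sample)
    }
}
```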

The other concept I'd like to consider is how we use buffers per audio object. They are not huge, but enough of them could add up in RAM usage, and I'm not sure of the benefit. In audiomixer, the initial reason was to provide a buffer in case the core was busy, so the audio did not stutter. But having a buffer per audio component does not help with that any more than one buffer does. Is this a case for a "buffer" object like mixer at the root of the audio chain (requesting, say, 256-byte chunks from the rest), or something else? I'm not sure.

If that isn't clear, what I'm envisioning is:
audio output <- audio-mixer (has buffer) <- effect1 (no buffer) <- effect2 (no buffer) <- source (buffer if required, e.g. MP3; none for, say, a synth)
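
As a minimal sketch of that "one buffer at the root" layout, the root object could own the only buffer and run each buffer-less effect over it in place. All of the types and names below are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>

// A chain of buffer-less effects, each processing samples in place.
typedef struct effect {
    struct effect *next; // next stage toward the output
    void (*process)(struct effect *self, int16_t *buf, size_t frames);
} effect_t;

// The root (e.g. a mixer) owns the only buffer: fill it from the source,
// then run every effect over that same buffer in place.
void audio_chain_render(effect_t *chain,
                        void (*fill_from_source)(int16_t *, size_t),
                        int16_t *buf, size_t frames) {
    fill_from_source(buf, frames);
    for (effect_t *e = chain; e != NULL; e = e->next) {
        e->process(e, buf, frames);
    }
}
```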

Another potential benefit is that if we standardize on a set buffer size (like synth has), the user can choose what works for them, since a larger buffer takes more time away from core processing while everything stops to run the audio chain.

tl;dr

  • Create an audio utility library, potentially in a main audio object, and start to move common functionality there
  • Pick a standard internal-use format
  • Ensure older chipsets can still do basic audio (wave -> audio object)
  • Determine if a single location for the double-buffer allocation is a good idea to save RAM
  • Try not to break anything!

@relic-se
Author

> I really like the idea of a common internal audio pathway format for CircuitPython (16-bit, signed, stereo would be my vote).

I think that's the only real option here.

> I think we have to leave the sample rate variable, but I have never seen code that requires a specific rate (along the lines of if sample_rate == 44100).

There aren't any specific sample rate checks that I'm aware of, but a lot of the audiodelays effects use the sample rate to calculate timing properties, as sketched below.
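
For example, a shared timing helper along these lines could live in the common utility library; the name is a hypothetical illustration:

```c
#include <stdint.h>

// Hypothetical shared helper: convert a time in milliseconds into a frame
// count at the current sample rate, as delay-style effects need to do.
static inline uint32_t audio_ms_to_frames(uint32_t ms, uint32_t sample_rate) {
    return (uint32_t)(((uint64_t)ms * sample_rate) / 1000);
}
```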

> That said, we will have to put some thought into ensuring older chipsets can still do basic audio, which may be 8-bit, unsigned, mono.

Those ports will have to put some work into converting the signed 16-bit buffers into 8-bit unsigned output. There may be other potential platforms, but I think the ESP32 audioio.AudioOut, with its built-in 8-bit DAC, should be the target for that testing. (Although only RP2350 has these libraries enabled at the moment.)
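
That final downconversion could look something like this minimal C sketch, which reduces the common signed 16-bit samples to unsigned 8-bit for the DAC; the function name is illustrative only:

```c
#include <stdint.h>
#include <stddef.h>

// Sketch: reduce signed 16-bit samples to the unsigned 8-bit range a
// built-in DAC expects.
void audio_downconvert_to_u8(uint8_t *dst, const int16_t *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        // Drop the low 8 bits, then shift from signed to unsigned range.
        dst[i] = (uint8_t)((src[i] >> 8) + 0x80);
    }
}
```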

> If that isn't clear, what I'm envisioning is: audio output <- audio-mixer (has buffer) <- effect1 (no buffer) <- effect2 (no buffer) <- source (buffer if required, e.g. MP3; none for, say, a synth)

The audiosample API currently operates with buffered output in mind, so I'm not sure whether a change like this would require changes to the current ..._get_buffer(...) implementation. Otherwise, I am a fan!

> Create an audio utility library, potentially in a main audio object, and start to move common functionality there

Tbh, I've always thought that these effects would make more sense in a shared audioeffects module regardless of type, especially because it's been difficult to pair certain effects together without adding another module to the core that would otherwise be superfluous. This change might make it easier to share common functionality.

> Try not to break anything!

Let's hope! But a lot of these changes are going to need to be at least 10.x.x. When doing a restructure like this within CPy, is it typical to wait until a major release, or to create the new modules and point the old ones to the new modules with a deprecation notice?

@gamblor21
Member

A couple of other thoughts (not sure if they are great or bad ideas yet):

  • SYNTHIO_MAX_DUR is referenced often with the LFO ticking. Maybe it's worth picking a set processing length (see the sketch after this list).
  • And in that theme, maybe it's then worth moving some of the LFO functionality outside of synthio? I haven't thought too long on that.
  • Also, as the Zephyr work progresses, being able to properly schedule these actions (rather than running them in RUN_BACKGROUND_TASKS) should be considered.
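
A fixed processing length would let the core tick every LFO once per block, regardless of which effect owns it. The sketch below is a hypothetical illustration of that idea; the names and the 256-frame block size are assumptions, not existing CircuitPython code:

```c
#include <stdint.h>
#include <stddef.h>

// Assumed standardized block length, in frames.
#define AUDIO_BLOCK_FRAMES 256

// A minimal linked list of tickable LFOs.
typedef struct lfo {
    struct lfo *next;
    void (*tick)(struct lfo *self, uint32_t frames);
} lfo_t;

static lfo_t *lfo_list;

// Called once per processed block so every LFO advances at the same
// cadence, regardless of which audio object registered it.
void audio_tick_lfos(void) {
    for (lfo_t *l = lfo_list; l != NULL; l = l->next) {
        l->tick(l, AUDIO_BLOCK_FRAMES);
    }
}
```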

@gamblor21
Member

A couple of thoughts on the internal sample data type (these would have to target CP 10):

  • Floating point is likely too much (but maybe not?) for audio samples, but in many cases we use int32 in a module and then compress into int16 for output. Would it make sense to use int32 through the entire internal processing chain?
  • If int32 is used for samples, a step is then required to go to int16 (or int8) for output, and then we could offer options: e.g. hard clipping (sketched after this list), the current synthio mix-down, or other compression algorithms that are out there.
  • By moving the compression to the end, we avoid doing it multiple times during audio processing, saving CPU cycles and avoiding the audio artifacts introduced by multiple compressions.
  • I have not looked yet, but I think a hook into the audio output objects would work (with a default algorithm in place).
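
As one example of such an output-stage option, here is a hedged C sketch of a hard-clip stage, assuming the internal chain accumulates in int32 and only the output object reduces to int16. The function name and hook are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>

// Sketch: end-of-chain hard clip from the int32 working format to int16.
void audio_output_hard_clip(int16_t *dst, const int32_t *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        int32_t v = src[i];
        if (v > INT16_MAX) v = INT16_MAX; // clip positive overflow
        if (v < INT16_MIN) v = INT16_MIN; // clip negative overflow
        dst[i] = (int16_t)v;
    }
}
```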
