
Implement audio normalization using BASS #27793

Open
wants to merge 88 commits into base: master

Conversation

smallketchup82
Contributor

smallketchup82 commented Apr 4, 2024

Closes #1600

Prereqs:

Audio Normalization

Audio normalization is the process of measuring the loudness of audio and standardizing (normalizing) it to a set level. If you normalize a number of audio files to the same level, they will all sound roughly equally loud. In practice, loud songs get turned down and quiet songs get turned up, which reduces how often you need to reach for your volume sliders, and yes, it means you can finally play The Big Black without having your eardrums reduced to shreds.

Here's a video showing what audio normalization sounds like in osu!. In the video, all 3 of my volume sliders are set to max and the beatmap songs are being normalized to -14 LUFS. Focus on how loud each song is: you should find they all sit at about the same volume regardless of the instruments in the song. I find tech maps work best for hearing what normalization does.

normalization.mp4

Description of Approach

So, I decided to take a different approach from the old PR. The old PR implemented the loudness algorithm manually, which isn't ideal since we'd have to maintain it ourselves. This PR uses BASSLoud, which is maintained upstream alongside BASS and will get regular updates. I also noticed a lot of missing pieces in the old PR, such as recalculating the normalization values in the background. I've kept more or less the same structure as the old PR but improved upon it heavily.

As for the normalization itself, this implementation measures loudness using the ITU-R BS.1770-4 standard but normalizes to -14 LUFS rather than the broadcast-oriented -23 LUFS. I've spent a couple of hours playing the game on my branch to get a feel for it, and -14 LUFS best meets my expectations: the music isn't so quiet that the effects feel really loud, yet it's loud enough that 100% volume actually feels like 100% volume. Additionally, -14 LUFS is what YouTube and Spotify normalize to, so it's probably the best level to normalize to when it comes to music, whereas -23 LUFS, as recommended by the EBU, is aimed at movies and TV programmes.

My PR uses integrated loudness, i.e. the average loudness across the entire song. This is different from other common loudness measures, such as true peak, which looks only at the loudest points of the song. Integrated loudness generally works better than true peak for normalization because it reflects how the song sounds as a whole: sudden sounds (say, SFX), a sudden increase in vocal volume (such as in a chorus), or a sudden swell of an instrument (say, a finisher at the end of a song) barely influence the measurement. A song that is relatively quiet overall but has one loud section won't be turned down just because of that section. The old PR calculated and used a variety of measures, including true peak, which was completely unnecessary.
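
To make the numbers concrete, converting a measured integrated loudness into a volume multiplier only takes a couple of lines. This is an illustrative sketch, not the exact code in the PR:

using System;

public static class LoudnessNormalization
{
    public const double TargetLufs = -14.0;

    // Converts a measured integrated loudness (in LUFS) into the linear volume
    // multiplier needed to reach the -14 LUFS target.
    public static double GetScale(double measuredLufs)
    {
        double gainDb = TargetLufs - measuredLufs;  // e.g. a -8 LUFS track needs -6 dB of gain
        return Math.Pow(10, gainDb / 20.0);         // -6 dB is roughly 0.5x volume, +6 dB roughly 2x
    }
}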

Description of Changes

This PR adds the ability for osu! to normalize audio tracks to a target level using BassLoud. The implementation goes through Realm looking for beatmaps which do not yet have loudness values stored, then calculates those values in a background process and caches them in Realm. When a beatmap's audio track is loaded, the stored value is fetched if it exists and used to create a Volume FX filter that normalizes the track; if it doesn't exist, the old global volume reduction (0.8) is applied instead.
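
In other words, when a track is loaded, the volume that gets applied boils down to something like the following. The names here are hypothetical (this is a sketch of the flow, not the PR's actual API); the PR applies the result through a BASS Volume FX filter:

// Returns the volume to apply to a beatmap track, falling back to the old
// global reduction when no loudness value has been cached yet.
public static double GetTrackVolume(double? storedLoudness)
{
    const double fallback_reduction = 0.8;  // the old hard-coded global reduction

    return storedLoudness != null
        ? LoudnessNormalization.GetScale(storedLoudness.Value)  // cached value exists: normalize towards -14 LUFS
        : fallback_reduction;                                   // not processed yet: keep the old behaviour
}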

Changes to Realm

Since we're now tracking audio normalization values in Realm, we need to bump the schema version, otherwise a migration error is thrown. Apart from adding the new object mapping, the only change to Realm here is that version bump.
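
Roughly, the new mapping looks like the following sketch. The property name and base class are illustrative and may differ from what the PR actually uses:

using Realms;

public class AudioNormalization : EmbeddedObject
{
    // Integrated loudness of the beatmap's track in LUFS, as measured in the background process.
    public float IntegratedLoudness { get; set; }
}

Adding a new object like this is effectively a new "column" in the schema, which is why the version bump is required for Realm to run a migration instead of throwing.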

Testing

Testing this is a bit involved since it depends on changes spread across several repositories. It should get much easier as the prerequisite changes begin to get merged.

  1. Clone the osu branch in this PR, my ManagedBass branch, and my osu-framework branch.
  2. In osu-framework, pack NativeLibs into a nupkg, then add a nuget source for it. Downgrade the NativeLibs version that osu-framework is using to 1.0.0.
  3. Checkout hwsmm's osu-framework branch
  4. Run the following commands in the osu-framework directory (assuming you cloned everything into the same directory; if not, adjust the paths yourself):
dotnet remove osu.Framework/osu.Framework.csproj ppy.ManagedBass
dotnet add osu.Framework/osu.Framework.csproj reference ../ManagedBass/src/Bass/Portable/Bass.csproj
dotnet add osu.Framework/osu.Framework.csproj reference ../ManagedBass/src/AddOns/BassLoud/BassLoud.csproj
dotnet restore osu.Framework
  5. Go back to osu! lazer (osu) and run UseLocalFramework.sh. For good luck, run dotnet restore osu.Desktop just to make sure everything is correctly restored.
  6. If all is well, you should be able to run dotnet run (or your IDE's equivalent) in osu.Desktop with no problems.

Feel free to ping me in the #osu channel of the osu!dev discord if you need help with this.

Final remarks (notes before reviewing)

Please read this section before reviewing

  • Please backup your realm file before testing!
  • Please go through this commit by commit. I leave a lot of useful information in my commit descriptions. Start at 7a3ccf3.
    • Sorry about the commits before that one; I tend not to think about variable names in rough drafts and proof-of-concept code.
  • I intentionally left out tests since this isn't really something you can just slap a test on. It's best verified by running the full client and playing on the branch for a while.

Developed with help from @hwsmm (thanks!!!)

still wip, committing so i can grab on a diff pc
This temporarily disables it until I can get around to completely removing it
So I forgot to free the stream when everything's finished. This would cause the BASS stream to exist eternally, meaning the garbage collector could never collect it, resulting in a ton of RAM usage. Additionally, 60k bytes is a lot for something like this, so I reduced the buffer size to 10k bytes.
Hopefully fixes a bug I've been noticing. The bug itself is pretty hard to reproduce so I'm just going to add this in hoping that it fixes things. Will likely talk more about it in the PR description
Seems like this is required since we've effectively added a new "column" to realm
Will remove VolumeOffset in a future commit. This is mainly so that you can change the target level and not have to go through recalculating again
So I've played on my branch for a couple of hours and tried different values. -14 LUFS seems to meet my expectations the most. Plus, YouTube and Spotify use it, so it's probably better for music.
So while I was looking through the diff, I noticed that I could simply move the processedcount incrementor above the if...continue statement and repurpose the notification to read "Verifying loudness levels" instead of "Calculating loudness levels". This is an easy fix that still offers a sense of transparency to the user.
- Remove force non-null from effect in WorkingBeatmap
- Use Count property instead of method in BackgroundDataStoreProcessor
- Remove weird import
- Linter seems to not want to inherit the IEquatable of the interface since the IEquatable of AudioNormalization already inherits the IEquatable of the interface. Removing it and building shows no issues, so I'll go ahead with it
@smallketchup82
Contributor Author

Quick update:
Please don't review/merge this right now as I'm in the process of reviewing smallketchup82#1

Improvements for audio normalization
@smallketchup82
Contributor Author

smallketchup82 commented May 20, 2024

Finished reviewing the new implementation myself. This should be in an acceptable state for review from the core team.

I want to raise some concerns, however:

  • This new implementation, while significantly simplifying things, also adds a new setting to toggle normalization. We weren't sure whether to keep that in, but it was easy enough to implement that we figured why not. I also wanted some input on Improvements for audio normalization smallketchup82/osu#1 (comment): the implementation originally bound hitsound volume adjustment to the beatmap hitsounds setting, meaning the volume offset would only be applied to hitsounds if that setting was turned on. I've decided to remove that since it didn't seem correct to me, but I want to let the core team call that shot; if keeping that behaviour is preferred, it can be added back in.
  • The volume ceiling is... iffy. -14 LUFS makes 100% volume pleasant to listen to, but at the same time, it causes a couple of issues. A lot of players prefer their hitsounds to be louder than the music. Since 100% master volume is already at a level which is considered good/loud enough for your ears, setting music to 50% makes it really quiet.
    • My original idea to fix this was adding another setting called "volume boost" which would increase the target level from -14 to something like -10, allowing the user to increase the maximum output volume of lazer, and therefore enable them to adjust effects and music volume without it being really quiet
    • At the same time, we can just leave this as is, since there's already a setting baked in to disable normalization.
  • As stated in the new implementation's PR, a fallback volume reduction of 0.8 doesn't really match well with -14 LUFS. Turning normalization off results in your ears begging for mercy. A better value for the fallback volume reduction should probably be found.

@smallketchup82
Contributor Author

smallketchup82 commented May 22, 2024

Here's an updated video demonstrating normalization in various scenarios (since it's rather hard to test, I figured making this video would be useful to an extent). I would highly recommend making sure your system volume is at a comfortable level, then watching the video at maximum volume (for the video, not your system). This should give you a feel for what -14 LUFS actually sounds like, and how the global reduction compares.

I threw it together in 20 minutes in premiere pro, so the quality is horrible (sorry lol)

Audio.Normalization.Demo.reencoded.mp4

@nekodex
Contributor

nekodex commented May 22, 2024

Lemme just state up front that I've not looked at or reviewed the code, so take that into account when reading my feedback.

I also wanted some input on Improvements for audio normalization smallketchup82/osu#1 (comment): the implementation originally bound hitsound volume adjustment to the beatmap hitsounds setting, meaning the volume offset would only be applied to hitsounds if that setting was turned on. I've decided to remove that since it didn't seem correct to me, but I want to let the core team call that shot; if keeping that behaviour is preferred, it can be added back in.

Hmm after giving this some more thought, I'm not 100% sure.

My initial gut reaction says that maybe the volume adjustments should only apply to beatmap hitsounds (i.e. mapper/custom hitsounds included with beatmaps)? But on the other hand, if a beatmap was created with legacy/lazer hitsounds in mind, then we'd want those to be adjusted too. As the linked comment brings up, it might be a bit weird for hitsounds to change volume between beatmaps, but I think maybe that's fine?

Maybe @ppy can provide a second opinion?

This also raises the question of what happens if someone maps on lazer with normalization on and then a player plays with normalization off... or vice versa. I don't really have an answer to this.

  • The volume ceiling is... iffy. -14 LUFS makes 100% volume pleasant to listen to, but at the same time, it causes a couple of issues. A lot of players prefer their hitsounds to be louder than the music. Since 100% master volume is already at a level which is considered good/loud enough for your ears, setting music to 50% makes it really quiet.

Does this PR also remove lazer's current hard-coded global volume reduction? If not, that might be what's making things too quiet. I'm also not sure what curve the volume controls currently use, but we could potentially change that if it's ramping volume down too fast?

  • My original idea to fix this was adding another setting called "volume boost" which would increase the target level from -14 to something like -10, allowing the user to increase the maximum output volume of lazer, and therefore enable them to adjust effects and music volume without it being really quiet

I don't know how the rest of the team feels about adding more user preferences, but we could do something similar to what Spotify has and include a dropdown to choose between Loud (-11 LUFS), Normal (-14 LUFS) and Quiet (-19 LUFS) normalization options.
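
Something like the following mapping is all it would take; this is purely an illustrative sketch (the names are made up, and none of this is in the PR):

public enum NormalizationPreset
{
    Quiet,   // -19 LUFS
    Normal,  // -14 LUFS (the PR's current target)
    Loud     // -11 LUFS
}

public static double GetTargetLufs(NormalizationPreset preset) => preset switch
{
    NormalizationPreset.Loud => -11.0,
    NormalizationPreset.Quiet => -19.0,
    _ => -14.0,
};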

Alternatively, we could just outright normalize to a louder level and lower the global music volume by default, allowing players to turn the music volume up if they desire - similar to how (I believe) the master volume control works now?

It's also probably worth mentioning that some players have complained in the past when we've added volume reductions to lazer, as their computer setups aren't able to increase the output volume enough otherwise, for whatever reasons. Thus normalizing to a louder level and reducing via the global volume controls might be the preferable choice if we don't want players just reactionarily disabling normalization.

This might bring the risk of the mixed output going over 0dB and/or clipping, but ideally in the future everything will be running through a global compressor/limiter anyway to prevent this.

  • As stated in the new implementation's PR, a fallback volume reduction of 0.8 doesn't really match well with -14 LUFS. Turning normalization off results in your ears begging for mercy. A better value for the fallback volume reduction should probably be found.

I think it's probably fine for the volume to jump up when normalization is disabled - it's common behaviour with music players. If it's that much of a volume jump, we could maybe bump the global master or music volumes down by some amount when normalization is disabled to alleviate it somewhat, but still allow players to undo the reduction if they wish.

@smallketchup82
Contributor Author

smallketchup82 commented May 22, 2024

it might be a bit weird for hitsounds to change volume between beatmaps

My thinking is that hitsounds will likely be changing in volume regardless. As peppy said in our discussion in the osu!dev Discord, mappers will typically try to pseudo-normalize hitsounds to the volume of the track themselves by setting every hitobject to a certain volume. To me, adjusting the volume offset for hitsounds regardless of whether beatmap hitsounds are on or off is probably the correct way to go about it, since the goal is to retain the relationship between the track and the hitsounds, and only doing that in certain circumstances sounds counter-intuitive and unexpected. Though it's probably a question that we'd want more opinions on.

Does this PR also remove lazer's current hard-coded global volume reduction?

Yes, but I don't think it removes the volume reduction when first starting the game from a fresh install.

I don't know how the rest of the team feels about adding more user preferences, but we could do something similar to what Spotify has and include a dropdown to choose between Loud (-11 LUFS), Normal (-14 LUFS) and Quiet (-19 LUFS) normalization options.

Compared to the option of normalizing to a louder level but setting the default volume lower, like 50%, I find myself siding more with the dropdown approach (mostly from personal preference, no real basis for it). But to be honest, either approach would work well and would be easy to implement. I'm pretty indecisive on this, so I think it might be something we'd want more opinions on, or to poll somewhere (in a discussion, the osu!dev Discord, etc.).

if we don't want players just reactionarily disabling normalization.

Honestly, when it's put that way, it kinda makes me want to reconsider adding the ability to enable or disable normalization. I believe that normalization should be on for most players unless there's a very good reason not to, since the pros of normalization outweigh the cons. I'm also sorta indecisive on this.

I think it's probably fine for the volume to jump up when normalization is disabled

After reading your thoughts on this, I think that I'll just keep it as is. While reducing the volume globally to get it to be close to -14 LUFS would be nice, I realized that players probably wouldn't like not being able to revert the normalization changes themselves (they would expect volume to be similar to before normalization was implemented if they disable normalization).

Thanks for your thoughts! I'd definitely like to get some more opinions on those concerns, minus the 3rd one cause I'm probably just going to leave it alone.

@smallketchup82
Contributor Author

Resolved merge conflicts and fixed some things as a result of ppy/osu-framework#6233 getting rebased onto master. Tested everything and still works as expected.

this is honestly a nitpick on my behalf, but i figured it's better than what we had before.

when i was looking at the setting in the settings menu, simulating a new player, i asked myself "normalize audio? what audio?". for all the user knows, this could be effects, beatmaps, hitsounds, anything.

this commit changes the text of the setting to read "Normalize Beatmap Audio" to further describe that it's normalizing only beatmap audio
@smallketchup82
Contributor Author

smallketchup82 commented Oct 17, 2024

foreach (BeatmapSetInfo beatmapSetInfo in r.All<BeatmapSetInfo>().Where(b => !b.DeletePending))

Currently this does not check if the audio file plays correctly, so if the user has any beatmaps with a broken audio file, this will attempt to process them, fail, then notify the user of the failure. This happens on every startup.

It can easily become annoying, but the fix is painful. You'd have to attempt to open a BASS stream for every audio file, check that it opens without errors, close the stream, and then proceed with the processing (roughly the check sketched below). You'd need to do this for every unprocessed beatmap, on every startup. It would noticeably slow the loop down, and the complexity of the solution is an even bigger problem than that.
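
For reference, the per-file check itself would look something like this. It's just a sketch (it assumes BASS has already been initialised by the game host); Bass.CreateStream and Bass.StreamFree are real ManagedBass calls:

using ManagedBass;

// Returns whether BASS can actually open and read the given audio file.
static bool canOpenAudioFile(string path)
{
    // A decode-only stream is enough to verify the file is readable; no playback needed.
    int handle = Bass.CreateStream(path, 0, 0, BassFlags.Decode);

    if (handle == 0)
        return false;  // Bass.LastError would describe why (missing file, unsupported format, ...)

    Bass.StreamFree(handle);
    return true;
}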

I didn't do anything back then, and I'll probably still leave it to be addressed in review. Though I might revisit that loop later on, because it could really use a refactor.

I mean, as bad as it is, it's honestly a decent (yet very annoying) reminder to the user to clean out beatmap sets that they quite literally cannot play 😆

Successfully merging this pull request may close these issues: Standardise beatmap track audio levels