
@Purfview
Contributor

@Purfview Purfview commented Dec 25, 2025

VAD Fix: explicit safe context (no view mutation).
Reduced SileroVADModel RAM usage by 9% [on 4h audio].

This PR doesn't affect VAD speed or the probs output of the current FW VAD functionality.
It fixes inconsistent outputs across multiple model calls [such functionality is currently not implemented].
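
For readers unfamiliar with the "view mutation" wording, here is a minimal sketch of the general NumPy pitfall. It is not the code from this PR: `build_context_unsafe`, `build_context_safe`, the 3x4 chunk layout and the context size of 2 are all hypothetical names and values chosen for illustration.

```python
import numpy as np


def build_context_unsafe(chunks: np.ndarray, context_size: int) -> np.ndarray:
    """Hypothetical helper: the slice is a view, so the write leaks into `chunks`."""
    context = chunks[..., -context_size:]  # basic slicing returns a view
    context[:, -1] = 0                     # also overwrites samples in the caller's array
    return context


def build_context_safe(chunks: np.ndarray, context_size: int) -> np.ndarray:
    """Hypothetical helper: copy first, so the caller's samples stay untouched."""
    context = chunks[..., -context_size:].copy()
    context[:, -1] = 0
    return context


audio = np.arange(1, 13, dtype=np.float32)  # stand-in for audio samples

# reshape() on a contiguous array also returns a view, so the mutation
# travels all the way back to the original 1-D buffer.
unsafe_input = audio.copy()
build_context_unsafe(unsafe_input.reshape(3, 4), context_size=2)

safe_input = audio.copy()
build_context_safe(safe_input.reshape(3, 4), context_size=2)

print(np.array_equal(unsafe_input, audio))  # False: last sample of each chunk was zeroed
print(np.array_equal(safe_input, audio))    # True: input left as it was
```

If a model call alters its input like the unsafe variant does, a second call on the same array sees different samples than the first, which is one plausible way to get inconsistent outputs across calls.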

@MahmoudAshraf97
Collaborator

Hi, I get that this may modify the array in place, but I don't understand how that affects multiple model calls. Do you mean consecutive calls on the same audio array, or multiple calls in parallel for the same or different audios?

@Purfview
Contributor Author

Purfview commented Dec 28, 2025

do you mean consecutive calls on the same audio array or multiple calls in parallel for the same or different audios?

As I remember, it affected both consecutive and parallel calls with audios of different sizes.

I'm not sure now; I think it affected same-size audios too, but only(?) the last prob was affected.

@Purfview
Contributor Author

Purfview commented Dec 29, 2025

Analyzed probs from batched VAD [with 100% overlap] vs non-batched VAD.

Pre-patch stats:

Total probs             : 224965
Total diff probs        : 112396
Above > 0.00001 diffs   : 641
Above > 0.0001 diffs    : 210
Above > 0.001 diffs     : 54
Above > 0.01 diffs      : 0

After the patch:

Total probs             : 224965
Total diff probs        : 97207
Above > 0.00001 diffs   : 0
Above > 0.0001 diffs    : 0
Above > 0.001 diffs     : 0
Above > 0.01 diffs      : 0 
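
A comparison like the one above can be sketched roughly as follows. This is not the script used to produce these numbers: `diff_stats` and the sample values are hypothetical, and it assumes both runs produce one prob per audio chunk in the same order.

```python
def diff_stats(reference, candidate, thresholds=(1e-5, 1e-4, 1e-3, 1e-2)):
    """Count how many probs differ at all, and how many differ by more than each threshold."""
    if len(reference) != len(candidate):
        raise ValueError("both runs must produce the same number of probs")

    abs_diffs = [abs(a - b) for a, b in zip(reference, candidate)]
    return {
        "total": len(abs_diffs),
        "differing": sum(d > 0 for d in abs_diffs),
        "above": {t: sum(d > t for d in abs_diffs) for t in thresholds},
    }


# Made-up probs standing in for non-batched vs batched VAD output.
non_batched = [0.10, 0.50, 0.90, 0.30]
batched = [0.10, 0.50002, 0.9002, 0.302]

stats = diff_stats(non_batched, batched)
print(f"Total probs      : {stats['total']}")
print(f"Total diff probs : {stats['differing']}")
for threshold, count in stats["above"].items():
    print(f"Above {threshold:g} diffs : {count}")
```

Counting exact inequality separately from the thresholded counts keeps tiny floating-point differences apart from differences large enough to matter.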

Off-topic:
I'm leaning towards a small hardcoded overlap. I need to do more analysis, then I'll make a PR for batched VAD.
