Fix event buffer memory leak #2160
Conversation
Pull request overview
This PR fixes a critical memory leak in the event buffer system that caused the process to reach ~2GB RAM and crash during history sync operations. The root cause was the creation of thousands of individual setTimeout timers that retained message data in closures, preventing garbage collection.
Key Changes:
- Added a debounced flush timeout mechanism to ensure only one pending flush timer exists at a time
- Replaced the custom mutex implementation with the battle-tested `async-mutex` library
- Added reference counting to the keyed mutex for automatic cleanup
- Minor code formatting improvement for operator precedence clarity
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/Utils/event-buffer.ts | Implements the core memory leak fix by debouncing the flush timeout, ensuring only a single pending timer exists instead of thousands during high-volume message processing |
| src/Utils/make-mutex.ts | Refactors mutex implementation to use async-mutex library directly, simplifying code and adding automatic cleanup via reference counting for keyed mutexes |
| src/Socket/messages-recv.ts | Adds explicit parentheses around await expression for improved code clarity (cosmetic change only) |
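For illustration, the reference-counted keyed mutex described in the `make-mutex.ts` row could look roughly like the sketch below. The names are hypothetical (this is not the actual PR code), and a plain promise chain stands in for `async-mutex`'s `Mutex` so the example is self-contained: each key serializes its tasks, and the map entry is dropped once no callers remain, which is the "automatic cleanup" the reference counting provides.

```typescript
// Sketch of a reference-counted keyed mutex (hypothetical names).
// A promise chain per key serializes tasks; refs counts in-flight
// callers so the entry can be deleted when the key goes idle.

type Entry = { tail: Promise<unknown>; refs: number }

function makeKeyedMutex() {
    const entries = new Map<string, Entry>()

    function runExclusive<T>(key: string, task: () => Promise<T> | T): Promise<T> {
        let entry = entries.get(key)
        if (!entry) {
            entry = { tail: Promise.resolve(), refs: 0 }
            entries.set(key, entry)
        }

        const e = entry
        e.refs += 1
        // queue this task behind whatever is already running for the key
        const run = e.tail.then(task)
        e.tail = run
            .catch(() => undefined) // keep the chain alive if a task rejects
            .finally(() => {
                e.refs -= 1
                // no one left waiting on this key: drop the entry so the
                // map does not grow without bound
                if (e.refs === 0 && entries.get(key) === e) {
                    entries.delete(key)
                }
            })
        return run
    }

    return { runExclusive, pendingKeys: () => entries.size }
}
```

Without the reference counting, a long-lived process would accumulate one mutex per key (e.g. per chat JID) forever, which is its own slow leak.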
Force-pushed from 7f3df2b to 5b489d6
Interesting finding. Good job, mister.
purpshell left a comment:
Good work, this seems to work better. I always suspected something in the flushing mechanism of the buffered event system (I also hate this system and will likely remove it entirely from baileys). A common workaround before was to call ev.flush(), but no one dug deeper to see why the buffer wasn't flushing.
Yes, there are still some memory leaks, but these commits will help reduce memory and CPU usage drastically.
I think the remaining memory leaks are also related to commit ae0cb89.
@YonkoSam Please fix the lint.
It might also be related to media submissions, but I remember a PR addressing this; I don't know if later commits reverted that fix.
This PR is stale because it has been open for 14 days with no activity. Remove the stale label or comment, or this will be closed in 14 days.
We've been investigating a memory leak that causes the process to hit ~2GB RAM and crash during the initial history sync. We traced the regression back to commit ae0cb89.
The Issue:
In `createBufferedFunction`, the `finally` block schedules a `setTimeout(flush, 100)` for every single execution. When thousands of messages come in rapidly (as during a history sync), this creates thousands of individual timers. Because these timers are created within the function scope, each one forms a closure that retains the function arguments (the heavy message data), preventing the garbage collector from freeing that data until the timer actually fires. Essentially, we are holding thousands of messages in RAM simultaneously because of the pending timers.
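The leaky pattern described above looks roughly like this (a simplified sketch with hypothetical names, not the exact Baileys code):

```typescript
// Sketch of the leaky pattern (hypothetical names): every call schedules
// its own flush timer in `finally`. With N rapid calls there are N live
// timers, and each timer's closure scope pins that call's event data
// until the 100 ms delay elapses.

let pendingTimers = 0 // instrumentation for this sketch only

function makeLeakyBuffered(handle: (event: object) => void, flush: () => void) {
    return (event: object) => {
        try {
            handle(event)
        } finally {
            // BUG: one new timer per call. In the real code the timer's
            // closure context keeps the heavy message data reachable,
            // so the GC cannot collect it until the timer fires.
            pendingTimers += 1
            setTimeout(() => {
                pendingTimers -= 1
                flush()
            }, 100)
        }
    }
}
```

Calling such a wrapper a few thousand times in a burst leaves a few thousand timers pending at once, which matches the RAM growth seen during history sync.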
The Fix:
Debounced the flush: modified the logic so that only a single pending flush timeout is scheduled at a time (`flushPendingTimeout`). This allows the GC to clean up the message data immediately after execution.

Tested locally with heavy history syncs; memory usage is now stable.
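A minimal sketch of the debounce idea (hypothetical helper name; only the `flushPendingTimeout` guard is taken from the PR description):

```typescript
// Sketch of the fix: keep at most one pending flush timer. Calls made
// while a flush is already scheduled are no-ops, so no extra timers
// (and no extra retaining closures) accumulate.

function makeDebouncedFlush(flush: () => void, delayMs = 100) {
    let flushPendingTimeout: ReturnType<typeof setTimeout> | undefined

    return () => {
        if (flushPendingTimeout) {
            return // a flush is already scheduled; do not stack timers
        }
        flushPendingTimeout = setTimeout(() => {
            flushPendingTimeout = undefined
            flush()
        }, delayMs)
    }
}
```

With this shape, a burst of thousands of events schedules exactly one timer, and nothing in its closure references the per-event message data, so each event becomes collectable as soon as its handler returns.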