perf: optimise staking endblocker #25486
Conversation
Just out of curiosity: do we know roughly how many entries were being iterated in those queues? Also, were the before/after measurements (1109 → 1.43) taken on the same state size?
For the chain I tested on in particular, there were zero entries, but the sheer number of files the iterator has to go through (I tested on an archive testnet node) resulted in the huge time taken. And yes, the benchmarks were in milliseconds.
Interesting, I didn't know the staking endblocker could take that much time on an archive node. I haven't fully reviewed the code yet, but the direction looks great to me. One potential downside I can think of is the cached data becoming dirty; the performance gain is clear as long as the cached state remains consistent.
Yeap, I haven't fully tested with a dirty cache, but I believe it is safe to assume the performance gain should still be significant: the cache is in memory, and it is eventually cleared because these redelegations/undelegations/unbondings eventually reach maturity.
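A rough sketch of the maturity-based eviction idea mentioned above, assuming a simple map keyed by completion time; the type and method names are hypothetical, not the PR's actual code:

```go
package stakingcache

import (
	"sync"
	"time"
)

// UnbondingCache keeps pending queue entries in memory so the endblocker does
// not need to open a store iterator on every block.
type UnbondingCache struct {
	mu      sync.Mutex
	entries map[time.Time][]string // completion time -> queue entry keys
}

func NewUnbondingCache() *UnbondingCache {
	return &UnbondingCache{entries: make(map[time.Time][]string)}
}

// PopMature removes and returns all entries whose completion time has passed,
// so the cache naturally shrinks as unbondings/redelegations reach maturity.
func (c *UnbondingCache) PopMature(blockTime time.Time) []string {
	c.mu.Lock()
	defer c.mu.Unlock()

	var mature []string
	for t, keys := range c.entries {
		if !t.After(blockTime) {
			mature = append(mature, keys...)
			delete(c.entries, t)
		}
	}
	return mature
}
```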
Description
This PR optimises the staking endblocker with a non-breaking change.
It was discovered that on RocksDB/VersionDB archival nodes, the staking endblocker could take up to 1100 ms, causing them to consistently lag behind. This was the case even when the iterators picked up no entries. As such, there is an urgent need to improve block sync performance.
Root Cause
The slowdown was traced to the following store iterators, each opened by the staking endblocker on every block:
- ValidatorQueueIterator
- UBDQueueIterator
- RedelegationQueueIterator

Changes Made
Instead of scanning the database from the beginning of time up to the latest block height or timestamp on every block, an in-memory cache now stores these entries, significantly reducing I/O. The iterators are invoked only once, during cache initialization when the node starts.
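A minimal sketch of that approach, assuming a generic store-iterator interface; the names here (queueCache, initQueueCache, add) are illustrative and not the PR's actual implementation:

```go
package staking

import "time"

// Iterator is a narrowed view of a store iterator (assumed interface).
type Iterator interface {
	Valid() bool
	Next()
	Key() []byte
	Value() []byte
	Close() error
}

// queueCache holds pending queue entries keyed by their completion time.
// Matured entries are consumed from it each block, as sketched in the
// comment thread above, instead of re-scanning the store.
type queueCache struct {
	byTime map[time.Time][][]byte
}

// initQueueCache is run once at node start: the only place the full-range
// iterator is opened, rather than once per block.
func initQueueCache(iter Iterator, timeFromKey func([]byte) time.Time) *queueCache {
	defer iter.Close()

	c := &queueCache{byTime: make(map[time.Time][][]byte)}
	for ; iter.Valid(); iter.Next() {
		t := timeFromKey(iter.Key())
		c.byTime[t] = append(c.byTime[t], iter.Value())
	}
	return c
}

// add keeps the cache in sync when a new unbonding/redelegation/validator
// queue entry is written to the store, so the cache never goes stale and
// per-block reads never need to hit disk again.
func (c *queueCache) add(completion time.Time, value []byte) {
	c.byTime[completion] = append(c.byTime[completion], value)
}
```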
Results
With telemetry metrics enabled, we observed a significant performance improvement after these optimisations were applied.
Before: ~1109 ms per staking endblocker call (telemetry screenshot)
After: ~1.43 ms per staking endblocker call (telemetry screenshot)
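The figures above come from the SDK's telemetry package; as a point of reference, the endblocker timing is typically emitted roughly as below. Exact signatures vary across cosmos-sdk versions, so treat this as an illustrative sketch rather than the code touched by this PR:

```go
package staking

import (
	"time"

	"github.com/cosmos/cosmos-sdk/telemetry"
	"github.com/cosmos/cosmos-sdk/x/staking/types"
)

// measureEndBlocker wraps the staking endblocker body and reports its
// duration, which is the metric compared in the before/after screenshots.
func measureEndBlocker(run func()) {
	defer telemetry.ModuleMeasureSince(types.ModuleName, time.Now(), telemetry.MetricKeyEndBlocker)
	run()
}
```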