@HugoDutchie

Complementary fix to #92. Extends prune backlog detection to additional components that can fall behind during extended sync periods.

Changes

db/state/aggregator.go

  • Add CommitmentBacklogInfo() to detect CommitmentDomain history pruning backlog

execution/stagedsync/exec3.go

  • Trigger GreedyPruneHistory when CommitmentDomain is >10M txNums behind
  • Add mode transition logging for visibility
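
The trigger and transition logging described above could look roughly like the sketch below. It is not the code from this PR: `BacklogTracker`, `Update`, and `commitmentBacklogThreshold` are hypothetical names, and only the 10M txNum threshold is taken from the description. The idea is that a message is produced only when the mode flips, so the log is not flooded on every prune cycle.

```go
package main

import "fmt"

// commitmentBacklogThreshold mirrors the ">10M txNums behind" figure from the
// change list; the constant name is hypothetical.
const commitmentBacklogThreshold uint64 = 10_000_000

// BacklogTracker is a hypothetical helper that remembers whether greedy
// pruning is currently active so that only mode transitions get logged.
type BacklogTracker struct {
	greedy bool
}

// Update takes the txNum up to which history has been pruned and the latest
// txNum. It reports whether greedy pruning should run and returns a non-empty
// message only when the mode changes.
func (t *BacklogTracker) Update(prunedTo, latest uint64) (bool, string) {
	var backlog uint64
	if latest > prunedTo { // guard against unsigned underflow
		backlog = latest - prunedTo
	}
	greedy := backlog > commitmentBacklogThreshold
	msg := ""
	if greedy != t.greedy {
		msg = fmt.Sprintf("prune mode changed: greedy=%t backlog=%d txNums", greedy, backlog)
	}
	t.greedy = greedy
	return greedy, msg
}
```

A caller would invoke `Update` once per prune cycle and write the returned message to its logger whenever it is non-empty.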

execution/stagedsync/stage_execute.go

  • Add ChangeSets3 backlog detection with aggressive/medium modes:
    • >1M blocks behind → aggressive (unlimited, 5min timeout)
    • >100K blocks behind → medium (100K limit, 5min timeout)
  • Reduce prune timeouts from hours to 5 minutes for faster catchup
  • Promote prune timing logs from Debug to Info
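
As a minimal sketch of the tiering listed above, assuming the backlog is measured in blocks: `PruneMode`, `PrunePlan`, and `SelectPrunePlan` are hypothetical names rather than identifiers from this PR, and the behaviour when there is no backlog is left to the caller because the description does not specify it. A `Limit` of 0 is used here as an assumed convention for "unlimited".

```go
package main

import "time"

// PruneMode is a hypothetical name for the backlog tiers in the change list.
type PruneMode int

const (
	PruneNormal PruneMode = iota
	PruneMedium
	PruneAggressive
)

const (
	mediumBacklogBlocks     uint64 = 100_000   // ">100K blocks behind"
	aggressiveBacklogBlocks uint64 = 1_000_000 // ">1M blocks behind"
	mediumPruneLimit        uint64 = 100_000   // "100K limit"
	backlogPruneTimeout            = 5 * time.Minute
)

// PrunePlan bundles the parameters of one prune run.
// Limit == 0 means "unlimited" in this sketch (an assumed convention).
type PrunePlan struct {
	Mode    PruneMode
	Limit   uint64
	Timeout time.Duration
}

// SelectPrunePlan picks a tier from the ChangeSets3 backlog in blocks.
// The normal-mode limit and timeout are supplied by the caller.
func SelectPrunePlan(backlog, normalLimit uint64, normalTimeout time.Duration) PrunePlan {
	switch {
	case backlog > aggressiveBacklogBlocks:
		return PrunePlan{Mode: PruneAggressive, Limit: 0, Timeout: backlogPruneTimeout}
	case backlog > mediumBacklogBlocks:
		return PrunePlan{Mode: PruneMedium, Limit: mediumPruneLimit, Timeout: backlogPruneTimeout}
	default:
		return PrunePlan{Mode: PruneNormal, Limit: normalLimit, Timeout: normalTimeout}
	}
}
```

Checking the larger threshold first keeps the tiers mutually exclusive, and both comparisons are strict, matching the ">" in the description.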

Testing

Tested on Polygon mainnet node recovering from prune backlog.
MDBX.dat now seems stable, growing only in small steps every once in a while, and the growth does not affect the node's ability to stay at the chain tip.

@HugoDutchie
Author

Any updates on the review? It is not a problem for me, but it seems the initial beta release was not sufficient to solve the problem for other people.

It has been stable on my node for over a week now (with this PR's code included).

@pratikspatil024
Member

Hey @HugoDutchie - sorry for the delay, people from our team will review it shortly.

@HugoDutchie
Author

I have actually noticed there is still a problem with MDBX growth and the node slowing down. It now just takes 10+ days to show through instead of the earlier 2-5 days. This seems like an issue in how Erigon handles MDBX cleanup, and potentially in its internal structure. It also seems to be a problem seen by other clients using MDBX. I don't know where to report this or whether other investigations are under way. It seems like something for the upstream Erigon team to look into, if they are not already.
