core: back-up to kvdb for a pruned block #31638

Open
wants to merge 5 commits into
base: master
Conversation

s1na
Contributor

@s1na s1na commented Apr 14, 2025

This is an attempt at fixing #31601. I think what happens is that the startup logic (in bc.loadLastState) tries to get the full block body and fails because the genesis block has been pruned from the freezer. It then keeps repeating the reset logic, causing a deadlock.

This can happen when, due to an unsuccessful sync, we don't have the full state for the head (or any other state) and try to redo the snap sync.
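
To illustrate the idea of the fix, here is a minimal, self-contained sketch of the fallback. It is not the actual patch: `store` and `readBody` are hypothetical stand-ins for the freezer and the key-value database, and the only behaviour taken from this PR is "if the freezer has no data and the block is genesis, look in kvdb".

```go
package main

import "fmt"

// store is a hypothetical stand-in for a block-body store, keyed by block
// number. A missing key (nil value) means the body is not available.
type store map[uint64][]byte

// readBody looks in the freezer first. If nothing is found and the block is
// genesis (number 0), it falls back to the key-value store, since the genesis
// body may have been pruned from the freezer.
func readBody(freezer, kvdb store, number uint64) []byte {
	if data := freezer[number]; data != nil {
		return data
	}
	if number == 0 {
		return kvdb[0]
	}
	return nil
}

func main() {
	freezer := store{1: []byte("body-1")} // genesis was pruned from the freezer
	kvdb := store{0: []byte("genesis-body")}

	fmt.Printf("%s\n", readBody(freezer, kvdb, 0)) // served from kvdb
	fmt.Printf("%s\n", readBody(freezer, kvdb, 1)) // served from the freezer
	fmt.Println(readBody(freezer, kvdb, 2) == nil) // not available anywhere
}
```

The fallback is deliberately restricted to block 0 in this sketch, which matches the code comment in the diff below.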

// The freezer might be pruned. In the particular case of genesis, the block
// will be still available in kvdb. The full genesis block is needed on startup
// sometimes for repair.
if data == nil {
Member

Should we special-case only the genesis block?

Contributor Author

s1na commented Apr 14, 2025

The alternative would be to relax the condition in loadLastState so that only the header is required when the head is the genesis block, like so:

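```go
	var headBlock *types.Block
	if headBlockNum == 0 {
		// At genesis only the header is required: the body may have been
		// pruned from the freezer.
		if header := bc.GetHeaderByHash(head); header != nil {
			headBlock = types.NewBlockWithHeader(header)
		}
	} else {
		// Assign to the outer variable; ":=" here would shadow it and
		// leave headBlock nil.
		headBlock = bc.GetBlockByHash(head)
	}
	if headBlock == nil {
		// Corrupt or empty database, init from scratch
		log.Warn("Head block missing, resetting chain", "hash", head)
		return bc.Reset()
	}
```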

@fjl fjl added this to the 1.15.9 milestone Apr 15, 2025
@s1na s1na marked this pull request as ready for review April 16, 2025 14:03

s1na commented Apr 16, 2025

I was trying to test this patch and hit this panic:

INFO [04-16|15:29:27.414] Loaded most recent local snap block      number=3,428,602 hash=e71201..025ec2 age=1y11mo3w
INFO [04-16|15:29:27.414] Loaded last snap-sync pivot marker       number=8,131,055
INFO [04-16|15:29:27.414] Genesis state is missing, wait state sync
WARN [04-16|15:29:27.414] Snapshot maintenance disabled (syncing)
INFO [04-16|15:29:27.414] Initialized transaction indexer          range="last 135000 blocks"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xaadcb9]

goroutine 54 [running]:
github.com/ethereum/go-ethereum/core/types.(*Block).NumberU64(...)
        github.com/ethereum/go-ethereum/core/types/block.go:392
github.com/ethereum/go-ethereum/core.(*txIndexer).loop(0xc0003b53c0, 0xc000429408)
        github.com/ethereum/go-ethereum/core/txindexer.go:208 +0x99
created by github.com/ethereum/go-ethereum/core.newTxIndexer in goroutine 1
        github.com/ethereum/go-ethereum/core/txindexer.go:70 +0x146
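
The trace shows `NumberU64` being called on a nil `*Block` inside `txIndexer.loop`. The fix actually applied is not shown in this thread; the following is only a sketch of the general guard, using a hypothetical `block` type and a hypothetical `headNumber` helper rather than the real go-ethereum code.

```go
package main

import "fmt"

// block is a hypothetical stand-in for types.Block.
type block struct{ number uint64 }

// NumberU64 dereferences the receiver, so calling it on a nil *block panics
// with the same kind of nil pointer dereference seen in the trace above.
func (b *block) NumberU64() uint64 { return b.number }

// headNumber returns the head block number, or ok=false when no head block
// is available, so the caller can skip work instead of panicking.
func headNumber(head *block) (n uint64, ok bool) {
	if head == nil {
		return 0, false
	}
	return head.NumberU64(), true
}

func main() {
	if _, ok := headNumber(nil); !ok {
		fmt.Println("head block unavailable, skipping")
	}
	if n, ok := headNumber(&block{number: 42}); ok {
		fmt.Println("head number:", n)
	}
}
```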


s1na commented Apr 16, 2025

OK, after fixing this I now get the following repeated in my logs, and the node doesn't proceed:

WARN [04-16|16:01:13.097] Rewinding blockchain to block            target=3,428,589
INFO [04-16|16:01:13.142] Loaded most recent local header          number=3,428,589 hash=0ff518..40de0d age=1y11mo3w
INFO [04-16|16:01:13.142] Loaded most recent local block           number=0         hash=25a5cc..3e6dd9 age=3y7mo1d
INFO [04-16|16:01:13.142] Loaded most recent local snap block      number=3,428,589 hash=0ff518..40de0d age=1y11mo3w
INFO [04-16|16:01:13.142] Loaded last snap-sync pivot marker       number=8,131,277
ERROR[04-16|16:01:13.142] Current block not found in database      block=0 hash=25a5cc..3e6dd9
ERROR[04-16|16:01:13.142] Beacon backfilling failed                err="current block missing: #0 [25a5cc10..]"


s1na commented Apr 16, 2025

After adding the exception in SetHead:

INFO [04-16|16:12:05.492] Forkchoice requested sync to new head    number=8,131,372 hash=bc4275..c5b543 finalized=unknown
INFO [04-16|16:12:05.576] Syncing beacon headers                   downloaded=512 left=0 eta=0s
ERROR[04-16|16:12:05.576] Latest filled block is not available
INFO [04-16|16:12:05.577] Block synchronisation started
ERROR[04-16|16:12:05.577] Reject duplicated disable operation
WARN [04-16|16:12:05.590] Rewinding blockchain to block            target=3,428,563
INFO [04-16|16:12:05.626] Loaded most recent local header          number=3,428,563 hash=9513d0..81c1cb age=1y11mo3w
INFO [04-16|16:12:05.626] Loaded most recent local block           number=0         hash=25a5cc..3e6dd9 age=3y7mo1d
INFO [04-16|16:12:05.627] Loaded most recent local snap block      number=3,428,563 hash=9513d0..81c1cb age=1y11mo3w
INFO [04-16|16:12:05.627] Loaded last snap-sync pivot marker       number=8,131,308
INFO [04-16|16:12:05.627] Truncated excess ancient chain segment   oldhead=3,428,564 newhead=3,428,563
CRIT [04-16|16:12:05.627] Failed to reset txpool state             err="missing trie node 5eb6e371a698b8d68f665192350ffcecbbbf322916f4b51bd79bb6887da3f494 (path ) state 0x5eb6e371a698b8d68f665192350ffcecbbbf322916f4b51bd79bb6887da3f494 is not available"
