Change full-sync to download headers from latest to genesis

Related to #2126

After the merge, clients are supposed to request headers from the latest one back to genesis. This avoids problems with reorgs or long-lived side-chains near the tip.
It also closes a DoS vector: since producing a new valid block no longer requires a proof-of-work, byzantine nodes could feed us a long chain that doesn't connect to the latest block our consensus client gave us. By syncing from the latest block first, we can cheaply detect that a chain doesn't connect to our latest block.
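
As a rough illustration of that check, here is a minimal sketch that validates a batch of headers received newest-first against the head hash provided by the consensus client. The `Header` struct and `connects_to_head` function are hypothetical names for this example only, and hashes are modelled as `u64` instead of 32-byte values; in a real client the hash would be computed from the header rather than stored as a field.

```rust
/// Hypothetical, simplified header used only for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Header {
    pub number: u64,
    /// Stand-in for the 32-byte block hash (assumption: precomputed).
    pub hash: u64,
    /// Stand-in for the parent's 32-byte block hash.
    pub parent_hash: u64,
}

/// Returns true if `headers` (ordered newest first) starts at the block
/// identified by `trusted_head_hash` and every header links to the next
/// one through `parent_hash` with consecutive block numbers.
pub fn connects_to_head(trusted_head_hash: u64, headers: &[Header]) -> bool {
    // An empty response proves nothing, so treat it as "does not connect".
    let Some(first) = headers.first() else {
        return false;
    };
    if first.hash != trusted_head_hash {
        return false;
    }
    // Each adjacent pair must be (child, parent).
    headers.windows(2).all(|pair| {
        pair[0].parent_hash == pair[1].hash
            && pair[1].number.checked_add(1) == Some(pair[0].number)
    })
}
```

With ascending sync, a peer can keep us busy with a long chain before we learn it never reaches the head. With descending sync, the very first header either matches the trusted hash or the batch is rejected.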

Full-sync logic currently:

1. Fetches the current head from the DB.
2. Sends a headers request to peers (with retry and timeout), ascending from the current head.
    1. If the request fails, returns.
    2. If the response is empty, retries.
3. Sets the current head to the head's parent if we detect a reorg (#2126).
    1. Note that this doesn't modify the DB, so any network failure will reset the head.
4. Updates the current head with the latest downloaded header.
5. `process_incoming_headers`: requests block bodies, executes the batch, and stores it in the DB.
6. If the sync head was found, breaks; otherwise loops back to 2.

What we want:

1. Try to fetch the sync-head from the DB, and use that as our initial "tail" (the lowest header we have that connects to our sync-head).
2. Download headers from latest to genesis, in a loop:
    1. Check we don't have the header in the DB first. If we do, update the current tail of the chain to that block's parent and `continue`. (*)
    2. Request headers from peers, descending from the current tail.
    3. Store headers directly in DB, skipping any known headers, and removing any data from side-chains we already stored, like account states or block numbers.
    4. If the chain connects to genesis, `break`.
3. Download and execute bodies from genesis to head. We can do this sequentially for the first version, and later switch to parallel downloads.

(*) Note that we'll need to add an invariant or some kind of metadata to mark a header as "we got all headers from here to genesis". This avoids the common case where we need to fetch a single block, so we fetch it, and then fetch its parent, its parent's parent, and so on, until we get all the way to genesis. For a first version, we could just have it in memory, but it should be easy to persist that information in the DB.
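
The header-download loop and the "complete down to genesis" marker could be sketched roughly as follows. Everything here is an assumption made for illustration: `Store`, `sync_headers_backwards`, and the `request_descending` callback are hypothetical names, the store is an in-memory map rather than the real DB, hashes are `u64` stand-ins, and side-chain cleanup and body execution are left out.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified header; hashes are u64 stand-ins.
#[derive(Debug, Clone)]
pub struct Header {
    pub number: u64,
    pub hash: u64,
    pub parent_hash: u64,
}

/// In-memory stand-in for the DB.
pub struct Store {
    /// Known headers, keyed by hash.
    pub headers: HashMap<u64, Header>,
    /// Hashes for which every ancestor down to genesis is stored.
    pub complete: HashSet<u64>,
}

impl Store {
    /// Genesis is trivially complete, so it seeds the marker set.
    pub fn new(genesis: Header) -> Self {
        let mut store = Store { headers: HashMap::new(), complete: HashSet::new() };
        store.complete.insert(genesis.hash);
        store.headers.insert(genesis.hash, genesis);
        store
    }

    pub fn is_complete(&self, hash: u64) -> bool {
        self.complete.contains(&hash)
    }
}

/// Walks from `sync_head` towards genesis. `request_descending(hash)` is
/// assumed to return headers newest first, starting at `hash`.
pub fn sync_headers_backwards<F>(
    store: &mut Store,
    sync_head: u64,
    mut request_descending: F,
) -> Result<(), String>
where
    F: FnMut(u64) -> Vec<Header>,
{
    let mut tail = sync_head; // hash of the next header we still need
    let mut visited = Vec::new();
    while !store.is_complete(tail) {
        // Already stored: step to its parent without asking peers.
        if let Some(known) = store.headers.get(&tail) {
            visited.push(tail);
            tail = known.parent_hash;
            continue;
        }
        let batch = request_descending(tail);
        if batch.is_empty() {
            return Err(format!("no headers received for {tail}"));
        }
        for header in batch {
            if header.hash != tail {
                return Err(format!("header {} does not connect", header.hash));
            }
            visited.push(header.hash);
            tail = header.parent_hash;
            // Keep any header we already had.
            store.headers.entry(header.hash).or_insert(header);
            if store.is_complete(tail) {
                break; // reached a header whose ancestry is fully stored
            }
        }
    }
    // Only mark headers once the walk has actually reached a complete one.
    store.complete.extend(visited);
    Ok(())
}
```

In this sketch, a sync that fails halfway leaves the downloaded headers in the store without marking them complete, so the next attempt walks through them locally and resumes requesting from the lowest one. Once a head is marked complete, syncing a single new block on top of it takes one request instead of a walk back to genesis.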

Change full-sync to download headers from latest to genesis #4717
