Tree-sync friendly lookup sync tests #8592
base: unstable
Conversation
// removed from the da_checker. Note that ALL components are removed from the da_checker
// so when we re-download and process the block we get the error
// MissingComponentsAfterAllProcessed and get stuck.
lookup.reset_requests();
Bug found while testing: lookups may get stuck given this sequence of events.
// sending retry requests to the disconnecting peer.
for sync_request_id in self.network.peer_disconnected(peer_id) {
    self.inject_error(*peer_id, sync_request_id, RPCError::Disconnected);
}
Minor bug: we need to remove the peer from the sync states (e.g. self.block_lookups) and then inject the disconnect events. Otherwise we may send requests to peers that are already disconnected. I don't think there's a risk of sync getting stuck if libp2p rejects sending messages to disconnected peers, but it deserves a fix anyway.
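For illustration only, here is a minimal sketch of that ordering using stub types. This is not the real SyncManager code; the field names and types below are placeholders for the per-peer sync state and in-flight requests.

```rust
// Placeholder types; the real code operates on SyncManager state such as
// block_lookups and injects RPCError::Disconnected.
struct Sync {
    peers: Vec<u64>,            // stand-in for per-peer sync state
    in_flight: Vec<(u64, u64)>, // (peer, request_id) pairs
}

impl Sync {
    fn on_peer_disconnected(&mut self, peer: u64) {
        // 1) Remove the peer from the sync data structures first, so retries
        //    triggered below cannot select it again.
        self.peers.retain(|p| *p != peer);

        // 2) Only then fail its in-flight requests; each failure re-dispatches
        //    the request to a peer that is still connected.
        let failed: Vec<u64> = self
            .in_flight
            .iter()
            .filter(|(p, _)| *p == peer)
            .map(|(_, id)| *id)
            .collect();
        self.in_flight.retain(|(p, _)| *p != peer);
        for _request_id in failed {
            // inject_error(peer, request_id, Disconnected) in the real code
        }
    }
}
```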
This pull request has merge conflicts. Could you please resolve them @dapplion? 🙏
.get_blinded_block(block_root)
.unwrap()
.unwrap_or_else(|| {
    panic!("block root does not exist in external harness {block_root:?}")
This isn't always the "external" harness.
Suggested change:
- panic!("block root does not exist in external harness {block_root:?}")
+ panic!("block root does not exist in harness {block_root:?}")
Done
#[cfg(test)]
#[derive(Debug)]
/// Tuple of `SingleLookupId`, requested block root, awaiting parent block root (if any),
Please update the doc - it's no longer a tuple and the awaiting-parent block root was removed.
Done
beacon_node/network/Cargo.toml
k256 = "0.13.4"
kzg = { workspace = true }
matches = "0.1.8"
paste = "1.0.15"
Suggested change:
- paste = "1.0.15"
+ paste = { workspace = true }
RECENT_FORKS_BEFORE_GLOAS=electra fulu

# List of all recent hard forks. This list is used to set env variables for http_api tests
# Include phase0 to test the code paths in sync that are pre blobs
We already have nightly-tests that run prior fork tests:
#8319
But I just realised it hasn't been activated on the sigp fork because GitHub only runs scheduled workflows from the main branch (stable). We can either wait until the release or open a separate PR to stable to activate this.
Made a PR to activate these nightly tests:
#8636
Could we keep this for network tests only? It's just one extra fork and makes it easy to debug and catch errors. For sync tests we should keep only the forks that add new objects and run only:
- phase0, deneb, fulu
/// Beacon chain harness
harness: BeaconChainHarness<EphemeralHarnessType<E>>,
/// External beacon chain harness to produce blocks that are not imported
external_harness: BeaconChainHarness<EphemeralHarnessType<E>>,
Any reason to have this as a field on TestRig? I see that it's only used in build_chain
Good find! Moved to build_chain
// Inject a Disconnected error on all requests associated with the disconnected peer
// to retry all batches/lookups. Only after removing the peer from the data structures to
// sending retry requests to the disconnecting peer.
Missing word, I think.
Suggested change:
- // sending retry requests to the disconnecting peer.
+ // avoid sending retry requests to the disconnecting peer.
Some required checks have failed. Could you please take a look @dapplion? 🙏
None
// Network / external peers simulated behaviour

async fn simulate(&mut self, complete_strategy: CompleteStrategy) {
Would be good to add some brief docs here on what it does and how to use it.
The names simulate and CompleteStrategy are a bit generic. I thought about NetworkResponseStrategy or PeerBehaviourStrategy, but realised the struct also has configuration that isn't peer-behaviour related, like process_result_conditional and block_imported_while_processing. I can't think of a better name right now though.
Maybe SimulateConfig?
    self
}

fn process_result<F>(mut self, f: F) -> Self
I think we can improve this name to make it obvious that this is implementing the builder pattern rather than actually processing results; we could use the standard with_ prefix.
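As an illustration of the suggested with_ naming, here is a minimal, self-contained sketch; the field and closure types are assumptions, not the actual CompleteStrategy definition.

```rust
// Hypothetical sketch; the real CompleteStrategy fields and closure signature may differ.
struct CompleteStrategy {
    process_result: Option<Box<dyn Fn(usize) -> bool>>,
}

impl CompleteStrategy {
    /// Builder-style setter: stores the closure and returns `self` for chaining,
    /// rather than performing any processing itself.
    fn with_process_result<F>(mut self, f: F) -> Self
    where
        F: Fn(usize) -> bool + 'static,
    {
        self.process_result = Some(Box::new(f));
        self
    }
}
```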
    self
}

fn block_imported_while_processing(mut self, block_root: Hash256) -> Self {
same as above
    &self.validator_keypairs[*validator_index].sk.sign(message),
);
// If disable_crypto is true keep the attestation signature as infinity
if self.chain.config.test_config.disable_crypto {
I think we're missing a ! here?
    &self.validator_keypairs[*validator_index].sk.sign(message),
);
// If disable_crypto is true keep the attestation signature as infinity
if self.chain.config.test_config.disable_crypto {
missing a ! here?
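A toy illustration of the logic flip both comments point at (not Lighthouse code; it just shows why the condition needs the negation to match the quoted comment's intent):

```rust
// Stand-in values: 0 represents the "infinity" signature, `real_sig` a computed one.
fn attestation_signature(disable_crypto: bool, real_sig: u64) -> u64 {
    let infinity = 0;
    // Without the `!`, the real signature would only be applied when crypto is
    // disabled, which is the opposite of the stated intent.
    if !disable_crypto { real_sig } else { infinity }
}
```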
/// Lookup's Id
id: Id,
block_root: Hash256,
max_seen_peers: HashSet<PeerId>,
I think max_ isn't necessary here, or does it mean something other than seen peers?
// Trigger the request
rig.trigger_unknown_block_from_attestation(block_hash, peer_id);
let id = rig.expect_block_lookup_request(block_hash);
/// Assert that sync completes from a GossipUnknownParentBlob / UknownDataColumnParent
typo
Suggested change:
- /// Assert that sync completes from a GossipUnknownParentBlob / UknownDataColumnParent
+ /// Assert that sync completes from a GossipUnknownParentBlob / UnknownDataColumnParent
/// Test added in https://github.com/sigp/lighthouse/commit/84c7d8cc7006a6f1f1bb5729ab222b9f85f72727
/// TODO: This test was added on a very old version of lookup sync. It's unclear if the situation
/// it wants to recreate is possible or problematic in current code. Skipping.
#[ignore]
Is this test failing?
    chain_hash,
    BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(chain_hash)),
);
fn reset_metrics(&mut self) {
reset_metrics seems a bit misleading here, perhaps something like capture_metrics_baseline is more suitable?
let id = self.find_single_lookup_for(self.find_oldest_parent_lookup(chain_hash));
self.single_blob_component_processed(id, result);
fn completed_lookups(&self) -> usize {
    // Substract initial value to allow resetting metrics mid test
typo
Suggested change:
- // Substract initial value to allow resetting metrics mid test
+ // Subtract initial value to allow resetting metrics mid test
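A minimal sketch of the baseline idea behind both comments (stub type, not the actual TestRig code): instead of resetting a global metric, capture its current value and subtract it from later reads.

```rust
struct MetricsBaseline {
    completed_lookups_at_baseline: usize,
}

impl MetricsBaseline {
    /// Record the current counter value; later reads are expressed relative to
    /// this point in the test (hence the "capture_metrics_baseline" suggestion).
    fn capture(current: usize) -> Self {
        Self { completed_lookups_at_baseline: current }
    }

    /// Subtract the baseline so the result counts only lookups completed since
    /// the baseline was captured.
    fn completed_since(&self, current: usize) -> usize {
        current - self.completed_lookups_at_baseline
    }
}
```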
    self.sync_rx_queue.push(ev);
}

// Choose at random which queue to process first
I don't fully understand the purpose of deterministically choosing a random queue to process?
jimmygchen left a comment:
Hey @dapplion
I've done a round of review and added a few comments. There's a logic error in the condition, and there are CI failures that need to be fixed.
Overall I think the approach is good - offloading complexity to TestRig makes individual tests cleaner. I'm slightly concerned about the growing complexity of TestRig itself and how easy it would be to extend in the future, but I guess this may be needed as sync itself is inherently complex.
- Fix logic error: add missing `!` in attestation signing conditions
- Fix typos: Substract -> Subtract, UknownDataColumnParent -> UnknownDataColumnParent
- Rename reset_metrics to capture_metrics_baseline
- Rename max_seen_peers to seen_peers
- Use with_ prefix for builder methods (with_process_result, with_block_imported_while_processing)
- Add documentation to simulate fn and CompleteStrategy
Issue Addressed
Current lookup sync tests are written in an explicit way that assumes how the internals of lookup sync work: a test triggers the lookup and then asserts every intermediate request, response, and processing event explicitly.
This is unnecessarily verbose, and it requires a complete re-write whenever something changes in the internals of lookup sync (this has happened a few times, mostly for deneb and fulu).
What we really want to assert is just the outcome of sync, not each intermediate step.
Proposed Changes
Keep all existing tests and add new cases written in the new style described above. The logic to serve and respond to requests is in the function fn simulate (https://github.com/dapplion/lighthouse/blob/2288a3aeb11164bb1960dc803f41696c984c69ff/beacon_node/network/src/sync/tests/lookups.rs#L301). Its behaviour is configured via CompleteStrategy, where you can set, for example, "respond to BlocksByRoot requests with empty", and via TestConfig. Along the way I found a couple of bugs, which I documented on the diff.
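A hedged sketch of the contrast between the two styles, using a stub rig: only trigger_unknown_block_from_attestation, expect_block_lookup_request, simulate, and CompleteStrategy are names that appear in this PR; the remaining helpers and all bodies are placeholders, not the real TestRig API.

```rust
// Stub rig used only to illustrate the shape of the two test styles.
struct StubRig;
struct CompleteStrategy; // in the real code this configures peer/network behaviour

impl StubRig {
    /// Old style: the test asserts every internal step of lookup sync.
    fn old_style(&mut self) {
        self.trigger_unknown_block_from_attestation();
        let _id = self.expect_block_lookup_request();
        // ...respond to that exact request, assert processing events, and so on...
    }

    /// New style: declare how peers respond, then let `simulate` drive the
    /// request/response loop until sync reaches a terminal state.
    fn new_style(&mut self) {
        self.simulate(CompleteStrategy);
        self.assert_block_imported();
    }

    fn trigger_unknown_block_from_attestation(&mut self) {}
    fn expect_block_lookup_request(&mut self) -> usize { 0 }
    fn simulate(&mut self, _strategy: CompleteStrategy) {}
    fn assert_block_imported(&mut self) {}
}
```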
Review guide
Look at lighthouse/beacon_node/network/src/sync/tests/lookups.rs directly (no diff). Other changes are very minor and should not affect production paths.