sc-client-db: add prefetched indexed transactions support#12086
Draft
skunert wants to merge 21 commits into
Draft
sc-client-db: add prefetched indexed transactions support#12086skunert wants to merge 21 commits into
skunert wants to merge 21 commits into
Conversation
Substrate plumbing required by the cumulus storage-chain-sync block-import wrapper: - sp-database: add multibump (atomic refcount delta) for the indexed-transaction column. - sc-client-db: classify_indexed_extrinsics + IndexedTransactionMeta + apply_index_ops Renew/Insert refcount accounting; surface PREFETCHED_INDEXED_TRANSACTIONS_INTERMEDIATE_KEY and forward prefetched bytes through apply_block. - sc-client-api: re-export the prefetched-transactions intermediate key alongside backend traits. - sc-network/bitswap: gate Prefix + schema re-exports under a new 'test-helpers' feature for downstream test crates; widen Prefix::to_bytes to pub under the feature. - sc-service: wire the prefetched-transactions intermediate through the client's apply_block pipeline. - sp-transaction-storage-proof: add IndexedTransactionInfo + TransactionStorageApi v2 returning per-entry (content_hash, hashing, codec, extrinsic_index).
litep2p's bitswap response decoder reconstructs the response CID by re-hashing block.data with the algorithm declared in the wire prefix (see /home/sebastian/.cargo/registry/src/.../litep2p-0.13.3/src/protocol/libp2p/bitswap/mod.rs:248). For consumers of `request_bitswap_blocks_unverified` that don't know the hash function up front, this re-hashing produces a CID that does not equal the wantlist CID and PendingBatch.record_responses (CID-keyed HashMap lookup) silently dropped every such response. The storage-chain sync fetcher uses placeholder Blake CIDs on the unverified path because the hashing algorithm is only known after the bytes are fetched and locally hashed against each supported algorithm. With CID-keyed matching, every Sha2-256 or Keccak-256 response was filtered out at the litep2p shim layer, causing `request_bitswap_blocks_unverified` to time out and the wrapper to retry forever. Switch PendingBatch's response store from `HashMap<Cid, ResponseType>` to `Vec<ResponseType>` accumulated in arrival order. is_complete() resolves when count == wantlist size; the substrate-side classify_response and classify_response_unverified already handle CID matching (verified path) or positional + prefix matching (unverified path) once the raw responses reach them. Drop the now-unused select_best_response_per_cid helper and its 3 unit tests. Add 4 regression tests (pending_batch_accepts_sha2_response_for_blake_wantlist, pending_batch_accepts_keccak_response_for_blake_wantlist, pending_batch_preserves_response_order_across_algorithms, pending_batch_caps_responses_at_wantlist_size) that fail on the old CID-keyed implementation and pass on the new positional one.
…ions Adds 42 tests in sc-client-db (mod indexed_transaction_backend_tests) covering ref-counting lifecycles and the storage-chain prefetched-renew path against three backends: - kvdb-memorydb (existing test backend) - kvdb-rocksdb (real on-disk via tempdir) - parity-db (real on-disk via tempdir) Layer A (21 tests): adapter-level scenarios calling sp_database::Database directly. Drop+reopen between phases for paritydb/rocksdb to bypass commit_overlay caching and observe true on-disk state. Layer B (21 tests): Backend-level scenarios exercising apply_index_ops and prune lifecycles. Mix of existing prefetched_* scenarios reproduced parametrically plus new scenarios for redundant-prefetch, same-block insert+renew, and sequential prefetched renews. Result: 36 pass, 6 deterministically/flakily fail — all 6 failures on parity-db, documenting: - adapter-level data loss for same-commit store+reference on a fresh hash - refcount leaks through full block lifecycle in redundant-prefetch and same-block insert+renew scenarios New public test helper Backend::new_test_with_tx_storage_source accepts a caller-supplied DatabaseSource so tests can target real backends.
Collapses the 42-test backend matrix from three explicit #[test] wrappers
per scenario into a single #[rstest] fn with three #[case::<name>(...)]
attributes per scenario. Net ~150 LOC reduction.
- New dev-dep: rstest = { workspace = true } (already pinned at workspace
root, used widely across substrate/cumulus/polkadot).
- BackendKind enum + DbFactory and BackendFactory consolidate the three
per-backend run_with_* helpers into one parametric constructor.
- Test names move from suffix style (a1_store_then_get_paritydb) to
rstest namespace (a1_store_then_get::case_2_paritydb). Filterable via
cargo test ::case_2_paritydb for parity-db variants only.
Pass/fail signature unchanged: same 4 deterministic + 2 flaky parity-db
failures, all other backends green.
…lockImportParams field
Contributor
Author
|
/cmd fmt |
Replaces the bare `Vec<([u8; 32], Vec<u8>)>` shape of
`BlockImportParams::prefetched_indexed_transactions` and
`BlockImportOperation::set_prefetched_indexed_transactions` with a dedicated
`PrefetchedIndexedTransactions { ops, renew_payloads }` struct.
- `renew_payloads` retains the existing semantics: `(content_hash, bytes)`
pairs the backend consumes inside `apply_index_ops` when servicing a
runtime-produced `IndexOperation::Renew` whose hash is not yet in the local
TRANSACTION column. No behavioural change for current tip-sync consumers.
- `ops` is new. It carries synthetic `IndexOperation`s that the backend uses
ONLY when the runtime did not call `update_transaction_index` (i.e.
`operation.index_ops` is empty). This is the substrate-side prerequisite for
cumulus storage-chain gap-sync backfill: the wrapper will derive synthetic
Insert ops for store extrinsics (tail bytes already in the body) and
synthetic Renew ops for entries it bitswap-fetches; the backend's
`try_commit_operation` then picks `effective_ops` with a strict precedence
rule -- runtime-produced ops always win when present, synthetic ops are a
pure fallback. No cumulus consumer enabled yet; gating remains in the
upstream cumulus wrapper's `should_intercept` filter.
Includes 8 new sc-client-db unit tests covering the precedence rule:
- `synthetic_ops_take_effect_when_runtime_index_ops_empty`
- `runtime_index_ops_win_over_synthetic`
- `empty_both_falls_back_to_plain_body`
- `synthetic_renew_uses_prefetched_payload`
- `synthetic_renew_without_prefetched_references_existing`
- `synthetic_insert_extracts_tail_from_body`
- `synthetic_insert_oversized_size_falls_back_to_full_extrinsic`
- `multiple_synthetic_ops_per_block_apply_in_order`
The existing 39-test indexed-transaction backend matrix continues to pass.
The test helper `insert_block_with_prefetched` keeps its existing signature
(thin wrapper around `set_prefetched_indexed_transactions` with empty ops).
A new helper `insert_block_with_synthetic_ops` exercises both paths.
|
All GitHub workflows were cancelled due to failure one of the required jobs. |
- prdoc: collapse verbose prose, keep functional descriptions - backend.rs: tighten struct/field/method comments - lib.rs: reduce comment noise in production code and tests
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We need to pass prefetched indexed transaction data from the block import pipeline to the backend so it can be inserted into the
TRANSACTIONcolumn. The flow will include a custom block import that downloads referenced data over the network, the attaches it to the BlockImportParameters. Backend then inserts them into the DB.