perf: increase end-to-end transfer TPS #99

Merged
dev-jodee merged 5 commits into main from feat/transfer-throughput-improvement
Apr 20, 2026
@Huzaifa696 Huzaifa696 commented Apr 17, 2026

Summary

Raises sustained end-to-end TPS by adding a deadline-driven sequencer flush, parallelizing SVM execution over non-conflicting chunks, tightening dedup/sigverify under load, and collapsing the settler's per-row SQL writes into bulk upserts. The same knobs are exposed as env vars so operators can tune the pipeline without code changes. No behavioral changes outside perf + configuration.
Also rewrites the bench load loop as async.

Core

  • Sequencer: batch-flush to executor stage now triggered by a deadline in addition to size; stage metrics extended to cover the new flush paths.
  • Executor: SVM execution parallelized over non-conflicting chunked batches; an order-preserving merge_svm_outputs reassembles results and accumulates TransactionErrorMetrics / ExecuteTimings across chunks. Single-worker setting collapses back to the sequential path.
  • Pipeline hygiene (dedup / sigverify): blockhash ingestion is prioritized in dedup; fairness is ensured in sigverify's cloned receivers.
  • Settler / accounts: per-row account and transaction writes in write_batch.rs replaced with bulk upserts; settler select! made biased so shutdown and blocktime ticks win over result draining.
  • Postgres pool: PostgresAccountsDB now resolves pool size from config with a ceiling.
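
A minimal, std-only sketch of the "size or deadline" flush rule described for the sequencer. It is not the code in sequencer.rs: `collect_batch` and `FlushReason` are hypothetical names, and a blocking `std::sync::mpsc` channel stands in for the async channels the PR uses. The assumption here is that the deadline clock starts when the first item of a batch arrives.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Why a batch was handed to the next stage (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum FlushReason {
    Size,
    Deadline,
    Closed,
}

/// Collects items until `max_batch` is reached or `deadline` has elapsed
/// since the first item arrived, whichever happens first.
/// Assumes `max_batch >= 1`.
pub fn collect_batch<T>(
    rx: &Receiver<T>,
    max_batch: usize,
    deadline: Duration,
) -> (Vec<T>, FlushReason) {
    let mut batch = Vec::with_capacity(max_batch);

    // Block for the first item: an empty batch has no deadline to miss.
    match rx.recv() {
        Ok(item) => batch.push(item),
        Err(_) => return (batch, FlushReason::Closed),
    }

    let flush_at = Instant::now() + deadline;
    while batch.len() < max_batch {
        // Wait only for the time left before the deadline.
        let remaining = flush_at.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(item) => batch.push(item),
            Err(RecvTimeoutError::Timeout) => return (batch, FlushReason::Deadline),
            Err(RecvTimeoutError::Disconnected) => return (batch, FlushReason::Closed),
        }
    }
    (batch, FlushReason::Size)
}
```

Under light load the deadline bounds how long a partial batch waits; under heavy load the size limit fires first and the deadline never matters.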

Configuration

New knobs: CONTRA_BATCH_DEADLINE_MS (sequencer flush deadline), CONTRA_MAX_SVM_WORKERS (executor parallelism), CONTRA_PG_MAX_CONNECTIONS (pool size, default 32, ceiling 256) — mirrored in .env.* and docker-compose*.yml.
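
For reviewers who want the default-plus-ceiling rule spelled out, here is a rough sketch of how CONTRA_PG_MAX_CONNECTIONS could be resolved. The constant names and the `pool_size_from_env` wrapper are illustrative, and treating `0` or unparsable input as "use the default" is an assumption; the actual `resolve_pool_size()` in postgres.rs may handle those cases differently.

```rust
/// Default and ceiling as stated in the PR description.
pub const DEFAULT_PG_MAX_CONNECTIONS: u32 = 32;
pub const PG_MAX_CONNECTIONS_CEILING: u32 = 256;

/// Resolves the pool size from the raw env value.
/// Assumption: missing, unparsable, or zero values fall back to the default.
pub fn resolve_pool_size(raw: Option<&str>) -> u32 {
    raw.and_then(|s| s.trim().parse::<u32>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_PG_MAX_CONNECTIONS)
        .min(PG_MAX_CONNECTIONS_CEILING)
}

/// Hypothetical convenience wrapper that reads the env var named in the PR.
pub fn pool_size_from_env() -> u32 {
    resolve_pool_size(std::env::var("CONTRA_PG_MAX_CONNECTIONS").ok().as_deref())
}
```

Keeping the parsing in a pure function that takes `Option<&str>` makes the edge cases testable without mutating the process environment.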

bench-tps

  • Load phase rewritten as async (tokio mpsc + join_all) so a single driver saturates the RPC endpoint instead of blocking per-tx.
  • args.rs / main.rs wire the new async flow; run.sh and .env.sample(.devnet) pick up the new knobs.

Infra

  • docker-compose.yml and docker-compose.devnet.yml raise container nofile soft/hard limits to 65536 and mirror the new env vars.

Result

TPS Before: ~1600
TPS After: ~7000
[Image: Screenshot from 2026-04-18 04-58-35]

Coverage Report

| Component | Lines Hit | Lines Total | Coverage | Artifact |
| --- | --- | --- | --- | --- |
| Core | 7,993 | 9,450 | 84.6% | rust-unit-coverage-reports |
| Indexer | 13,462 | 15,715 | 85.7% | rust-unit-coverage-reports |
| Gateway | 952 | 1,076 | 88.5% | rust-unit-coverage-reports |
| Auth | 541 | 596 | 90.8% | rust-unit-coverage-reports |
| Withdraw Program | 118 | 230 | 51.3% | unit-coverage-reports |
| Escrow Program | 1,170 | 1,951 | 60.0% | unit-coverage-reports |
| E2E Integration | 8,049 | 11,853 | 67.9% | e2e-coverage-reports |
| Total | 32,285 | 40,871 | 79.0% | |

Last updated: 2026-04-18 00:16:26 UTC by Withdraw Program

@Huzaifa696 Huzaifa696 self-assigned this Apr 17, 2026

greptile-apps Bot commented Apr 17, 2026

Greptile Summary

This PR delivers three interconnected throughput improvements: (1) rewrites the bench-tps load phase as fully async using tokio mpsc + join_all per-batch concurrency, (2) replaces per-row SQL writes in the settler with bulk UNNEST upserts that collapse hundreds of round-trips into 2–3 queries per slot, and (3) adds optional intra-batch parallel SVM execution via std::thread::scope workers. New env vars expose CONTRA_PG_MAX_CONNECTIONS, CONTRA_BATCH_DEADLINE_MS, CONTRA_BATCH_CHANNEL_CAPACITY, and CONTRA_MAX_SVM_WORKERS as tuning knobs; tokio-mpmc is replaced throughout with async-channel.

  • The parallel SVM path in execution.rs calls the blocking execute_parallel (which uses std::thread::scope) directly from an async fn that runs on a tokio worker thread. Without tokio::task::block_in_place, this occupies the worker thread for the full SVM execution window and prevents the runtime from scheduling other async tasks on it during that time.
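
To make the shape of the parallel path concrete, here is a std-only sketch of chunked execution with `std::thread::scope` and an order-preserving merge. It is not the PR's `execute_parallel` or `merge_svm_outputs`: the generic signature is invented for illustration, the caller is assumed to pass only non-conflicting items, and the accumulation of error metrics and timings across chunks is left out.

```rust
use std::thread;

/// Runs `exec` over `items` on up to `max_workers` scoped threads and
/// returns the results in the same order as the input.
pub fn execute_parallel<T, R, F>(items: &[T], max_workers: usize, exec: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    // Never spawn more workers than there are items, and never fewer than one.
    let workers = max_workers.max(1).min(items.len().max(1));
    if workers == 1 {
        // Single-worker setting collapses back to the sequential path.
        return items.iter().map(&exec).collect();
    }

    // Ceiling division so every item lands in exactly one contiguous chunk.
    let chunk_len = (items.len() + workers - 1) / workers;
    let exec = &exec;

    thread::scope(|s| {
        // Spawn every chunk first so the workers actually run concurrently.
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(exec).collect::<Vec<R>>()))
            .collect();

        // Joining in spawn order keeps the merged output in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

Because `thread::scope` blocks the calling thread until all workers finish, calling a function like this from an async task is exactly the situation the finding above describes.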

Confidence Score: 4/5

Safe to merge after addressing the block_in_place gap in the parallel SVM path; all other changes are well-structured and well-tested.

One P1 finding: execute_parallel blocks a tokio worker thread via std::thread::scope without block_in_place, which can starve the async pipeline under sustained load. The fix is a one-line change. All remaining findings are P2 style/robustness suggestions. The bulk of the PR is solid.

core/src/stages/execution.rs — the execute_parallel call site in execute_batch needs tokio::task::block_in_place

Important Files Changed

| Filename | Overview |
| --- | --- |
| core/src/stages/execution.rs | Adds parallel SVM execution via std::thread::scope, but calls it directly from an async fn / tokio task without block_in_place, which blocks a tokio worker thread during execution. Also adds latency metrics and batching knobs; overall structure is sound. |
| bench-tps/src/load.rs | Rewrites sender path to async with join_all for per-batch concurrency. Shares the mpsc::Receiver behind an Arc<Mutex>, which delays cancellation for tasks blocked on the mutex; otherwise functionally correct. |
| core/src/accounts/write_batch.rs | Replaces per-row inserts with bulk UNNEST upserts and pre-serializes all data before opening the Postgres transaction. Deduplication invariant is met by callers; minor inconsistency in missing ::bytea[] cast on the DELETE query. |
| core/src/stages/sequencer.rs | Switches to bounded batch channel, adds configurable deadline-based batching, and introduces a non-blocking flush path for shutdown. Logic and tests look correct. |
| core/src/stages/settle.rs | Adds MissedTickBehavior::Delay, biased select for shutdown priority, final-flush on loop exit, and settler latency metrics. All changes look safe and well-tested. |
| core/src/stages/dedup.rs | Migrates from tokio_mpmc to async_channel::bounded; adds ingest_blockhashes helper to keep the live-hash window current even when backpressure stalls the output channel. Logic is sound. |
| core/src/vm/gasless_callback.rs | Adds SnapshotCallback, a thread-safe, owned-HashMap read-only snapshot of BOB for parallel SVM workers. Correctly implements TransactionProcessingCallback and has good unit tests. |
| core/src/accounts/bob.rs | Defers the O(N) eviction sweep to every GC_EVICTION_INTERVAL batches; preload now returns (fetched, cached) stats; tests updated to force the eviction path. All changes are safe. |
| core/src/stage_metrics.rs | Adds latency histogram methods to StageMetrics for executor and settler phases; both NoopMetrics and PrometheusMetrics implement all new methods. |
| core/src/stages/sigverify.rs | Replaces tokio_mpmc with async_channel::bounded; API differences handled correctly. New drain and fairness tests are well-structured. |
| core/src/accounts/postgres.rs | Adds resolve_pool_size() with env-var parsing, default fallback, and ceiling clamp. Comprehensive unit tests cover all edge cases. |
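
As a rough illustration of the bulk UNNEST upsert pattern mentioned for write_batch.rs, the sketch below turns rows into one array per column and pairs them with a single statement. The `accounts` table, its columns, `AccountRow`, and `to_columns` are all assumptions made for the example, not the PR's schema or code. It also deduplicates inside the helper, whereas the review notes that in the PR the callers uphold that invariant. Deduplication matters because Postgres rejects an `ON CONFLICT DO UPDATE` that would touch the same row twice in one statement.

```rust
use std::collections::HashMap;

/// Hypothetical row shape; the real schema is not shown in this PR page.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AccountRow {
    pub pubkey: Vec<u8>,
    pub data: Vec<u8>,
}

/// Illustrative statement: one round-trip upserts every row in the batch.
/// Table and column names are assumptions.
pub const UPSERT_ACCOUNTS_SQL: &str = "\
INSERT INTO accounts (pubkey, data) \
SELECT * FROM UNNEST($1::bytea[], $2::bytea[]) \
ON CONFLICT (pubkey) DO UPDATE SET data = EXCLUDED.data";

/// Converts rows into per-column arrays for UNNEST, keeping only the last
/// write for each key while preserving first-seen key order.
pub fn to_columns(rows: Vec<AccountRow>) -> (Vec<Vec<u8>>, Vec<Vec<u8>>) {
    let mut index: HashMap<Vec<u8>, usize> = HashMap::new();
    let mut keys: Vec<Vec<u8>> = Vec::new();
    let mut datas: Vec<Vec<u8>> = Vec::new();

    for row in rows {
        let existing = index.get(&row.pubkey).copied();
        match existing {
            // Key already seen in this batch: the later write wins.
            Some(i) => datas[i] = row.data,
            None => {
                index.insert(row.pubkey.clone(), keys.len());
                keys.push(row.pubkey);
                datas.push(row.data);
            }
        }
    }
    (keys, datas)
}
```

The two returned vectors would be bound as `$1` and `$2` by whichever Postgres client the crate uses; that binding step is omitted here because the client library is not named in this page.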

Sequence Diagram

sequenceDiagram
    participant RPC as RPC Handler
    participant Dedup as Dedup (async_channel bounded)
    participant SigV as Sigverify Pool (async_channel bounded)
    participant Seq as Sequencer (mpsc bounded)
    participant Exec as Executor (tokio task)
    participant SVM as SVM Workers (std::thread::scope)
    participant Settle as Settler (tick-driven)
    participant PG as Postgres (bulk UNNEST)

    RPC->>Dedup: SanitizedTransaction
    Dedup->>SigV: forward (bounded, backpressure)
    SigV->>Seq: verified tx (unbounded mpsc)
    Seq->>Seq: deadline batch collection
    Seq->>Exec: ConflictFreeBatch (bounded channel)
    Exec->>SVM: execute_parallel (blocks tokio thread)
    SVM-->>Exec: merged LoadAndExecuteOutput
    Exec->>Settle: (output, txs) via unbounded mpsc
    Settle->>Settle: accumulate until blocktime tick
    Settle->>PG: write_batch (bulk UNNEST upsert)
    Settle->>Dedup: settled_blockhash
    Settle->>Exec: settled_accounts (BOB update)

Reviews (1): Last reviewed commit: "perf(throughput): async bench, bulk-SQL ..."

Comment thread core/src/stages/execution.rs Outdated
Comment thread bench-tps/src/load.rs Outdated
Comment thread core/src/accounts/write_batch.rs
Huzaifa696 added a commit that referenced this pull request Apr 17, 2026
  - execution: wrap execute_parallel in block_in_place to avoid
    stalling the tokio worker during thread::scope.
  - bench-tps: swap Arc<Mutex<mpsc::Receiver>> for async_channel so
    cancellation hits every sender task directly.
  - accounts: add explicit ::bytea[] cast to bulk DELETE query.
@Huzaifa696 Huzaifa696 requested review from amilz and dev-jodee April 17, 2026 22:58
  - Rewrite bench-tps load phase as async (tokio mpsc + join_all)
  - Replace per-row account/tx writes with bulk UNNEST upserts
  - Drop settler's in-memory tx_count; same-tx read-modify-write
  - Expose CONTRA_PG_MAX_CONNECTIONS (32, cap 256), CONTRA_BATCH_DEADLINE_MS, CONTRA_MAX_SVM_WORKERS
  - Raise container nofile soft/hard to 65536
  - Mirror new env vars and ulimits in docker-compose.devnet.yml
  - execution: wrap execute_parallel in block_in_place to avoid
    stalling the tokio worker during thread::scope.
  - bench-tps: swap Arc<Mutex<mpsc::Receiver>> for async_channel so
    cancellation hits every sender task directly.
  - accounts: add explicit ::bytea[] cast to bulk DELETE query.
  async_channel guarantees no fairness, so the /2 floor flaked on CI.
  Assert each consumer received >0 — enough to prove MPMC fan-out.
  The SVM's AccountLoader treats cached accounts with lamports=0 as
  deallocated and returns None on subsequent loads in the same batch.
  BOB's system_program and rent sysvar precompiles, plus admin-created
  mints, all had lamports=0 — breaking ATA creation when multiple txs
  landed in one batch (exposed by batch_deadline_ms coalescing).
@Huzaifa696 Huzaifa696 force-pushed the feat/transfer-throughput-improvement branch from d1617c8 to 41ea473 Compare April 17, 2026 23:44
@Huzaifa696 Huzaifa696 changed the title perf(throughput): async bench, bulk-SQL settler, tunable node knobs perf: increase end-to-end transfer TPS Apr 18, 2026
@dev-jodee dev-jodee merged commit 51bb875 into main Apr 20, 2026
10 checks passed
@dev-jodee dev-jodee deleted the feat/transfer-throughput-improvement branch April 20, 2026 11:57