@bhartnett bhartnett commented Oct 14, 2025

This PR updates the KVT txFrame API to support multiple column families and also introduces and uses separate column families for contract code and witnesses.

For performance reasons, the kvts are stored in an array indexed by the column family enum type. The KvtCfs enum type is renamed to KvtType and moved so that it can be exposed as part of the CoreDb API in the base module. When no KvtType is specified, the default KvtType.Generic is used.
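
To illustrate the array-indexed layout described above, here is a minimal Python sketch (not the actual Nim implementation). Only `KvtType` and `KvtType.Generic` come from this PR; the other enum members, the `TxFrame` class, and its `put`/`get` methods are hypothetical names chosen for the example.

```python
from enum import IntEnum


class KvtType(IntEnum):
    # Only Generic is named in the PR; the other members are hypothetical.
    Generic = 0
    ContractCode = 1
    Witness = 2


class TxFrame:
    """Hypothetical frame holding one key-value table per column family."""

    def __init__(self):
        # One dict per KvtType, stored in a list indexed by the enum value,
        # so selecting a table is a plain index lookup.
        self.kvts = [dict() for _ in KvtType]

    def put(self, key, value, kvt_type=KvtType.Generic):
        self.kvts[kvt_type][key] = value

    def get(self, key, kvt_type=KvtType.Generic):
        return self.kvts[kvt_type].get(key)


frame = TxFrame()
frame.put(b"key", b"generic value")                     # defaults to Generic
frame.put(b"key", b"code bytes", KvtType.ContractCode)  # separate table
```

The same key can live in two tables without colliding, and callers that never pass a type keep working against the generic table.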

With column families, the data is partitioned into a separate key space, so the DBKeyKind key mapping functions in the storage_types module are no longer required for contract code and witness reads and writes.
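
The difference between the two keying schemes can be sketched as follows. This is a simplified Python model, not the project's code: the one-byte prefix value, the `contract_code_key` helper, and the dict-based stores are all assumptions made for illustration.

```python
# Hypothetical one-byte tag standing in for a DBKeyKind-style prefix.
CONTRACT_CODE_PREFIX = b"\x01"


def contract_code_key(code_hash):
    """Prefix strategy: map a code hash into the shared key space."""
    return CONTRACT_CODE_PREFIX + code_hash


code_hash = b"\xaa" * 32
code = b"\x60\x00"

# Single shared table: every key kind needs a distinguishing prefix.
shared_table = {}
shared_table[contract_code_key(code_hash)] = code

# One table per column family: the raw hash can be used as the key,
# because other key kinds live in a different table.
column_families = {"generic": {}, "contract_code": {}}
column_families["contract_code"][code_hash] = code

from_prefix = shared_table.get(contract_code_key(code_hash))
from_family = column_families["contract_code"].get(code_hash)
```

Both lookups return the same bytes; the column family variant simply drops the key mapping step.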

This changes the on-disk structure of the database and is therefore not backwards compatible: once this change is merged, all nodes will require a full re-sync.

@bhartnett
Contributor Author

I ran a block import benchmark on the first 10 million mainnet blocks. Here are the results:

master.csv vs cfs.csv
                        bps_x     bps_y      tps_x      tps_y    time_x    time_y    bpsd    tpsd   timed
block_number                                                                                             
(499713, 1555300]    6,182.15  6,284.72  21,221.80  21,462.51     3m12s     3m10s   1.88%   1.88%  -0.95%
(1555300, 2610888]   3,044.54  3,039.45  22,303.01  22,282.89    21m44s    21m47s   0.27%   0.27%   0.21%
(2610888, 3666475]   2,829.84  2,833.75  25,983.09  25,963.67    13m49s    13m55s   0.13%   0.13%   0.24%
(3666475, 4722063]     423.39    419.01  22,901.83  22,699.26     58m0s    58m25s  -0.85%  -0.85%   0.87%
(4722063, 5777650]     129.25    127.46  17,984.94  17,746.14  2h18m12s   2h20m7s  -1.37%  -1.37%   1.40%
(5777650, 6833238]     125.00    124.14  12,418.48  12,334.55   2h22m5s   2h23m1s  -0.67%  -0.67%   0.67%
(6833238, 7888825]     120.04    118.71  12,230.86  12,093.94   2h27m1s  2h28m40s  -1.11%  -1.11%   1.12%
(7888825, 8944413]     110.33    109.30  12,485.44  12,366.73  2h40m55s  2h42m25s  -0.93%  -0.93%   0.94%
(8944413, 10000001]     92.63     92.08   9,825.64   9,772.01  3h14m47s  3h15m48s  -0.54%  -0.54%   0.58%

blocks: 9492096, baseline: 14h39m48s, contender: 14h47m22s
Time (total): 7m33s, 0.86%

bpsd = blocks per sec diff (+), tpsd = txs per sec diff (+), timed = time to process diff (-)
+ = more is better, - = less is better
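
As a sanity check on the totals, the overall time difference can be recomputed from the rounded baseline and contender figures. The sketch below is not the benchmark script itself; `parse_duration` is a hypothetical helper written for this example. From the rounded inputs the difference is 454 seconds (7m34s), one second off the reported 7m33s, presumably because the tool works from unrounded timings.

```python
import re


def parse_duration(text):
    """Parse a duration such as '14h39m48s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


baseline = parse_duration("14h39m48s")
contender = parse_duration("14h47m22s")

diff_seconds = contender - baseline           # positive: contender is slower
diff_percent = 100 * diff_seconds / baseline  # relative to the baseline

print(f"{diff_seconds}s slower, {diff_percent:.2f}%")
```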

Unfortunately, there doesn't appear to be any performance improvement from this change. Perhaps the overhead of the additional kvts outweighs any potential speed-up from using the additional column families.

@arnetheduck
Member

arnetheduck commented Oct 18, 2025

Additional column families are expensive to manage and cause the WAL to expand. In general, "common prefixes" in keys are almost free (the prefix is stored separately), so the prefix strategy is preferable unless different column family options are needed, or in a few other special cases (such as txframe lifetime).
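
One common way sorted key-value storage makes shared prefixes cheap is delta encoding: each key stores only the number of leading bytes it shares with the previous key plus the remaining suffix. The Python sketch below illustrates that general idea only; it is an assumption about the kind of mechanism meant here, not a description of any specific database's on-disk format, and all function names are hypothetical.

```python
def shared_prefix_len(a, b):
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def delta_encode(sorted_keys):
    """Encode each key as (bytes shared with previous key, suffix)."""
    encoded = []
    prev = b""
    for key in sorted_keys:
        shared = shared_prefix_len(prev, key)
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def delta_decode(encoded):
    """Rebuild the original keys from (shared, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


keys = [b"\x01aaa", b"\x01aab", b"\x01abc"]
encoded = delta_encode(keys)
stored_bytes = sum(len(suffix) for _, suffix in encoded)
```

In this toy example the one-byte kind prefix is written once, in the first key; later keys store only their differing suffix.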

@arnetheduck
Member

Contract code is interesting in that its lifetime properties differ slightly from those of other transaction data: because it is shared between accounts, its lifetime has to be managed somewhat differently from that of block data, for example.

