Commit b9bf24c

Merge bitcoin/bitcoin#34616: Cluster mempool: SFL cost model (take 2)
744d47f clusterlin: adopt trained cost model (feature) (Pieter Wuille)
4eefdfc clusterlin: rescale costs (preparation) (Pieter Wuille)
ecc9a84 clusterlin: use 'cost' terminology instead of 'iters' (refactor) (Pieter Wuille)
9e7129d clusterlin: introduce CostModel class (preparation) (Pieter Wuille)

Pull request description:

Part of #30289, replaces earlier #34138.

This introduces a more accurate cost model for SFL, to control how much CPU time is spent inside the algorithm on clusters that cannot be linearized perfectly within a reasonable amount of time. The goal is to have a metric for the amount of work performed, so that txmempool can impose limits on that work: a lower bound that is always performed (unless optimality is reached before that point, of course), and an upper bound to limit the latency and total CPU time spent.

There are conflicting design goals here:

* On the one hand, it is ideal if this metric correlates closely with actual CPU time, because otherwise the limits become inaccurate.
* On the other hand, it would be a nightmare for the metric to be platform/system dependent, as that makes network-wide reasoning nearly impossible. It is expected that slower systems take longer to do the same thing; this holds for everything, and we do not need to compensate for it.

There are multiple solutions to this:

* One extreme is just measuring the time. This is very accurate, but extremely platform dependent, and also non-deterministic due to random scheduling/cache effects.
* The other extreme is using a very abstract metric, like counting how many times certain loops/functions inside the algorithm run. That is what is implemented in master right now: just counting the sum of the numbers of transactions updated across all `UpdateChunks()` calls. It necessarily fails to account for significant portions of runtime spent elsewhere, resulting in a rather wide range of "ns per cost" values.
* This PR takes a middle ground, counting many function calls / branches / loops, with weights determined through benchmarking, averaged over a number of systems.

Specifically, the cost model was obtained by:

* For a variety of machines:
  * Run a fixed collection of ~385000 clusters found through random generation and fuzzing, optimizing for difficulty of linearization.
  * Linearize each 1000-5000 times, with different random seeds; sometimes without an input linearization, sometimes with a bad one.
  * Gather cycle counts for each of the operations included in this cost model, broken down by their parameters.
  * Correct the data by subtracting the runtime of obtaining the cycle count.
  * Drop the top and bottom 5% of samples from each cycle count dataset, and compute the average of the remaining samples.
  * For each operation, fit a least-squares linear function approximation through the samples.
* Rescale all machine expressions to make their total time match, as we only care about the relative cost of each operation.
* Take the per-operation average of operation expressions across all machines, to construct expressions for an average machine.
* Approximate the result with integer coefficients.

The benchmarks were performed by `l0rinc <pap.lorinc@gmail.com>` and myself, on AMD Ryzen 5950X, AMD Ryzen 7995WX, AMD Ryzen 9980X, Apple M4 Max, Intel Core i5-12500H, Intel Core Ultra 7 155H, Intel N150 (Umbrel), Intel Core i7-7700, Intel Core i9-9900K, Intel Haswell (VPS, virtualized), Intel Xeon E5-2637, ARM Cortex-A76 (Raspberry Pi 5), and ARM Cortex-A72 (Raspberry Pi 4). Based on final benchmarking, the "acceptable" cost (the minimum amount of work spent on every cluster) is set to 75000 units, which corresponds to roughly 50 μs on a Ryzen 5950X and similar modern desktop hardware.
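The middle-ground approach can be sketched as a small accumulator class. This is a hypothetical illustration (names and coefficients are made up, not the PR's actual `CostModel`): each instrumented operation adds a fitted integer cost, possibly scaled by a parameter such as the number of transactions touched.

```cpp
#include <cstdint>

// Hypothetical sketch of a weighted cost model, not the PR's actual class.
// The coefficients below are invented for illustration only.
class SketchCostModel
{
    uint64_t m_cost{0};

public:
    // Fixed cost for one call of some instrumented function.
    void CountCall() { m_cost += 7; }
    // Cost affine in the number of transactions updated (an a + b*n fit).
    void CountUpdateChunks(uint64_t num_tx) { m_cost += 12 + 5 * num_tx; }
    // Accumulated work so far, in abstract cost units.
    uint64_t GetCost() const { return m_cost; }
    // Callers stop working once the accumulated cost exceeds their budget.
    bool Exceeded(uint64_t max_cost) const { return m_cost > max_cost; }
};
```

A caller would compare `GetCost()` against its budget between units of work, stopping early once the budget is exhausted; this is what makes the metric usable as both a lower and an upper bound.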
ACKs for top commit:
  instagibbs: ACK 744d47f
  murchandamus: reACK 744d47f

Tree-SHA512: 5cb37a6bdd930389937c435f910410c3581e53ce609b9b594a8dc89601e6fca6e6e26216e961acfe9540581f889c14bf289b6a08438a2d7adafd696fc81ff517
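The per-operation fitting step described above ("fit a least-squares linear function approximation through the samples") amounts to ordinary least squares on (parameter, average-cycle-count) pairs. A minimal sketch, where `FitAffine` is a hypothetical helper rather than code from the PR:

```cpp
#include <utility>
#include <vector>

// Illustrative only: fit cost(x) = a + b*x through (parameter, avg_cycles)
// samples by solving the 2x2 normal equations of ordinary least squares.
std::pair<double, double> FitAffine(const std::vector<std::pair<double, double>>& samples)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    const double m = samples.size();
    for (const auto& [x, y] : samples) {
        sx += x;
        sy += y;
        sxx += x * x;
        sxy += x * y;
    }
    const double b = (m * sxy - sx * sy) / (m * sxx - sx * sx); // slope
    const double a = (sy - b * sx) / m;                         // intercept
    return {a, b};
}
```

Per the description, the fitted real-valued expressions are then rescaled across machines, averaged, and finally approximated with integer coefficients.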
2 parents 4035231 + 744d47f commit b9bf24c

File tree

12 files changed: +246 −112 lines

src/bench/cluster_linearize.cpp

Lines changed: 3 additions & 3 deletions
@@ -55,7 +55,7 @@ void BenchLinearizeOptimallyTotal(benchmark::Bench& bench, const std::string& na
     // Benchmark the total time to optimal.
     uint64_t rng_seed = 0;
     bench.name(bench_name).run([&] {
-        auto [_lin, optimal, _cost] = Linearize(depgraph, /*max_iterations=*/10000000, rng_seed++, IndexTxOrder{});
+        auto [_lin, optimal, _cost] = Linearize(depgraph, /*max_cost=*/10000000, rng_seed++, IndexTxOrder{});
         assert(optimal);
     });
 }
@@ -72,15 +72,15 @@ void BenchLinearizeOptimallyPerCost(benchmark::Bench& bench, const std::string&
     // Determine the cost of 100 rng_seeds.
     uint64_t total_cost = 0;
     for (uint64_t iter = 0; iter < 100; ++iter) {
-        auto [_lin, optimal, cost] = Linearize(depgraph, /*max_iterations=*/10000000, /*rng_seed=*/iter, IndexTxOrder{});
+        auto [_lin, optimal, cost] = Linearize(depgraph, /*max_cost=*/10000000, /*rng_seed=*/iter, IndexTxOrder{});
         total_cost += cost;
     }

     // Benchmark the time per cost.
     bench.name(bench_name).unit("cost").batch(total_cost).run([&] {
         uint64_t recompute_cost = 0;
         for (uint64_t iter = 0; iter < 100; ++iter) {
-            auto [_lin, optimal, cost] = Linearize(depgraph, /*max_iterations=*/10000000, /*rng_seed=*/iter, IndexTxOrder{});
+            auto [_lin, optimal, cost] = Linearize(depgraph, /*max_cost=*/10000000, /*rng_seed=*/iter, IndexTxOrder{});
             assert(optimal);
             recompute_cost += cost;
         }

src/bench/txgraph.cpp

Lines changed: 3 additions & 3 deletions
@@ -51,9 +51,9 @@ void BenchTxGraphTrim(benchmark::Bench& bench)
     static constexpr int NUM_DEPS_PER_BOTTOM_TX = 100;
     /** Set a very large cluster size limit so that only the count limit is triggered. */
     static constexpr int32_t MAX_CLUSTER_SIZE = 100'000 * 100;
-    /** Set a very high number for acceptable iterations, so that we certainly benchmark optimal
+    /** Set a very high number for acceptable cost, so that we certainly benchmark optimal
      * linearization. */
-    static constexpr uint64_t NUM_ACCEPTABLE_ITERS = 100'000'000;
+    static constexpr uint64_t HIGH_ACCEPTABLE_COST = 100'000'000;

     /** Refs to all top transactions. */
     std::vector<TxGraph::Ref> top_refs;
@@ -65,7 +65,7 @@ void BenchTxGraphTrim(benchmark::Bench& bench)
     std::vector<size_t> top_components;

     InsecureRandomContext rng(11);
-    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, NUM_ACCEPTABLE_ITERS, PointerComparator);
+    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, HIGH_ACCEPTABLE_COST, PointerComparator);

     // Construct the top chains.
     for (int chain = 0; chain < NUM_TOP_CHAINS; ++chain) {

src/cluster_linearize.h

Lines changed: 131 additions & 18 deletions
Large diffs are not rendered by default.

src/test/cluster_linearize_tests.cpp

Lines changed: 8 additions & 2 deletions
@@ -88,9 +88,15 @@ void TestOptimalLinearization(std::span<const uint8_t> enc, std::initializer_lis
             is_topological = false;
             break;
         }
-        std::tie(lin, opt, cost) = Linearize(depgraph, 1000000000000, rng.rand64(), IndexTxOrder{}, lin, is_topological);
+        std::tie(lin, opt, cost) = Linearize(
+            /*depgraph=*/depgraph,
+            /*max_cost=*/1000000000000,
+            /*rng_seed=*/rng.rand64(),
+            /*fallback_order=*/IndexTxOrder{},
+            /*old_linearization=*/lin,
+            /*is_topological=*/is_topological);
         BOOST_CHECK(opt);
-        BOOST_CHECK(cost <= MaxOptimalLinearizationIters(depgraph.TxCount()));
+        BOOST_CHECK(cost <= MaxOptimalLinearizationCost(depgraph.TxCount()));
         SanityCheck(depgraph, lin);
         BOOST_CHECK(std::ranges::equal(lin, optimal_linearization));
     }

src/test/fuzz/cluster_linearize.cpp

Lines changed: 18 additions & 11 deletions
@@ -984,7 +984,7 @@ FUZZ_TARGET(clusterlin_sfl)

     // Verify that optimality is reached within an expected amount of work. This protects against
     // hypothetical bugs that hugely increase the amount of work needed to reach optimality.
-    assert(sfl.GetCost() <= MaxOptimalLinearizationIters(depgraph.TxCount()));
+    assert(sfl.GetCost() <= MaxOptimalLinearizationCost(depgraph.TxCount()));

     // The result must be as good as SimpleLinearize.
     auto [simple_linearization, simple_optimal] = SimpleLinearize(depgraph, MAX_SIMPLE_ITERATIONS / 10);
@@ -1011,16 +1011,17 @@ FUZZ_TARGET(clusterlin_linearize)
 {
     // Verify the behavior of Linearize().

-    // Retrieve an RNG seed, an iteration count, a depgraph, and whether to make it connected from
-    // the fuzz input.
+    // Retrieve an RNG seed, a maximum amount of work, a depgraph, and whether to make it connected
+    // from the fuzz input.
     SpanReader reader(buffer);
     DepGraph<TestBitSet> depgraph;
     uint64_t rng_seed{0};
-    uint64_t iter_count{0};
+    uint64_t max_cost{0};
     uint8_t flags{7};
     try {
-        reader >> VARINT(iter_count) >> Using<DepGraphFormatter>(depgraph) >> rng_seed >> flags;
+        reader >> VARINT(max_cost) >> Using<DepGraphFormatter>(depgraph) >> rng_seed >> flags;
     } catch (const std::ios_base::failure&) {}
+    if (depgraph.TxCount() <= 1) return;
     bool make_connected = flags & 1;
     // The following 3 booleans have 4 combinations:
     // - (flags & 6) == 0: do not provide input linearization.
@@ -1043,8 +1044,14 @@ FUZZ_TARGET(clusterlin_linearize)
     }

     // Invoke Linearize().
-    iter_count &= 0x7ffff;
-    auto [linearization, optimal, cost] = Linearize(depgraph, iter_count, rng_seed, IndexTxOrder{}, old_linearization, /*is_topological=*/claim_topological_input);
+    max_cost &= 0x3fffff;
+    auto [linearization, optimal, cost] = Linearize(
+        /*depgraph=*/depgraph,
+        /*max_cost=*/max_cost,
+        /*rng_seed=*/rng_seed,
+        /*fallback_order=*/IndexTxOrder{},
+        /*old_linearization=*/old_linearization,
+        /*is_topological=*/claim_topological_input);
     SanityCheck(depgraph, linearization);
     auto chunking = ChunkLinearization(depgraph, linearization);

@@ -1056,8 +1063,8 @@ FUZZ_TARGET(clusterlin_linearize)
         assert(cmp >= 0);
     }

-    // If the iteration count is sufficiently high, an optimal linearization must be found.
-    if (iter_count > MaxOptimalLinearizationIters(depgraph.TxCount())) {
+    // If the maximum amount of work is sufficiently high, an optimal linearization must be found.
+    if (max_cost > MaxOptimalLinearizationCost(depgraph.TxCount())) {
         assert(optimal);
     }

@@ -1145,7 +1152,7 @@ FUZZ_TARGET(clusterlin_linearize)

     // Redo from scratch with a different rng_seed. The resulting linearization should be
     // deterministic, if both are optimal.
-    auto [linearization2, optimal2, cost2] = Linearize(depgraph, MaxOptimalLinearizationIters(depgraph.TxCount()) + 1, rng_seed ^ 0x1337, IndexTxOrder{});
+    auto [linearization2, optimal2, cost2] = Linearize(depgraph, MaxOptimalLinearizationCost(depgraph.TxCount()) + 1, rng_seed ^ 0x1337, IndexTxOrder{});
     assert(optimal2);
     assert(linearization2 == linearization);
 }
@@ -1236,7 +1243,7 @@ FUZZ_TARGET(clusterlin_postlinearize_tree)

     // Try to find an even better linearization directly. This must not change the diagram for the
     // same reason.
-    auto [opt_linearization, _optimal, _cost] = Linearize(depgraph_tree, 100000, rng_seed, IndexTxOrder{}, post_linearization);
+    auto [opt_linearization, _optimal, _cost] = Linearize(depgraph_tree, 1000000, rng_seed, IndexTxOrder{}, post_linearization);
     auto opt_chunking = ChunkLinearization(depgraph_tree, opt_linearization);
     auto cmp_opt = CompareChunks(opt_chunking, post_chunking);
     assert(cmp_opt == 0);

src/test/fuzz/txgraph.cpp

Lines changed: 16 additions & 16 deletions
@@ -325,8 +325,8 @@ FUZZ_TARGET(txgraph)
     auto max_cluster_count = provider.ConsumeIntegralInRange<DepGraphIndex>(1, MAX_CLUSTER_COUNT_LIMIT);
     /** The maximum total size of transactions in a (non-oversized) cluster. */
     auto max_cluster_size = provider.ConsumeIntegralInRange<uint64_t>(1, 0x3fffff * MAX_CLUSTER_COUNT_LIMIT);
-    /** The number of iterations to consider a cluster acceptably linearized. */
-    auto acceptable_iters = provider.ConsumeIntegralInRange<uint64_t>(0, 10000);
+    /** The amount of work to consider a cluster acceptably linearized. */
+    auto acceptable_cost = provider.ConsumeIntegralInRange<uint64_t>(0, 10000);

     /** The set of uint64_t "txid"s that have been assigned before. */
     std::set<uint64_t> assigned_txids;
@@ -342,7 +342,7 @@ FUZZ_TARGET(txgraph)
     auto real = MakeTxGraph(
         /*max_cluster_count=*/max_cluster_count,
         /*max_cluster_size=*/max_cluster_size,
-        /*acceptable_iters=*/acceptable_iters,
+        /*acceptable_cost=*/acceptable_cost,
         /*fallback_order=*/fallback_order);

     std::vector<SimTxGraph> sims;
@@ -758,9 +758,9 @@ FUZZ_TARGET(txgraph)
             break;
         } else if (command-- == 0) {
             // DoWork.
-            uint64_t iters = provider.ConsumeIntegralInRange<uint64_t>(0, alt ? 10000 : 255);
-            bool ret = real->DoWork(iters);
-            uint64_t iters_for_optimal{0};
+            uint64_t max_cost = provider.ConsumeIntegralInRange<uint64_t>(0, alt ? 10000 : 255);
+            bool ret = real->DoWork(max_cost);
+            uint64_t cost_for_optimal{0};
             for (unsigned level = 0; level < sims.size(); ++level) {
                 // DoWork() will not optimize oversized levels, or the main level if a builder
                 // is present. Note that this impacts the DoWork() return value, as true means
@@ -773,24 +773,24 @@ FUZZ_TARGET(txgraph)
                 if (ret) {
                     sims[level].real_is_optimal = true;
                 }
-                // Compute how many iterations would be needed to make everything optimal.
+                // Compute how much work would be needed to make everything optimal.
                 for (auto component : sims[level].GetComponents()) {
-                    auto iters_opt_this_cluster = MaxOptimalLinearizationIters(component.Count());
-                    if (iters_opt_this_cluster > acceptable_iters) {
-                        // If the number of iterations required to linearize this cluster
-                        // optimally exceeds acceptable_iters, DoWork() may process it in two
+                    auto cost_opt_this_cluster = MaxOptimalLinearizationCost(component.Count());
+                    if (cost_opt_this_cluster > acceptable_cost) {
+                        // If the amount of work required to linearize this cluster
+                        // optimally exceeds acceptable_cost, DoWork() may process it in two
                         // stages: once to acceptable, and once to optimal.
-                        iters_for_optimal += iters_opt_this_cluster + acceptable_iters;
+                        cost_for_optimal += cost_opt_this_cluster + acceptable_cost;
                     } else {
-                        iters_for_optimal += iters_opt_this_cluster;
+                        cost_for_optimal += cost_opt_this_cluster;
                     }
                 }
             }
             if (!ret) {
-                // DoWork can only have more work left if the requested number of iterations
+                // DoWork can only have more work left if the requested amount of work
                 // was insufficient to linearize everything optimally within the levels it is
                 // allowed to touch.
-                assert(iters <= iters_for_optimal);
+                assert(max_cost <= cost_for_optimal);
             }
             break;
         } else if (sims.size() == 2 && !sims[0].IsOversized() && !sims[1].IsOversized() && command-- == 0) {
@@ -1165,7 +1165,7 @@ FUZZ_TARGET(txgraph)
     auto real_redo = MakeTxGraph(
         /*max_cluster_count=*/max_cluster_count,
         /*max_cluster_size=*/max_cluster_size,
-        /*acceptable_iters=*/acceptable_iters,
+        /*acceptable_cost=*/acceptable_cost,
         /*fallback_order=*/fallback_order);
     /** Vector (indexed by SimTxGraph::Pos) of TxObjects in real_redo). */
     std::vector<std::optional<SimTxObject>> txobjects_redo;
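The two-stage accounting in the DoWork hunk above can be isolated as a small helper. This is a hypothetical free function mirroring the fuzz test's logic, not code from the PR: a cluster whose optimal-linearization cost exceeds `acceptable_cost` may be processed twice (once to acceptable, once to optimal), so its contribution to the bound is `cost_opt + acceptable_cost`; otherwise just `cost_opt`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: upper bound on the work DoWork() may need to make
// every cluster optimal, given each cluster's worst-case optimal cost.
uint64_t CostForOptimal(const std::vector<uint64_t>& cluster_opt_costs, uint64_t acceptable_cost)
{
    uint64_t total{0};
    for (uint64_t cost_opt : cluster_opt_costs) {
        total += cost_opt;
        // Two-stage case: once to acceptable, once more to optimal.
        if (cost_opt > acceptable_cost) total += acceptable_cost;
    }
    return total;
}
```

If `DoWork()` returns false (work left over), the fuzz test asserts the budget it was given did not exceed this bound.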

src/test/txgraph_tests.cpp

Lines changed: 8 additions & 8 deletions
@@ -15,9 +15,9 @@ BOOST_AUTO_TEST_SUITE(txgraph_tests)

 namespace {

-/** The number used as acceptable_iters argument in these tests. High enough that everything
+/** The number used as acceptable_cost argument in these tests. High enough that everything
  * should be optimal, always. */
-constexpr uint64_t NUM_ACCEPTABLE_ITERS = 100'000'000;
+constexpr uint64_t HIGH_ACCEPTABLE_COST = 100'000'000;

 std::strong_ordering PointerComparator(const TxGraph::Ref& a, const TxGraph::Ref& b) noexcept
 {
@@ -48,7 +48,7 @@ BOOST_AUTO_TEST_CASE(txgraph_trim_zigzag)
     static constexpr int32_t MAX_CLUSTER_SIZE = 100'000 * 100;

     // Create a new graph for the test.
-    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, NUM_ACCEPTABLE_ITERS, PointerComparator);
+    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, HIGH_ACCEPTABLE_COST, PointerComparator);

     // Add all transactions and store their Refs.
     std::vector<TxGraph::Ref> refs;
@@ -111,7 +111,7 @@ BOOST_AUTO_TEST_CASE(txgraph_trim_flower)
     /** Set a very large cluster size limit so that only the count limit is triggered. */
     static constexpr int32_t MAX_CLUSTER_SIZE = 100'000 * 100;

-    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, NUM_ACCEPTABLE_ITERS, PointerComparator);
+    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, HIGH_ACCEPTABLE_COST, PointerComparator);

     // Add all transactions and store their Refs.
     std::vector<TxGraph::Ref> refs;
@@ -197,7 +197,7 @@ BOOST_AUTO_TEST_CASE(txgraph_trim_huge)
     std::vector<size_t> top_components;

     FastRandomContext rng;
-    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, NUM_ACCEPTABLE_ITERS, PointerComparator);
+    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, HIGH_ACCEPTABLE_COST, PointerComparator);

     // Construct the top chains.
     for (int chain = 0; chain < NUM_TOP_CHAINS; ++chain) {
@@ -270,7 +270,7 @@ BOOST_AUTO_TEST_CASE(txgraph_trim_big_singletons)
     static constexpr int NUM_TOTAL_TX = 100;

     // Create a new graph for the test.
-    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, NUM_ACCEPTABLE_ITERS, PointerComparator);
+    auto graph = MakeTxGraph(MAX_CLUSTER_COUNT, MAX_CLUSTER_SIZE, HIGH_ACCEPTABLE_COST, PointerComparator);

     // Add all transactions and store their Refs.
     std::vector<TxGraph::Ref> refs;
@@ -304,7 +304,7 @@ BOOST_AUTO_TEST_CASE(txgraph_trim_big_singletons)
 BOOST_AUTO_TEST_CASE(txgraph_chunk_chain)
 {
     // Create a new graph for the test.
-    auto graph = MakeTxGraph(50, 1000, NUM_ACCEPTABLE_ITERS, PointerComparator);
+    auto graph = MakeTxGraph(50, 1000, HIGH_ACCEPTABLE_COST, PointerComparator);

     auto block_builder_checker = [&graph](std::vector<std::vector<TxGraph::Ref*>> expected_chunks) {
         std::vector<std::vector<TxGraph::Ref*>> chunks;
@@ -383,7 +383,7 @@ BOOST_AUTO_TEST_CASE(txgraph_staging)
     /* Create a new graph for the test.
      * The parameters are max_cluster_count, max_cluster_size, acceptable_iters
      */
-    auto graph = MakeTxGraph(10, 1000, NUM_ACCEPTABLE_ITERS, PointerComparator);
+    auto graph = MakeTxGraph(10, 1000, HIGH_ACCEPTABLE_COST, PointerComparator);

     std::vector<TxGraph::Ref> refs;
     refs.reserve(2);

src/test/util/cluster_linearize.h

Lines changed: 12 additions & 12 deletions
@@ -394,26 +394,26 @@ void SanityCheck(const DepGraph<SetType>& depgraph, std::span<const DepGraphInde
     }
 }

-inline uint64_t MaxOptimalLinearizationIters(DepGraphIndex cluster_count)
+inline uint64_t MaxOptimalLinearizationCost(DepGraphIndex cluster_count)
 {
     // These are the largest numbers seen returned as cost by Linearize(), in a large randomized
     // trial. There exist almost certainly far worse cases, but they are unlikely to be
     // encountered in randomized tests. The purpose of these numbers is guaranteeing that for
     // *some* reasonable cost bound, optimal linearizations are always found.
-    static constexpr uint64_t ITERS[65] = {
+    static constexpr uint64_t COSTS[65] = {
         0,
-        0, 4, 10, 34, 76, 156, 229, 380,
-        441, 517, 678, 933, 1037, 1366, 1464, 1711,
-        2111, 2542, 3068, 3116, 4029, 3467, 5324, 5402,
-        6481, 7161, 7441, 8183, 8843, 9353, 11104, 11455,
-        11791, 12570, 13480, 14259, 14525, 12426, 14477, 20201,
-        18737, 16581, 23622, 28486, 30652, 33021, 32942, 32745,
-        34046, 26227, 34662, 38019, 40814, 31113, 41448, 33968,
-        35024, 59207, 42872, 41277, 42365, 51833, 63410, 67035
+        0, 545, 928, 1633, 2647, 4065, 5598, 8258,
+        9505, 11471, 14137, 19553, 20460, 26191, 28397, 32599,
+        41631, 47419, 56329, 57767, 72196, 63652, 95366, 96537,
+        115653, 125407, 131734, 145090, 156349, 164665, 194224, 203953,
+        207710, 225878, 239971, 252284, 256534, 222142, 251332, 357098,
+        325788, 295867, 410053, 497483, 533892, 576572, 577845, 572400,
+        592536, 455082, 609249, 659130, 714091, 544507, 718788, 562378,
+        601926, 1025081, 732725, 708896, 738224, 900445, 1092519, 1139946
     };
-    assert(cluster_count < std::size(ITERS));
+    assert(cluster_count < std::size(COSTS));
     // Multiply the table number by two, to account for the fact that they are not absolutes.
-    return ITERS[cluster_count] * 2;
+    return COSTS[cluster_count] * 2;
 }

 } // namespace
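The pattern used by `MaxOptimalLinearizationCost` (an empirical per-size table of worst observed costs, doubled as a safety margin because the entries are not absolute bounds) can be sketched as follows. The table here is truncated to the first six entries of the real 65-entry table, and `MaxCostSketch` is a hypothetical name:

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>

// Sketch of the lookup-with-margin pattern: empirical worst-case costs per
// cluster size (truncated from the real table), doubled as a safety margin.
inline uint64_t MaxCostSketch(unsigned cluster_count)
{
    static constexpr uint64_t COSTS[] = {0, 0, 545, 928, 1633, 2647};
    assert(cluster_count < std::size(COSTS));
    return COSTS[cluster_count] * 2;
}
```

The doubling keeps the guarantee cheap to state: for *some* reasonable cost bound, optimal linearizations are always found in the randomized tests.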
