Conversation

gengdy1545 commented Oct 29, 2025

Load: 16 threads performing load + update operations; each thread loads 10 million entries and then updates 1 million of them.

Benchmark Test

Using String (thread ID + num) as key, num as value. The size in parentheses in the tables below is the configured maximum cache size; a sketch of the per-thread load/update loop follows the table.

| Implementation | Load Duration (s) | Update Throughput (ops/s) |
| --- | --- | --- |
| ConcurrentHashMap | 9.29 | 20592020.59 |
| Guava (10M) | 189.56 | 775682.36 |
| Guava (200M) | 167.30 | 740260.94 |
| Caffeine (10M) | 145.77 | 1353523.39 |
| Caffeine (200M) | 48.84 | 12130401.82 |
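
A minimal sketch of the per-thread benchmark loop described above, assuming keys are built as threadId + "-" + num and the update phase is a get followed by a put on already-loaded keys (see the test details below). Class and variable names are illustrative, not the actual test code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheBenchmarkSketch {
    static final int THREADS = 16;
    static final long LOAD_PER_THREAD = 10_000_000L;   // 10 million entries loaded per thread
    static final long UPDATES_PER_THREAD = 1_000_000L; // 1 million updates per thread

    public static void main(String[] args) throws InterruptedException {
        Map<String, Long> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            final int threadId = t;
            workers[t] = new Thread(() -> {
                // Load phase: key = thread ID + num (as a String), value = num.
                for (long num = 0; num < LOAD_PER_THREAD; num++) {
                    map.put(threadId + "-" + num, num);
                }
                // Update phase: a get followed by a put on keys from the load phase.
                for (long num = 0; num < UPDATES_PER_THREAD; num++) {
                    String key = threadId + "-" + num;
                    Long old = map.get(key);
                    map.put(key, old == null ? num : old + 1);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
    }
}
```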

Cache Pre-experiment

Using IndexProto.IndexKey as key, num as value

| Implementation | Load Duration (s) | Update Throughput (ops/s) |
| --- | --- | --- |
| ConcurrentHashMap | 467.14 | 291630.21 |
| Caffeine (10M) | | |
| Caffeine (200M) | timeout | |

Cache Optimization Approach

The key type significantly impacts performance, primarily because of a suboptimal custom hash implementation.
We therefore attempted to decompose IndexProto.IndexKey: tableId, indexId, and the key are concatenated into a String used as the cache key, and timestamp and rowId are concatenated into a String used as the cache value (see the sketch after the table below).

| Implementation | Load Duration (s) | Update Throughput (ops/s) |
| --- | --- | --- |
| ConcurrentHashMap | 25.95 | 108769544.45 |
| Caffeine (10M) | 183.53 | 960730.15 |
| Caffeine (200M) | 85.44 | 9900990.10 |
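
A hedged sketch of this decomposition. IndexProto.IndexKey is the actual proto type in Pixels, but the method names, the String form of the key, and the ':' separator below are illustrative assumptions, not the code in this PR.

```java
public final class IndexKeyCodec {
    private IndexKeyCodec() { }

    // Composite cache key: tableId + indexId + key joined into one String,
    // so the cache relies on String.hashCode instead of a custom hash.
    public static String toCacheKey(long tableId, long indexId, String key) {
        return tableId + ":" + indexId + ":" + key;
    }

    // Composite cache value: timestamp + rowId joined into one String.
    public static String toCacheValue(long timestamp, long rowId) {
        return timestamp + ":" + rowId;
    }

    // Split the value back into (timestamp, rowId) when reading from the cache.
    public static long[] parseCacheValue(String value) {
        int sep = value.indexOf(':');
        return new long[] {
                Long.parseLong(value.substring(0, sep)),
                Long.parseLong(value.substring(sep + 1))
        };
    }
}
```

The speedup presumably comes from String.hashCode spreading the composite keys more evenly across hash bins than the previous custom hash did.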

Cache Optimization

Actual load experiments were conducted using the optimization approach described above.

| Implementation | Load Duration (s) | Update Throughput (ops/s) |
| --- | --- | --- |
| Cache disabled | 337.60 | 376151.97 |
| ConcurrentHashMap | 330.87 | 467112.37 |
| Caffeine (10M) | 337.94 | 455308.61 |
| Caffeine (200M) | 343.50 | 459334.54 |

Test Details

  • The test program's JVM was configured with -Xmx204800m, i.e., a 200 GB heap.
  • num represents the current iteration value; e.g., during load, num ranges from 0 to 9,999,999 (10 million values per thread).
  • Update operations during load were replaced with a get followed by a put, using the keys corresponding to the current iteration value.
  • Both Guava and Caffeine were configured with a 300-second expiration time, so no entries expired during testing (see the configuration sketch after this list).
  • Total inserts: 160 million (load) + 16 million (update) = 176 million. A cache size of 200 million is sufficient for optimal performance.
  • Guava was excluded immediately after the benchmark and does not appear in subsequent experiments.
  • The timestamp in the keys for the cache pre-experiment was set to 0 to ensure that get operations behave normally.
  • The Caffeine pre-experiment took over 27 minutes to load and was therefore discontinued.
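
As referenced in the expiration bullet above, here is a minimal sketch of how the Guava and Caffeine caches in these tests are presumably configured. The 300-second expiry and the 10M / 200M maximum sizes mirror the notes above; the builder code itself is an assumption rather than the PR's actual configuration.

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class CacheConfigSketch {
    // Caffeine cache with a 10M or 200M maximum size and a 300 s write expiry.
    static com.github.benmanes.caffeine.cache.Cache<String, String> caffeine(long maxSize) {
        return Caffeine.newBuilder()
                .maximumSize(maxSize)                      // 10_000_000L or 200_000_000L
                .expireAfterWrite(300, TimeUnit.SECONDS)
                .build();
    }

    // Equivalent Guava configuration, used only in the initial benchmark.
    static com.google.common.cache.Cache<String, String> guava(long maxSize) {
        return CacheBuilder.newBuilder()
                .maximumSize(maxSize)
                .expireAfterWrite(300, TimeUnit.SECONDS)
                .build();
    }

    public static void main(String[] args) {
        com.github.benmanes.caffeine.cache.Cache<String, String> cache = caffeine(10_000_000L);
        cache.put("1:1:someKey", "0:42");                  // composite key/value from the sketch above
        System.out.println(cache.getIfPresent("1:1:someKey"));
    }
}
```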

gengdy1545 linked an issue ([pixels-common] optimize index with cache) on Oct 29, 2025 that may be closed by this pull request
gengdy1545 self-assigned this on Oct 29, 2025

ben-manes commented Oct 29, 2025

You might consider AsyncCache so that the computation is decoupled from the hash table's bin locking or set a high initialCapacity to reduce the hashbin collisions.

https://github.com/ben-manes/caffeine/wiki/Faq#write-contention

I definitely agree that CHM's spreader is not very robust, since it relies on red-black trees to avoid collision attacks, but that leads to higher hashbin contention. Your change to a more robust hashCode is excellent.
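
A hedged sketch of the two suggestions above, using Caffeine's public API: buildAsync() yields an AsyncCache whose values are CompletableFutures, so the value computation does not run while a hash-table bin is locked, and initialCapacity pre-sizes the table to reduce collisions and resizing. The sizes and the key/value format are illustrative only.

```java
import com.github.benmanes.caffeine.cache.AsyncCache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class AsyncCacheSketch {
    public static void main(String[] args) {
        AsyncCache<String, String> cache = Caffeine.newBuilder()
                .initialCapacity(10_000_000)               // illustrative; pick near the expected entry count
                .maximumSize(200_000_000L)
                .expireAfterWrite(300, TimeUnit.SECONDS)
                .buildAsync();

        // Only the future is installed under the bin lock; the value itself is
        // computed asynchronously on the cache's executor.
        CompletableFuture<String> value = cache.get("1:1:someKey", k -> "0:42");
        System.out.println(value.join());
    }
}
```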

bianhq added the enhancement label Nov 3, 2025
bianhq added this to the Real-time CRUD milestone Nov 3, 2025
gengdy1545 force-pushed the feature/optCachingIndex branch from 1424ccd to 9612900 on November 4, 2025 09:34