You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Avoid tempKv-driven scan iteration in Garnet callers (#1797)
* Avoid tempKv-driven scan iteration in Garnet callers
Replace the four Garnet/Cluster callsites of `Session.Iterate(...)` (which
allocates a parallel `tempKv` proportional to the keyspace and copies every
in-range record into it) with the lookup-based `Session.IterateLookupSnapshot`,
a new Tsavorite ClientSession method that captures `Log.TailAddress` once
and pins both `untilAddress` and `maxAddress` to it. This preserves the prior
snapshot semantics (each unique live key emitted exactly once based on its
latest in-range version, with the in-flight RCU race window correctly handled)
without the O(N) `tempKv` memory cost.
Converted callsites:
- `StoreWrapper.HasKeysInSlots`: collapses two pull-iterations (string +
object) into a single push scan over the unified context with a cached
early-exit callback (`hasKeysInSlotsFuncs` field on StoreWrapper).
- `StorageSession.DBKeys` (KEYS command): switched the existing cached
`unifiedStoreDbKeysFuncs` callback from `Session.Iterate(...)` to
`IterateLookupSnapshot`.
- `ClusterManagerSlotState.DeleteKeysInSlots`: refactored into a new
`StorageSession.DeleteSlotKeys(slots)` method on the unified context
with a cached `deleteSlotKeysFuncs` callback. Cluster static helper now
one-line delegate. Preserves prior behavior — deletes every matched live
key including expired-but-not-yet-tombstoned ones (no expiry filter).
- `VectorManager.Cleanup`: switched from push `Session.Iterate(ref callbacks)`
to `IterateLookupSnapshot`.
API changes:
- New `ClientSession.IterateLookupSnapshot<TScanFunctions>(ref scanFns,
bool includeTombstones = false)` on Tsavorite. Wraps `ScanCursor` with
the snapshot pattern `untilAddress = maxAddress = capturedTail`.
- Removed parameterless pull `IGarnetApi.IterateStore()` overload (the
only Garnet API method that exposed the tempKv pull iterator). The push
overload `IterateStore<TScanFunctions>(...)` is unchanged. **Breaking
change** for out-of-tree extensions that bind to `IGarnetApi`.
- Removed deprecated `LogCompactionType.ShiftForced` enum value plus its
back-compat shim. `ShiftForced` was deprecated in PR #482 (June 2024,
almost two years ago); long-deprecated values are removed in v2.
Migration: `--compaction-type Shift --compaction-force-delete true`.
**Breaking change** in config surface — `CommandLineParser` produces a
clear startup error listing the now-valid values `None | Shift | Lookup | Scan`.
- Reordered `LogCompactionType` enum members + help-text to
`None | Shift | Lookup | Scan` so users see the recommended `Lookup`
before the not-recommended `Scan`. Added a startup `LogWarning` when
`CompactionType = Scan` is selected to flag the temporary-memory-spike
cost. `Scan` itself still works as before — no behavior change.
All callbacks follow the existing cached-field pattern
(`xxxFuncs ??= new(); xxxFuncs.Initialize(...)`) used by
`unifiedStoreDbKeysFuncs`, `unifiedStoreDbScanFuncs`,
`unifiedStoreDbSizeFuncs`, `expiredKeyDeletionScanFuncs` — steady-state
per-call callback allocations are zero. Net per-call cost on the four
hot paths drops from `{1-2 sessions + 1-2 TsavoriteKV instances + 2
iterators + O(N) record copies}` to `{1 session, 0 callback alloc}`.
Tests added:
- `SpanByteIterateLookupSnapshotBasicCorrectness`
(libs/storage/Tsavorite/cs/test/SpanByteIterationTests.cs):
inserts 200 keys, RCUs all of them, asserts `IterateLookupSnapshot`
emits each unique live key exactly once with the post-RCU value;
asserts the snapshot is stable across calls.
- `ClusterDelKeysInSlotRemovesStringAndObjectKeys`
(test/Garnet.test.cluster/ClusterMigrateTests.cs): exercises
`CLUSTER DELKEYSINSLOT` over both raw-string and collection-object
keys via the unified scan; asserts both are deleted.
- `SeKeysNoDuplicatesUnderRcu`
(test/Garnet.test/RespScanCommandsTests.cs): runs concurrent SET
RCUs while issuing KEYS in a loop, asserts each key is returned
exactly once across multiple rounds.
Documentation:
- `libs/host/defaults.conf` and `website/docs/getting-started/configuration.md`
updated to use the new compaction-type ordering and call out
`Scan` as NOT RECOMMENDED.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address PR review feedback
- ClusterDelKeysInSlotRemovesStringAndObjectKeys: change [Order(22)] to
[Order(28)] to avoid collision with the existing #if DEBUG-gated Order(22)
test in the same fixture (ordered, non-parallelizable execution would
otherwise be ambiguous).
- SeKeysNoDuplicatesUnderRcu: assert that the background writer actually
stops within the wait window (raised from 5s to 30s) and surface any
exception the writer threw, so a hung/faulted writer can't leak a
connection or silently mask test failures.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Strengthen ClusterDelKeysInSlotRemovesStringAndObjectKeys: add control slot
The previous version of the test only verified that DELKEYSINSLOT removed
the targeted keys; it did not verify that keys in OTHER slots are preserved.
A bug in the slot-set filter inside DeleteSlotKeysScan.Reader (e.g., a wrong
`slots.Contains(...)` check or a hash-slot computation regression) would
have collateral-deleted unrelated keys without this test catching it.
Now the test:
- Picks two distinct slots owned by node 0 (delSlot, keepSlot).
- Inserts one string key + one object key in EACH slot (4 keys total).
- Runs CLUSTER DELKEYSINSLOT on delSlot only.
- Asserts delSlot is empty and both delSlot keys are gone via GET/EXISTS.
- Asserts keepSlot still has both keys with original values (string GET
returns 'raw-keep'; SCARD on the object key returns 3).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy file name to clipboardExpand all lines: libs/host/Configuration/Options.cs
+5-6Lines changed: 5 additions & 6 deletions
Original file line number
Diff line number
Diff line change
@@ -236,7 +236,7 @@ internal sealed class Options : ICloneable
236
236
[Option("expired-object-collection-freq",Required=false,HelpText="Frequency in seconds for the background task to perform object collection which removes expired members within object from memory. 0 = disabled. Use the HCOLLECT and ZCOLLECT API to collect on-demand.")]
[Option("compaction-type",Required=false,HelpText="Hybrid log compaction type. Value options: None - no compaction, Shift - shift begin address without compaction (data loss), Scan - scan old pages and move live records to tail (no data loss), Lookup - lookup each record in compaction range, for record liveness checking using hash chain (no data loss)")]
239
+
[Option("compaction-type",Required=false,HelpText="Hybrid log compaction type. Value options: None - no compaction, Shift - shift begin address without compaction (data loss), Lookup - lookup each record in compaction range, for record liveness checking using hash chain (no data loss; recommended for production use), Scan - scan old pages and move live records to tail (no data loss; NOT RECOMMENDED - builds a temporary parallel KV index proportional to the keyspace, causing significant transient memory use; prefer Lookup)")]
240
240
publicLogCompactionTypeCompactionType{get;set;}
241
241
242
242
[OptionValidation]
@@ -773,12 +773,11 @@ endpoint is IPEndPoint listenEp && clusterAnnounceEndpoint[0] is IPEndPoint anno
773
773
thrownewException("Revivification cannot specify RevivifiableFraction without specifying bins.");
774
774
}
775
775
776
-
// For backwards compatibility
777
-
if(CompactionType==LogCompactionType.ShiftForced)
776
+
// Warn users who explicitly opt into Scan compaction about the memory-spike cost.
777
+
// Scan builds a temporary parallel KV index proportional to the keyspace; Lookup is the recommended alternative.
778
+
if(CompactionType==LogCompactionType.Scan)
778
779
{
779
-
logger?.LogWarning("Compaction type ShiftForced is deprecated. Use Shift instead along with CompactionForceDelete.");
780
-
CompactionType=LogCompactionType.Shift;
781
-
CompactionForceDelete=true;
780
+
logger?.LogWarning("Compaction type Scan builds a temporary parallel KV index proportional to the keyspace, causing significant transient memory use. Use Lookup instead unless you have a specific reason for Scan.");
Copy file name to clipboardExpand all lines: libs/host/defaults.conf
+2-2Lines changed: 2 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -162,8 +162,8 @@
162
162
/* Hybrid log compaction type. Value options: */
163
163
/* None - no compaction */
164
164
/* Shift - shift begin address without compaction (data loss) */
165
-
/* Scan - scan old pages and move live records to tail (no data loss) */
166
-
/* Lookup - lookup each record in compaction range, for record liveness checking using hash chain (no data loss) */
165
+
/* Lookup - lookup each record in compaction range, for record liveness checking using hash chain (no data loss). Recommended for production use. */
166
+
/* Scan - scan old pages and move live records to tail (no data loss). NOT RECOMMENDED: this strategy builds a temporary parallel KV index proportional to the keyspace, causing significant transient memory use. Prefer Lookup. */
167
167
"CompactionType" : "None",
168
168
169
169
/* Forcefully delete the inactive segments immediately after the compaction strategy (type) is applied. */
usingvariter=store.Iterate<IGarnetObject,IGarnetObject,Empty,SimpleGarnetObjectSessionFunctions>(newSimpleGarnetObjectSessionFunctions());// TODO replace with Push iterator
0 commit comments