[BFTree] Add RangeIndex cluster migration #1731

Draft
tiagonapoli wants to merge 1 commit into dev from tiagonapoli/bftree-migration

@tiagonapoli (Collaborator)

RangeIndex keys are backed by a live BfTree instance whose on-disk state lives in a separate file (data.bftree) and whose main-store record is a 35-byte stub carrying a process-local TreeHandle. Shipping just the stub bytes is not enough — the destination needs both the tree's on-disk snapshot and a freshly-recovered local TreeHandle.

This change adds end-to-end migration for MIGRATE SLOTS, mirroring the Vector Set migration pattern but transferring a GZip-compressed native snapshot of the tree in chunks over the existing migration transport.

Wire format (piggybacks on TryWriteRecordSpan):

  • New MigrationRecordSpanType.RangeIndexSnapshotChunk (4): [int keyLen][key][int chunkIndex][int totalChunks][long uncompressedLen][int chunkLen][bytes]
  • New MigrationRecordSpanType.RangeIndexStub (5) — finalizer: [int keyLen][key][35-byte stub]
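
To make the two payload layouts concrete, here is a minimal Python sketch of an encoder and decoder for them. It is illustrative only and is not the PR's C# code: it assumes `int` is a 4-byte and `long` an 8-byte little-endian integer, and it does not model how `TryWriteRecordSpan` carries the span type tag. The function names are hypothetical.

```python
import struct

STUB_LEN = 35  # stub size stated in the PR description

def encode_chunk(key, chunk_index, total_chunks, uncompressed_len, chunk):
    # [int keyLen][key][int chunkIndex][int totalChunks][long uncompressedLen][int chunkLen][bytes]
    # Assumption: little-endian, 4-byte int, 8-byte long.
    header = struct.pack("<iiqi", chunk_index, total_chunks, uncompressed_len, len(chunk))
    return struct.pack("<i", len(key)) + key + header + chunk

def decode_chunk(payload):
    (key_len,) = struct.unpack_from("<i", payload, 0)
    key = payload[4:4 + key_len]
    off = 4 + key_len
    chunk_index, total_chunks, uncompressed_len, chunk_len = struct.unpack_from("<iiqi", payload, off)
    off += struct.calcsize("<iiqi")
    data = payload[off:off + chunk_len]
    if len(key) != key_len or len(data) != chunk_len:
        raise ValueError("truncated chunk payload")
    return key, chunk_index, total_chunks, uncompressed_len, data

def encode_stub(key, stub):
    # [int keyLen][key][35-byte stub]
    if len(stub) != STUB_LEN:
        raise ValueError("stub must be exactly 35 bytes")
    return struct.pack("<i", len(key)) + key + stub

def decode_stub(payload):
    (key_len,) = struct.unpack_from("<i", payload, 0)
    key = payload[4:4 + key_len]
    stub = payload[4 + key_len:4 + key_len + STUB_LEN]
    if len(key) != key_len or len(stub) != STUB_LEN:
        raise ValueError("truncated stub payload")
    return key, stub
```

Under these assumptions a chunk payload carries 24 bytes of fixed fields in addition to the key and the chunk bytes.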

Source side:

  • MigrateScanFunctions.Reader captures RangeIndex records (RecordType == 2) onto MigrateOperation.RangeIndexes instead of the regular DiskLogRecord path.
  • RangeIndexManager.SnapshotForMigration acquires the per-tree exclusive lock, native-snapshots the tree to migrate.bftree, releases the lock, then GZip-compresses to migrate.bftree.gz.
  • MigrateSessionRangeIndex.TransmitRangeIndex streams 256 KB chunks followed by the stub finalizer, deletes the source main-store key, and cleans up temporary files.
  • MigrateSessionSlots loops over migrateOperation.SelectMany(mo => mo.RangeIndexes) after the Vector Set block.
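
The compress-then-chunk step on the source can be sketched as follows. This is a language-neutral illustration in Python, not the PR's implementation: it works on in-memory bytes rather than the `migrate.bftree` / `migrate.bftree.gz` files, assumes 256 KB means 262,144 bytes, and assumes chunk indexes are zero-based. The helper names are hypothetical.

```python
import gzip

CHUNK_SIZE = 256 * 1024  # assumption: 256 KB == 262,144 bytes

def compress_snapshot(snapshot):
    # Stand-in for GZip-compressing the native snapshot file.
    return gzip.compress(snapshot)

def iter_chunks(compressed, chunk_size=CHUNK_SIZE):
    # Yield (chunk_index, total_chunks, chunk_bytes) in transmit order.
    # Assumption: indexes start at 0 and every chunk but the last is full-size.
    total = max(1, -(-len(compressed) // chunk_size))  # ceiling division
    for index in range(total):
        start = index * chunk_size
        yield index, total, compressed[start:start + chunk_size]
```

A sender would emit one snapshot-chunk span per yielded tuple and then the stub finalizer, so the destination never sees the stub before the last chunk.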

Destination side:

  • RangeIndexManager.HandleMigratedRangeIndexChunk appends chunks to restore.bftree.gz (validates monotonic ordering per key).
  • RangeIndexManager.HandleMigratedRangeIndexStub decompresses to data.bftree, recovers a fresh BfTreeService via BfTreeService.RecoverFromSnapshot, rewrites the stub's TreeHandle to the new local pointer, issues an RICREATE RMW inline (using StorageSession.CompletePendingForSession), and registers the tree on success.
  • RespClusterMigrateCommands.NetworkClusterMigrate dispatches both new span types; the stub branch also validates the key's hash slot is importing.
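
The chunk-append and ordering check on the destination can be sketched like this. It is a hedged Python illustration rather than the PR's code: it buffers in memory instead of appending to `restore.bftree.gz`, interprets "monotonic ordering" as strictly consecutive zero-based indexes, and leaves out tree recovery and the RICREATE RMW. `ChunkAssembler` is a hypothetical name.

```python
import gzip

class ChunkAssembler:
    """Collects per-key snapshot chunks and decompresses them on finish."""

    def __init__(self):
        self._pending = {}  # key -> in-progress transfer state

    def add_chunk(self, key, chunk_index, total_chunks, uncompressed_len, data):
        state = self._pending.get(key)
        if state is None:
            state = {"next": 0, "total": total_chunks,
                     "ulen": uncompressed_len, "buf": bytearray()}
        # Assumption: chunks must arrive as 0, 1, 2, ... with stable metadata.
        ok = (chunk_index == state["next"]
              and chunk_index < state["total"]
              and total_chunks == state["total"]
              and uncompressed_len == state["ulen"])
        if not ok:
            self._pending.pop(key, None)  # drop the partial transfer
            raise ValueError("out-of-order or inconsistent chunk")
        state["buf"] += data
        state["next"] += 1
        self._pending[key] = state

    def finish(self, key):
        # Called when the stub finalizer for `key` arrives.
        state = self._pending.pop(key, None)
        if state is None or state["next"] != state["total"]:
            raise ValueError("stub arrived before all chunks")
        snapshot = gzip.decompress(bytes(state["buf"]))
        if len(snapshot) != state["ulen"]:
            raise ValueError("uncompressed length mismatch")
        return snapshot
```

Checking the decompressed size against `uncompressedLen` gives the receiver a cheap integrity check before it attempts to recover a tree from the bytes.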

Not included in this PR (tracked separately):

  • MIGRATE KEYS path: the explicit per-key command needs a pre-probe to detect RangeIndex keys before TransmitKeys. It will be added once the SLOTS path is validated end-to-end.
  • AOF replication of migrated trees to destination replicas.
  • Cluster migration integration tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>