Skip to content

FIX: orphaned commits + TEST: GC stateful #1940

Open
ianhi wants to merge 5 commits intomainfrom
ian/gc-subset-test
Open

FIX: orphaned commits + TEST: GC stateful #1940
ianhi wants to merge 5 commits intomainfrom
ian/gc-subset-test

Conversation

@ianhi
Copy link
Copy Markdown
Collaborator

@ianhi ianhi commented Mar 31, 2026

I added two versions of stateful tests that catch #1931

  1. A subset of our existing test
    pro: no new tests
    con: had to add a new commit method and bunch of complicated hypothesis subset machinery

  2. a. smaller GC focused statemachine

pro: catches bug very fast
con: adds another stateful test

we should chose which one we want and then I'll remove the other.

Steps:
  Initial -> A -> B          create two commits
  reset main -> Initial      orphan A and B
  expire (A old, B recent)   A released; B retained with parent_id=None
  reset main -> B            point branch at the corrupted snapshot
  amend                      panic

ianhi and others added 3 commits March 31, 2026 16:13
Add icechunk.testing.stateful_subsets with a test_subsets() function
that runs a Hypothesis state machine with only a subset of rules
enabled. Uses __wrapped__ to strip @rule markers while keeping the
method implementation intact.

Use it to add test_gc_with_rules, which runs VersionControlStateMachine
with only the 6 GC-relevant rules, reliably catching the expire_v2
orphan parent bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When expire_v2 removed an intermediate snapshot whose child was
unreachable from any ref (e.g. after reset_branch), the child's
parent_id was set to None instead of INITIAL_SNAPSHOT_ID. Fix
new_parent to fall back to INITIAL_SNAPSHOT_ID for unreachable
snapshots, matching the pattern from PR #1708 for garbage_collect.

Add two focused expire tests that reliably catch this bug:
- test_expire_subset: uses test_subsets with empty_commit rule
- ExpireOrphanTest: standalone minimal state machine

Also fix .add(generator) bug in expire_snapshots rule (should be
.update), and add __wrapped__ guard in strip_rule.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ianhi ianhi marked this pull request as ready for review March 31, 2026 20:34
@dcherian
Copy link
Copy Markdown
Collaborator

Lets do (2)

Comment thread icechunk/src/ops/gc.rs
None
// Snapshot not reachable from any ref (e.g. branch was reset away).
// Re-parent to the initial snapshot rather than orphaning with None.
Some(Snapshot::INITIAL_SNAPSHOT_ID)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #1932 i change the expectation so we assert that ancestry ends at a node with parent_id == None.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants