all-backend ledger: milestone-1 regen prune (torch 1684→ 1485, jax 335→ 292) #969

Merged
jobovy merged 1 commit into feat/backends from ledger/milestone-1
Jun 17, 2026

@jobovy jobovy commented Jun 16, 2026

Owner

What

Milestone-1 reconsideration of the xfail-ledger (tests/backend_xfail.txt) — the policy's "major milestone where the ledger has permanently moved significantly", not a per-#892-integration regen. Source: regen run 27621760108 on regen/milestone-1-ledger, built from the full 13 jax + 13 torch fragments.

Burndown

| backend | xfail | slow_skip | total ledger |
| --- | --- | --- | --- |
| jax | 296 → 253 | 39 (held) | 335 → 292 (−43) |
| torch | 1683 → 1484 | 1 (held) | 1684 → 1485 (−199) |

backend_slow_skip.txt is byte-identical (deferred-slow tests are skipped in both modes, never regenerated — to be re-examined for vectorized entries at milestones).
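
As a quick consistency check on the burndown numbers, the total ledger is the xfail count plus the held slow_skip count. A minimal sketch, with the counts copied from the table and the dictionary layout purely illustrative (it is not the repository's data format):

```python
# Counts copied from the burndown table; the dict layout is illustrative only.
burndown = {
    "jax": {"xfail": (296, 253), "slow_skip": 39},
    "torch": {"xfail": (1683, 1484), "slow_skip": 1},
}


def totals(row):
    """Return (total_before, total_after, delta) for one backend row."""
    before, after = row["xfail"]
    held = row["slow_skip"]  # slow_skip is held, so it adds equally to both sides
    return before + held, after + held, after - before


for backend, row in burndown.items():
    print(backend, totals(row))
```

Because slow_skip is held, the change in the total equals the change in xfail alone.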

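The permanent move that justifies the prune

As stated in the commit message: #960's central coordinate coercion plus the landed P2.x potential and Pspecial migrations.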

One truncated shard (carry-forward)

The jax test_SpiralArmsPotential / test_potential / test_scf / test_MultipoleExpansionPotential / test_snapshotpotential shard hit the 75-min session-timeout (ran 259/410), so the 3 prior ledger entries among the 151 that didn't run are carried forward. torch ran all 410 of that shard, so no torch carry-forward. Every other shard ran to completion and is applied in full.
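
The carry-forward rule can be sketched as set arithmetic: a prior entry whose test never ran (here, one of the 410 − 259 = 151 tests cut off by the timeout) keeps its old verdict, while every test that did run is ledgered only if it failed. This is a hedged illustration, not the repository's regen script; the function name and the toy test ids are hypothetical.

```python
def rebuild_ledger(prior, ran, failed):
    """Rebuild an xfail ledger from one regen run.

    prior:  set of test ids in the previous ledger
    ran:    set of test ids that actually executed in the regen run
    failed: set of test ids that failed in the regen run (a subset of ran)

    Returns (new_ledger, carried_forward).
    """
    carried = prior - ran  # never re-run, so the old entry is kept as-is
    return failed | carried, carried


# Toy example with hypothetical test ids.
prior = {"test_a", "test_b", "test_c"}
ran = {"test_a", "test_d"}   # test_b and test_c were cut off by a timeout
failed = {"test_d"}          # test_a now passes, test_d is a new failure

ledger, carried = rebuild_ledger(prior, ran, failed)
print(sorted(ledger))   # test_a is pruned; test_b, test_c carried; test_d added
print(sorted(carried))
```

Without the `prior - ran` term, a truncated shard would silently prune entries for tests that were never re-checked.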

Validation

CI on this PR re-runs the full all-backend matrix in normal mode (applies this ledger); any dropped test that regresses is caught here. (An earlier revision of this PR built the ledger from only the first page of the run's 53 artifacts — an unpaginated listing — which understated it; rebuilt from the complete fragment set.)
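
The first-page-only pitfall mentioned above can be illustrated with a generic pagination loop. This is a sketch under assumptions: `fetch_page` is a hypothetical callable standing in for whatever artifact-listing call the regen tooling uses, and the page size of 30 is an assumed value chosen for illustration.

```python
def list_all(fetch_page, per_page=30):
    """Collect every item from a paginated listing.

    fetch_page(page, per_page) is a hypothetical callable returning
    (items_on_that_page, total_count); pages are 1-indexed.
    """
    items, page = [], 1
    while True:
        batch, total = fetch_page(page, per_page)
        items.extend(batch)
        # Stop once everything is collected, or if a page comes back empty.
        if not batch or len(items) >= total:
            return items
        page += 1


# Stand-in for a real API: 53 fake artifact names served in pages.
ARTIFACTS = [f"fragment-{i}" for i in range(53)]


def fake_fetch(page, per_page):
    start = (page - 1) * per_page
    return ARTIFACTS[start:start + per_page], len(ARTIFACTS)


first_page_only, _ = fake_fetch(1, 30)
everything = list_all(fake_fetch)
print(len(first_page_only), len(everything))
```

Comparing the number of collected items against the reported total is a cheap guard against building a ledger from an incomplete fragment set.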

🤖 Generated with Claude Code

codecov Bot commented Jun 16, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.93%. Comparing base (5fa34c4) to head (afa662e).
⚠️ Report is 1 commit behind head on feat/backends.

Additional details and impacted files
@@              Coverage Diff               @@
##           feat/backends     #969   +/-   ##
==============================================
  Coverage          99.93%   99.93%           
==============================================
  Files                247      247           
  Lines              39375    39375           
  Branches             839      841    +2     
==============================================
  Hits               39350    39350           
  Misses                25       25           


@jobovy jobovy force-pushed the ledger/milestone-1 branch from c262e99 to 0f681cc June 16, 2026 20:07
…5->292)

Milestone-1 ledger reconsideration (the policy's "major milestone where the
ledger has permanently moved", not a per-integration regen). Source: regen run
27621760108 on regen/milestone-1-ledger (regen mode = no xfail applied, so every
ledgered test runs for real; nothing is seed-skipped).

The permanent move: #960's central coordinate coercion plus the landed P2.x
potential and Pspecial migrations. Net (xfail): torch 1683->1484, jax 296->253;
slow_skip held byte-identical (torch 1, jax 39) -> total torch 1684->1485,
jax 335->292.

Built from the full set of 13 jax + 13 torch regen fragments. One shard truncated
at the 75-min session-timeout: jax test_SpiralArmsPotential / test_potential /
test_scf / test_MultipoleExpansionPotential / test_snapshotpotential ran 259/410,
so the 3 prior ledger entries among the 151 that did not run are carried forward
(torch ran all 410 of that shard, so no torch carry-forward). Every other shard
ran to completion and is applied in full.

(History: an earlier revision of this commit built the ledger from only the first
page of the run's 53 artifacts -- an unpaginated artifact listing -- which silently
dropped ~11 shards' fragments and understated the ledger; rebuilt here from the
complete fragment set.)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@jobovy jobovy force-pushed the ledger/milestone-1 branch from 0f681cc to afa662e June 16, 2026 20:22
github-actions Bot commented Jun 16, 2026

Contributor

All-backend test status (jax / torch)

Commit aab4f3cfcbf1cc26139b8a64c6ac8775dfae4ded

Green is achieved via the checked-in xfail-ledger (tests/backend_xfail.txt, applied as xfail(strict=False)), so the metric to watch is the shrinking xfail count (burndown), not a raw pass count. A FAIL/ERR is an un-ledgered regression and turns the run red. Because the ledger is non-strict, a now-passing ledgered test is a plain pass here (no per-push XPASS); burndown candidates, in both directions, are surfaced by the scheduled regen run, which rewrites the ledger from real outcomes. deferred is a separate burndown: tests skipped because they are unrunnable under the backend until the port is vectorized (see tests/backend_slow_skip.txt), e.g. the jax spherical-DF sampling/quadrature tests pending the Track F DF migration.
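
The reporting semantics described above can be modelled in a few lines. This sketch mirrors only what the status comment states (it is not pytest's internals and not the repository's conftest); the function names and test ids are hypothetical.

```python
def classify(test_id, outcome, ledger):
    """Map a raw outcome ('pass' or 'fail') to a status-table category."""
    if outcome == "pass":
        return "pass"   # ledgered or not, a pass is reported as a plain pass
    if test_id in ledger:
        return "xfail"  # expected failure: the run stays green
    return "fail"       # un-ledgered regression: the run goes red


def run_is_green(results, ledger):
    """A run is green when no test lands in the 'fail' category."""
    return all(classify(t, o, ledger) != "fail" for t, o in results.items())


# Toy example with hypothetical test ids.
ledger = {"test_known_bad", "test_now_fixed"}
results = {
    "test_known_bad": "fail",   # ledgered failure -> xfail
    "test_now_fixed": "pass",   # ledgered but passing -> plain pass
    "test_ok": "pass",
}
print(run_is_green(results, ledger))
print(run_is_green({**results, "test_new_break": "fail"}, ledger))
```

In this model a stale entry such as `test_now_fixed` never changes the per-push result, which is why the comment defers pruning to the scheduled regen run.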

Overall: jax: 1326 passed · 250 xfail · 730 deferred | torch: 965 passed · 1488 xfail · 1 deferred

Ledger size: 1737 entries (jax=253, torch=1484).

| Test shard | jax | torch |
| --- | --- | --- |
| actionAngle | ✅ 113 pass · 88 xfail | ✅ 33 pass · 168 xfail |
| sphericaldf | ✅ 164 pass · 26 xfail · 28 deferred | ✅ 8 pass · 210 xfail |
| conversion + util + misc | ✅ 86 pass · 5 xfail · 1 deferred | ✅ 46 pass · 46 xfail |
| potential + scf + multipole | ✅ 234 pass · 20 xfail · 5 deferred | ✅ 188 pass · 222 xfail |
| quantity + coords | ✅ 293 pass · 43 xfail | ✅ 197 pass · 139 xfail |
| orbit (energy/Jacobi + from_name) | ✅ 0 pass · 0 xfail · 115 deferred | ✅ 20 pass · 95 xfail |
| orbit + orbits (main) | ✅ 0 pass · 0 xfail · 578 deferred | ✅ 191 pass · 384 xfail |
| evolveddiskdf | ✅ 35 pass · 0 xfail | ✅ 32 pass · 3 xfail |
| jeans + dynamfric | ✅ 17 pass · 2 xfail · 2 deferred | ✅ 13 pass · 7 xfail · 1 deferred |
| qdf + pv2qdf + streamgapdf_impulse + noninertial | ✅ 85 pass · 47 xfail · 1 deferred | ✅ 28 pass · 105 xfail |
| streamgapdf | ✅ 28 pass · 2 xfail | ✅ 27 pass · 3 xfail |
| diskdf | ✅ 129 pass · 0 xfail | ✅ 112 pass · 17 xfail |
| streamdf + streamspraydf + streamTrack | ✅ 142 pass · 17 xfail | ✅ 70 pass · 89 xfail |

Per-shard counts

| Test shard | backend | pass | xfail | deferred | XPASS | fail | error |
| --- | --- | --- | --- | --- | --- | --- | --- |
| actionAngle | jax | 113 | 88 | 0 | 0 | 0 | 0 |
| actionAngle | torch | 33 | 168 | 0 | 0 | 0 | 0 |
| sphericaldf | jax | 164 | 26 | 28 | 0 | 0 | 0 |
| sphericaldf | torch | 8 | 210 | 0 | 0 | 0 | 0 |
| conversion + util + misc | jax | 86 | 5 | 1 | 0 | 0 | 0 |
| conversion + util + misc | torch | 46 | 46 | 0 | 0 | 0 | 0 |
| potential + scf + multipole | jax | 234 | 20 | 5 | 0 | 0 | 0 |
| potential + scf + multipole | torch | 188 | 222 | 0 | 0 | 0 | 0 |
| quantity + coords | jax | 293 | 43 | 0 | 0 | 0 | 0 |
| quantity + coords | torch | 197 | 139 | 0 | 0 | 0 | 0 |
| orbit (energy/Jacobi + from_name) | jax | 0 | 0 | 115 | 0 | 0 | 0 |
| orbit (energy/Jacobi + from_name) | torch | 20 | 95 | 0 | 0 | 0 | 0 |
| orbit + orbits (main) | jax | 0 | 0 | 578 | 0 | 0 | 0 |
| orbit + orbits (main) | torch | 191 | 384 | 0 | 0 | 0 | 0 |
| evolveddiskdf | jax | 35 | 0 | 0 | 0 | 0 | 0 |
| evolveddiskdf | torch | 32 | 3 | 0 | 0 | 0 | 0 |
| jeans + dynamfric | jax | 17 | 2 | 2 | 0 | 0 | 0 |
| jeans + dynamfric | torch | 13 | 7 | 1 | 0 | 0 | 0 |
| qdf + pv2qdf + streamgapdf_impulse + noninertial | jax | 85 | 47 | 1 | 0 | 0 | 0 |
| qdf + pv2qdf + streamgapdf_impulse + noninertial | torch | 28 | 105 | 0 | 0 | 0 | 0 |
| streamgapdf | jax | 28 | 2 | 0 | 0 | 0 | 0 |
| streamgapdf | torch | 27 | 3 | 0 | 0 | 0 | 0 |
| diskdf | jax | 129 | 0 | 0 | 0 | 0 | 0 |
| diskdf | torch | 112 | 17 | 0 | 0 | 0 | 0 |
| streamdf + streamspraydf + streamTrack | jax | 142 | 17 | 0 | 0 | 0 | 0 |
| streamdf + streamspraydf + streamTrack | torch | 70 | 89 | 0 | 0 | 0 | 0 |
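
The "Overall" line can be cross-checked against the per-shard counts by summing each column. A small sketch, with the (pass, xfail, deferred) triples copied from the per-shard counts in shard order; the variable and function names are illustrative only.

```python
# (pass, xfail, deferred) per shard, copied from the per-shard counts.
JAX = [
    (113, 88, 0), (164, 26, 28), (86, 5, 1), (234, 20, 5), (293, 43, 0),
    (0, 0, 115), (0, 0, 578), (35, 0, 0), (17, 2, 2), (85, 47, 1),
    (28, 2, 0), (129, 0, 0), (142, 17, 0),
]
TORCH = [
    (33, 168, 0), (8, 210, 0), (46, 46, 0), (188, 222, 0), (197, 139, 0),
    (20, 95, 0), (191, 384, 0), (32, 3, 0), (13, 7, 1), (28, 105, 0),
    (27, 3, 0), (112, 17, 0), (70, 89, 0),
]


def column_totals(rows):
    """Sum each column across all shards."""
    return tuple(sum(col) for col in zip(*rows))


print("jax  ", column_totals(JAX))
print("torch", column_totals(TORCH))
```

The sums reproduce the overall figures reported above: 1326 / 250 / 730 for jax and 965 / 1488 / 1 for torch.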

@jobovy jobovy changed the title from "all-backend ledger: milestone-1 regen prune (torch 1684→442, jax 335→195)" to "all-backend ledger: milestone-1 regen prune (torch 1684→ 1485, jax 335→ 292)" Jun 17, 2026
@jobovy jobovy merged commit 18e1349 into feat/backends Jun 17, 2026
123 of 135 checks passed
@jobovy jobovy deleted the ledger/milestone-1 branch June 17, 2026 01:06
jobovy added a commit that referenced this pull request Jun 17, 2026
…5→ 292) (#969)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
jobovy added a commit that referenced this pull request Jun 17, 2026
Restores the jax entry for this borderline FD-of-flow STM check that the
#969 milestone-1 regen-prune erroneously dropped: the jax SpiralArms/
potential/scf shard truncated mid-run (issue #53), so this test was never
re-run and its xfail entry was pruned away. Under jax the Multipole
(non-axi) coefficients differ from numpy at the coefficient level, pushing
this ~8e-4 FD-of-flow-vs-C-STM comparison just over the 5e-4 tolerance;
numpy passes (verified), the C STM itself is correct. Tolerance unchanged.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
jobovy added a commit that referenced this pull request Jun 17, 2026
Restores the jax entry for this borderline FD-of-flow STM check that the
#969 milestone-1 regen-prune erroneously dropped (the jax SpiralArms/
potential/scf shard truncated mid-run, issue #53). Under jax the Multipole
(non-axi) coefficients differ from numpy at the coefficient level, pushing
this ~8e-4 FD-of-flow-vs-C-STM comparison just over the 5e-4 tolerance;
numpy passes (verified), the C STM is correct. Not a #972 regression — this
PR only adds C-STM autodiff files. Tolerance unchanged.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
jobovy added a commit that referenced this pull request Jun 19, 2026
…5→ 292) (#969)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
jobovy added a commit that referenced this pull request Jun 19, 2026
…5→ 292) (#969)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
jobovy added a commit that referenced this pull request Jun 25, 2026
…5→ 292) (#969)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>