Fix HLO overlap tests adding proper decoding of while loops and conditional branches #10

rfbr · 2025-10-13T16:14:14Z

Fixes failing test_flash_bwd_sharded_hlo tests (when local=False/ring attention) by correcting the HLO decoder to properly traverse while loop bodies and conditional branches.

The issue

Running pytest tests/test_sharding.py fails for test_flash_bwd_sharded_hlo when local=False (ring attention).
The decode_hlo function was incomplete: it followed only calls=, so it could not see operations inside JAX's scan loops or conditional branches. It missed the body= and condition= attributes used by while loops and the branch_computations={...} attribute used by conditionals. As a result, the decoder output was just collective-permute-start collective-permute-done, with none of the custom-call operations inside the loop. The tests therefore failed, reporting no communication/computation overlap even though overlap was working correctly.
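To illustrate the traversal problem, here is a minimal sketch of following computation references in HLO text. It is not the decode_hlo implementation from this PR: the helper names referenced_computations and flatten, and the dict-of-lines input format, are assumptions made for the example.

```python
import re

# Attributes whose value is a single computation name (optionally %-prefixed).
_SINGLE_REF = re.compile(r"\b(calls|body|condition)=(%?[\w.\-]+)")
# Attribute whose value is a brace-delimited list of computation names.
_BRANCH_REF = re.compile(r"\bbranch_computations=\{([^}]*)\}")


def referenced_computations(instruction_line):
    """Return the computation names an HLO instruction line refers to."""
    names = [m.group(2) for m in _SINGLE_REF.finditer(instruction_line)]
    for m in _BRANCH_REF.finditer(instruction_line):
        names.extend(n.strip() for n in m.group(1).split(",") if n.strip())
    return [n.lstrip("%") for n in names]


def flatten(computations, entry):
    """Hypothetical helper: list instruction lines depth-first, inlining
    every referenced computation right after the line that refers to it.

    `computations` maps a computation name to its list of instruction lines
    (an assumed input format, not the one used in the repository).
    """
    out = []
    for line in computations[entry]:
        out.append(line)
        for name in referenced_computations(line):
            if name in computations:
                out.extend(flatten(computations, name))
    return out
```

With only calls= in the first pattern, a while instruction would contribute nothing beyond its own line, which matches the symptom described above: the loop body's custom-call operations never show up in the decoded sequence.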

What's in this commit

Improved the decode_hlo function to also follow body=, condition= and branch_computations=.
Implemented a count_overlapped_permutes function that counts overlapping and non-overlapping (adjacent) permutes.
Updated the test assertions for ring attention: the forward pass should have N-1 overlapped rotations and 0 adjacent ones; the backward pass should have N overlapped rotations plus 1 adjacent one for the final gradient return.
Constrained the sharding tests to 2 devices to avoid a GQA incompatibility (with GQA or MQA, head dimension sharding requires the number of devices to divide evenly into the group size).
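The overlapped/adjacent distinction used in the assertions can be sketched as follows. This is an illustration only, not the count_overlapped_permutes function added in this PR: the function name count_permute_overlap, its input (a flat list of opcode strings), and the simplifying assumption that at most one permute is in flight at a time are all hypothetical.

```python
def count_permute_overlap(ops):
    """Count collective-permute start/done pairs by whether compute ran between them.

    Returns (overlapped, adjacent): a pair is "overlapped" if at least one
    custom-call appears between its start and done, "adjacent" otherwise.
    Assumes at most one permute is in flight at a time.
    """
    overlapped = 0
    adjacent = 0
    compute_seen = None  # None while no permute is in flight
    for op in ops:
        if op == "collective-permute-start":
            compute_seen = False
        elif op == "collective-permute-done":
            if compute_seen is None:
                continue  # done without a matching start; ignore it
            if compute_seen:
                overlapped += 1
            else:
                adjacent += 1
            compute_seen = None
        elif op == "custom-call" and compute_seen is not None:
            compute_seen = True
    return overlapped, adjacent


def expected_counts(num_devices, backward):
    """Expected (overlapped, adjacent) counts stated in this PR for ring attention."""
    if backward:
        return num_devices, 1
    return num_devices - 1, 0
```

For the 2-device configuration the tests are constrained to, expected_counts gives (1, 0) for the forward pass and (2, 1) for the backward pass.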

Commit 5c23c86: Fix HLO overlap tests adding proper decoding of while loops and conditional branches
