Using same random values on all MPI tasks in test_statistics.py
#2090
Conversation
Thank you for the PR!
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##            main   #2090   +/- ##
=======================================
  Coverage   91.68%  91.68%
=======================================
  Files          89      89
  Lines       13945   13945
=======================================
  Hits        12786   12786
  Misses       1159    1159

View full report in Codecov by Sentry.
ClaudiaComito left a comment:
Thanks for this, @brownbaerchen.
batchparallel was introduced last year (@mrfh92 🙏) because Threefry was/is quite slow and in most cases not necessary. But I agree that the documentation, and potentially the API, need to make the difference more obvious.
Regarding the tests, one also needs to be aware of line 22 in

The rationale behind "batchparallel" was: threefry is very cool, but it has two main limitations:

The batchparallel option is also closer to Heat's rationale of taking process-local operations from Torch whenever possible. The only disadvantage: the actual random numbers depend on the number of processes, not only on the seed.
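To make that disadvantage concrete, here is a minimal conceptual sketch (not Heat's actual implementation) of what batch-parallel seeding amounts to: each rank seeds a process-local Torch generator with seed + rank, so the local streams differ across ranks and the concatenated global sample changes when the number of participating processes changes.

```python
# Conceptual sketch of batch-parallel seeding; illustrative only,
# not Heat's actual code.
from mpi4py import MPI
import torch

comm = MPI.COMM_WORLD
seed = 12345  # hypothetical global seed

# Same global seed everywhere, offset by the rank: local streams differ.
gen = torch.Generator()
gen.manual_seed(seed + comm.rank)

# Each rank draws its local chunk of the global array. The concatenation
# of these chunks, i.e. the "global" random array, therefore depends on
# how many ranks participate, not only on `seed`.
local_chunk = torch.rand(4, generator=gen)
```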
Indeed, setting a seed behaves differently for the two options:

In the current implementation, full reproducibility should be given for threefry, and partial reproducibility (under the same seed and the same number of processes/larray shapes) for batchparallel. In my opinion, the larger problem with randomized tests is that sometimes we set seeds and sometimes we do not; additionally, due to the recent change mentioned above, the test class now sets a seed as well.
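One way to take that ordering dependence out of the tests is to re-seed unconditionally in the test fixture, so every test starts from the same state regardless of what ran before. A minimal sketch, using only ht.random.seed with an arbitrary fixed seed (the class and test names here are made up):

```python
import unittest
import heat as ht

class TestStatistics(unittest.TestCase):
    def setUp(self):
        # Re-seed before every test, so results do not depend on which
        # earlier tests (if any) consumed random numbers.
        ht.random.seed(42)  # arbitrary fixed seed

    def test_mean_of_uniform(self):
        x = ht.random.rand(1000)
        # With a fixed seed, the drawn sample is deterministic for a
        # given generator mode and process count.
        self.assertAlmostEqual(float(ht.mean(x)), 0.5, places=1)
```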
I am not disagreeing with having both Threefry and batchparallel. In fact, I don't know anything about generating random numbers or these methods in particular.
I think the team has settled on the formulation "for historical reasons" 😝 These must be some of the first tests we implemented, even before the ht.random module existed. A nice refactoring was long overdue. Thanks @brownbaerchen!
Using same random values on all MPI tasks in test_statistics.py (#2090)

* Using same random values on all MPI tasks in test_statistics.py
* Add more random seeds to statistics test
* Seeding the heat random generator properly in statistics tests

Co-authored-by: Claudia Comito <[email protected]>
(cherry picked from commit 9e1eaea)
Successfully created backport PR for |
Due Diligence
Description
Running the file test_statistics.py from any directory had two issues: the paths to the datasets were incorrect, and there was some funny business with the random seed, where, depending on how you run the tests, the random seeds are set up differently. I finally figured out why.
heat.random has a global variable __rng which determines the behavior of the random generator on different MPI tasks. Specifically, when it is set to Threefry, heat.random.seed gives the same random seed on all ranks, whereas with batchparallel you get the seed plus the rank. So if you want to be sure the code is doing what you want, irrespective of what has been run before, you need to set the generator state explicitly rather than just calling heat.random.seed.
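As a hedged sketch of what the explicit setup can look like: heat.random's get_state/set_state mirror NumPy's legacy RandomState interface, and assuming a state tuple of the form ("Threefry", seed, offset), the generator type and seed can be pinned in one call (the seed value below is a placeholder):

```python
import heat as ht

# Pin the generator explicitly (the state-tuple layout is an assumption):
# this fixes both the RNG mode and the seed, independent of whatever
# previously-run code selected.
ht.random.set_state(("Threefry", 12345, 0))

# ...rather than only calling seed, which keeps the current __rng mode:
ht.random.seed(12345)
```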
I have to say I am not super happy with this, because it can clearly lead to confusion. The names of the allowed states of the __rng variable are not helpful in understanding the parallel behavior. I understand that both modes are needed, but maybe the API should be overhauled. If whoever reviews this agrees, feel free to open an issue or ping me to do so.

Issue/s resolved: #2069
Changes proposed:
Type of change