Closed
Description
Describe the bug
Using `interleave_datasets` with a seed set and multiple DataLoader workers produces the same dataset sampling order in every worker.
Should the seed be modulated with the worker id?
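To illustrate the question, here is a minimal sketch in plain Python. It does not call `datasets`; `sample_source_order` and `per_worker_seed` are hypothetical helpers that stand in for the seeded source-selection step of an interleave, and the `base_seed + worker_id` offset is only one possible way to modulate the seed, not necessarily what the library does or should do.

```python
import random


def sample_source_order(seed, probabilities, num_draws):
    """Hypothetical stand-in for seeded source selection when interleaving.

    Returns the sequence of source indices that would be drawn from.
    """
    rng = random.Random(seed)
    sources = range(len(probabilities))
    return rng.choices(sources, weights=probabilities, k=num_draws)


def per_worker_seed(base_seed, worker_id):
    # Assumption: a simple offset is enough to decorrelate workers.
    return base_seed + worker_id


base_seed = 42
num_workers = 4
probabilities = [0.5, 0.3, 0.2]
num_draws = 50

# Reported behavior: every worker reuses the same seed, so the orders match.
same_seed_orders = [
    sample_source_order(base_seed, probabilities, num_draws)
    for _ in range(num_workers)
]

# Suggested behavior: each worker derives its own seed from its worker id.
offset_seed_orders = [
    sample_source_order(per_worker_seed(base_seed, w), probabilities, num_draws)
    for w in range(num_workers)
]

print(all(order == same_seed_orders[0] for order in same_seed_orders))
print(len({tuple(order) for order in offset_seed_orders}))
```

In a real PyTorch worker process the id would typically come from `torch.utils.data.get_worker_info().id`; how that should be wired into `interleave_datasets` is left open here.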
Steps to reproduce the bug
See above
Expected behavior
See above
Environment info
- `datasets` version: 3.5.1
- Platform: macOS-15.4.1-arm64-arm-64bit
- Python version: 3.12.9
- `huggingface_hub` version: 0.30.2
- PyArrow version: 19.0.1
- Pandas version: 2.2.3
- `fsspec` version: 2024.12.0