Commit 56cda99

Add Slurm with DLAMI nccl tests
1 parent 27b7d49 commit 56cda99

1 file changed: +13 -1 lines changed

micro-benchmarks/nccl-tests/README.md

Lines changed: 13 additions & 1 deletion
@@ -113,14 +113,26 @@ To run the NCCL tests on EKS, you will need to build the container image, then p
 ## 2. Running the NCCL Tests

-### Slurm
+### Slurm with container

 Copy the file `slurm/nccl-tests.sbatch` or its content to your cluster, then submit the job with the command below:

 ```bash
 sbatch nccl-tests.sbatch
 ```

+### Slurm with Deep Learning AMI
+
+In this step, you can use the NCCL tests already compiled in the Deep Learning AMI.
+
+Copy the file `slurm/nccl-tests-deep-learning-ami.sbatch` or its content to your cluster, then submit the job with the command below:
+
+```bash
+sbatch nccl-tests-deep-learning-ami.sbatch
+```
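
The submission step for the Deep Learning AMI variant can be sketched as a small helper. This is a minimal sketch, not part of the commit: it assumes the script was copied into the current directory under its repository name, and that the script does not override Slurm's default log file (`slurm-<jobid>.out`). The function name `submit_nccl_test` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: submit an sbatch script and print only the job ID.
# `sbatch --parsable` prints the job ID, optionally followed by ";<cluster>".
submit_nccl_test() {
    local script="${1:-nccl-tests-deep-learning-ami.sbatch}"
    local out
    out="$(sbatch --parsable "$script")" || return 1
    # Drop the ";<cluster>" suffix that appears on multi-cluster setups.
    printf '%s\n' "${out%%;*}"
}

# Usage (assumes the script keeps Slurm's default log name):
#   job_id="$(submit_nccl_test)"
#   squeue -j "$job_id"             # check the job state
#   tail -f "slurm-${job_id}.out"   # follow the test output
```

Capturing the job ID at submission time avoids guessing which log file belongs to which run when several tests are queued.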
+
+### Results
+
 The all_reduce performance test runs from 8 B to 2 GB on 2x p4de.24xlarge; the output should look like the example below (with a lot more information).
 ```txt
 0: # size count type redop root time algbw busbw #wrong time algbw busbw #wrong
