DeepSpeed's ALST/Ulysses sequence parallelism

This example demonstrates Ulysses Sequence Parallelism, which parallelizes attention across attention heads and is part of the Arctic Long Sequence Training (ALST) project at ArcticTraining. The ALST paper goes into the details of this protocol.
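To make "attention head parallelism" concrete, here is a minimal single-process sketch of the layout switch that Ulysses-style sequence parallelism relies on: each rank starts with a slice of the sequence and all heads, an all-to-all gives each rank the full sequence for a subset of heads, and a second all-to-all restores the original layout after attention. This is a conceptual simulation in plain Python, not DeepSpeed's or ArcticTraining's implementation. The function names are hypothetical, and the sketch assumes the sequence length and the number of heads are both divisible by the sequence parallelism degree.

```python
# Single-process simulation of the Ulysses layout switch (no GPUs, no DeepSpeed).
# Each (token, head) pair stands in for that token's per-head activations.
# All function names here are hypothetical and exist only for illustration.

def shard_by_sequence(seq_len, num_heads, sp_size):
    """Each rank starts with a contiguous slice of tokens and all heads."""
    chunk = seq_len // sp_size
    return [
        [[(t, h) for h in range(num_heads)]
         for t in range(r * chunk, (r + 1) * chunk)]
        for r in range(sp_size)
    ]

def all_to_all_seq_to_heads(seq_shards, num_heads):
    """Simulated all-to-all: each rank ends up with every token but only its own heads."""
    sp_size = len(seq_shards)
    heads_per_rank = num_heads // sp_size
    head_shards = []
    for r in range(sp_size):
        lo, hi = r * heads_per_rank, (r + 1) * heads_per_rank
        # Rank r collects its head slice from every rank's token chunk, in rank order.
        head_shards.append([tok[lo:hi] for shard in seq_shards for tok in shard])
    return head_shards

def all_to_all_heads_to_seq(head_shards, seq_len):
    """Inverse all-to-all: back to a slice of tokens with all heads on each rank."""
    sp_size = len(head_shards)
    chunk = seq_len // sp_size
    seq_shards = []
    for r in range(sp_size):
        rows = []
        for t in range(r * chunk, (r + 1) * chunk):
            row = []
            for shard in head_shards:
                row.extend(shard[t])  # concatenate head groups in rank order
            rows.append(row)
        seq_shards.append(rows)
    return seq_shards

seq_len, num_heads, sp_size = 8, 4, 4
seq_shards = shard_by_sequence(seq_len, num_heads, sp_size)
head_shards = all_to_all_seq_to_heads(seq_shards, num_heads)
restored = all_to_all_heads_to_seq(head_shards, seq_len)
```

With `sp_size=4`, every rank initially holds 2 of the 8 tokens with all 4 heads. After the first switch, every rank holds all 8 tokens but only 1 head, so attention over the full sequence can run locally per head.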

For usage nuances, please refer to the main HF Accelerate tutorial on Context Parallelism.

You need at least 2 GPUs to enable ALST/Ulysses sequence parallelism.

To run the example with 4 GPUs:

bash ./sp-alst.sh

Change 4 to the desired sequence parallelism degree in these 2 files:

sp-alst.accelerate-config.yml:num_processes: 4
sp-alst.py:    sp_size=4,
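Since the same degree has to be changed in both files, a small helper can keep them in sync. The following is only a sketch: the helper names are hypothetical, and it assumes each file contains exactly the `num_processes: 4` and `sp_size=4,` patterns shown above. Review the result before running the example.

```python
import re
from pathlib import Path

# Hypothetical helpers (not part of this example's repository) that rewrite
# the sequence parallelism degree in the two files listed above.

def set_num_processes(yaml_text, degree):
    """Replace the integer after 'num_processes:' in the Accelerate config text."""
    return re.sub(r"^(\s*num_processes:\s*)\d+", rf"\g<1>{degree}",
                  yaml_text, flags=re.MULTILINE)

def set_sp_size(py_text, degree):
    """Replace the integer after 'sp_size=' in the training script text."""
    return re.sub(r"(\bsp_size\s*=\s*)\d+", rf"\g<1>{degree}", py_text)

def update_files(degree, directory="."):
    """Apply both replacements in place; requires at least 2 GPUs, as noted above."""
    if degree < 2:
        raise ValueError("ALST/Ulysses sequence parallelism needs at least 2 GPUs")
    config = Path(directory) / "sp-alst.accelerate-config.yml"
    script = Path(directory) / "sp-alst.py"
    config.write_text(set_num_processes(config.read_text(), degree))
    script.write_text(set_sp_size(script.read_text(), degree))

# Example on in-memory strings, so nothing on disk is touched:
new_yaml = set_num_processes("num_processes: 4\n", 8)
new_py = set_sp_size("    sp_size=4,\n", 8)
```

Calling `update_files(8)` from the example directory would then switch both files to a sequence parallelism degree of 8.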