[WIP] RL goes brrr by lewtun · Pull Request #533 · huggingface/open-r1

lewtun · 2025-03-21T12:43:08Z

Adds support for running GRPO with vLLM in multi-node settings. Needs huggingface/trl#3094 to be merged first.

TODO

Add doc
Validate we can train multi-node with 7B model
Pin trl to new release

recipes/DeepSeek-R1-Distill-Qwen-1.5B/grpo/config_demo.yaml

slurm/train.slurm

lewtun

Thanks for the fixes! LGTM with some questions about the settings of some args.

Could you also update the README? Running the GRPO training now requires spinning up the server on e.g. 1 device and then running training on the remaining 7.
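
For the README, the single-node split described above could be sketched roughly as follows. This is only an illustration under assumptions: an 8-GPU node, one GPU reserved for the server, and a hypothetical helper `range_csv` that is not part of open-r1 or TRL. The launch commands appear only as comments, since their exact flags depend on the TRL release that includes huggingface/trl#3094.

```shell
# Sketch only: split the GPUs on one node between the generation server and training.
# NUM_GPUS and NUM_SERVER_GPUS are assumed values; adjust them to your node.

range_csv() {
  # Hypothetical helper: print the integers from $1 to $2 (inclusive) as a comma-separated list.
  i=$1
  out=""
  while [ "$i" -le "$2" ]; do
    out="${out:+$out,}$i"
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

NUM_GPUS=8
NUM_SERVER_GPUS=1

SERVER_DEVICES=$(range_csv 0 $((NUM_SERVER_GPUS - 1)))
TRAIN_DEVICES=$(range_csv "$NUM_SERVER_GPUS" $((NUM_GPUS - 1)))
NUM_TRAIN_PROCS=$((NUM_GPUS - NUM_SERVER_GPUS))

echo "server: CUDA_VISIBLE_DEVICES=$SERVER_DEVICES"
echo "train:  CUDA_VISIBLE_DEVICES=$TRAIN_DEVICES (processes: $NUM_TRAIN_PROCS)"

# Assumed usage, to be checked against the TRL and open-r1 docs:
#   CUDA_VISIBLE_DEVICES=$SERVER_DEVICES trl vllm-serve --model <model>
#   CUDA_VISIBLE_DEVICES=$TRAIN_DEVICES accelerate launch --num_processes $NUM_TRAIN_PROCS <training script and config>
```

With the assumed values, the server gets device 0 and training gets devices 1 to 7, matching the 1 + 7 split mentioned above.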

slurm/train.slurm

lewtun added 3 commits March 21, 2025 08:57

Fix vLLM recipes

3cf1ded

Add vllm server to Slurm

5e8d2e2

Add overlap across srun

16d7ab9

lewtun commented Mar 21, 2025

recipes/DeepSeek-R1-Distill-Qwen-1.5B/grpo/config_demo.yaml

lewtun commented Mar 21, 2025

slurm/train.slurm (outdated)

edbeeching mentioned this pull request Mar 21, 2025

WIP "Faster" grpo trainer #371

Closed

4 tasks

edbeeching reviewed Mar 21, 2025

slurm/train.slurm (outdated)

lewtun and others added 3 commits March 21, 2025 14:41

Fix NUM_NODES

7e6129f

Refactor TP to script

a531399

fix train script to work with new GRPO

27b956c

lewtun commented Mar 24, 2025

edbeeching added 3 commits March 24, 2025 13:07

lewis nits

55e81c2

Merge branch 'main' into rl-goes-brrr

910af99

bump trl, transformers

97c9bd2

edbeeching merged commit 8000dd2 into main Mar 24, 2025
1 check passed

edbeeching deleted the rl-goes-brrr branch March 24, 2025 14:15

qgallouedec mentioned this pull request Mar 24, 2025

trl version mismatch #541

Closed

lewtun mentioned this pull request Mar 27, 2025

Restore single-node instructions to run GRPO #549

Merged

edbeeching merged 9 commits into main from rl-goes-brrr

2 participants
