[WIP] RL goes brrr #533

Merged
edbeeching merged 9 commits into main from rl-goes-brrr on Mar 24, 2025
@lewtun (Member) commented Mar 21, 2025

Adds support for running GRPO with vLLM in multi-node settings. Needs huggingface/trl#3094 to be merged first.

TODO

  • Add doc
  • Validate we can train multi-node with 7B model
  • Pin trl to new release

@edbeeching edbeeching mentioned this pull request Mar 21, 2025
4 tasks

@lewtun (Member, Author) left a comment

Thanks for the fixes! LGTM with some questions about the settings of some args.

Could you also update the README? Running GRPO training now requires spinning up the server on, e.g., 1 device and then running training on the remaining 7.
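
To make the 1 + 7 device split concrete, here is a minimal Python sketch that only builds the two command lines; it does not launch anything. It assumes the server is started with `trl vllm-serve --model ...` and training with `accelerate launch --num_processes ...`. The model name `my-org/my-model`, the script name `train_grpo.py`, and the helper functions are hypothetical placeholders, not taken from this PR.

```python
import shlex


def plan_device_split(total_gpus=8, server_gpus=1):
    """Split GPU indices into (server devices, training devices)."""
    if not 0 < server_gpus < total_gpus:
        raise ValueError("need at least one GPU for the server and one for training")
    server = list(range(server_gpus))
    trainers = list(range(server_gpus, total_gpus))
    return server, trainers


def build_commands(model, train_script, total_gpus=8, server_gpus=1):
    """Return ((env, argv) for the server, (env, argv) for training)."""
    server, trainers = plan_device_split(total_gpus, server_gpus)
    server_env = {"CUDA_VISIBLE_DEVICES": ",".join(map(str, server))}
    train_env = {"CUDA_VISIBLE_DEVICES": ",".join(map(str, trainers))}
    # Assumed server entry point: `trl vllm-serve --model <model>`.
    server_cmd = ["trl", "vllm-serve", "--model", model]
    # One training process per remaining GPU; train_script is a placeholder.
    train_cmd = ["accelerate", "launch", "--num_processes", str(len(trainers)), train_script]
    return (server_env, server_cmd), (train_env, train_cmd)


def render(env, cmd):
    """Format an (env, argv) pair as a single shell line."""
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    return f"{prefix} {shlex.join(cmd)}"


if __name__ == "__main__":
    server, training = build_commands("my-org/my-model", "train_grpo.py")
    print(render(*server))    # run first, in its own terminal
    print(render(*training))  # run once the server is up
```

Changing `server_gpus` shifts the split, for example 2 devices for the server and 6 for training on an 8-GPU node.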

@edbeeching edbeeching merged commit 8000dd2 into main Mar 24, 2025
1 check passed
@edbeeching edbeeching deleted the rl-goes-brrr branch March 24, 2025 14:15
@qgallouedec qgallouedec mentioned this pull request Mar 24, 2025
2 participants