[WIP] RL goes brrr #533

Merged
edbeeching merged 9 commits into main from rl-goes-brrr
Mar 24, 2025

Conversation

lewtun
Member

@lewtun lewtun commented Mar 21, 2025

Adds support for running GRPO with vLLM in multi-node settings. Needs huggingface/trl#3094 to be merged first.

TODO

  • Add doc
  • Validate we can train multi-node with 7B model
  • Pin trl to new release

@edbeeching edbeeching mentioned this pull request Mar 21, 2025
4 tasks
Member Author

@lewtun lewtun left a comment

Thanks for the fixes! LGTM with some questions about the settings of some args.

Could you also update the README? Running GRPO training now requires spinning up the server on, e.g., 1 device and then running training on the remaining 7.
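
For illustration, the 1 + 7 device split described above could be sketched as follows. This is a minimal sketch, not code from this PR: the helper name `split_devices` and its parameters are hypothetical, and it only builds the comma-separated GPU index strings one would typically pass through the `CUDA_VISIBLE_DEVICES` environment variable to the server process and the training process.

```python
def split_devices(num_gpus: int = 8, num_server_gpus: int = 1) -> tuple[str, str]:
    """Split GPU indices on one node into a server group and a training group.

    Hypothetical helper: returns two comma-separated strings suitable for
    CUDA_VISIBLE_DEVICES, the first for the generation server and the
    second for the training run.
    """
    if not 0 < num_server_gpus < num_gpus:
        raise ValueError("need at least one GPU for the server and one for training")
    ids = [str(i) for i in range(num_gpus)]
    server = ",".join(ids[:num_server_gpus])      # e.g. the first device
    training = ",".join(ids[num_server_gpus:])    # the remaining devices
    return server, training


server_devices, training_devices = split_devices(num_gpus=8, num_server_gpus=1)
print(f"CUDA_VISIBLE_DEVICES={server_devices}  # server process")
print(f"CUDA_VISIBLE_DEVICES={training_devices}  # training process")
```

With the defaults, the server gets device 0 and training gets devices 1 through 7, so the number of training processes should be set to 7 rather than 8.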

@edbeeching edbeeching merged commit 8000dd2 into main Mar 24, 2025
1 check passed
@edbeeching edbeeching deleted the rl-goes-brrr branch March 24, 2025 14:15
@qgallouedec qgallouedec mentioned this pull request Mar 24, 2025
2 participants