The primary purpose of the torchforge ecosystem is to separate infrastructure concerns from model concerns, thereby making RL experimentation easier. torchforge delivers this by providing clear RL abstractions and one scalable implementation of these abstractions. When you need fine-grained control over placement, fault handling and redirecting training loads during a run, or communication patterns, the primitives are there. When you don’t, you can focus purely on your RL algorithm.
Key features:
- Usability for rapid research (isolating the RL loop from infrastructure)
- Hackability for power users (all parts of the RL loop can be easily modified without interacting with infrastructure)
- Scalability (ability to shift between asynchronous and synchronous training and to scale across thousands of GPUs)
⚠️ Early Development Warning: torchforge is currently in an experimental stage. You should expect bugs, incomplete features, and APIs that may change in future versions. The project welcomes bug fixes, but to make sure things are well coordinated you should discuss any significant change before starting the work. It's recommended that you signal your intention to contribute in the issue tracker, either by filing a new issue or by claiming an existing one.
View torchforge's hosted documentation (coming soon)
You can also find our notebook tutorials (coming soon)
torchforge requires PyTorch 2.9.0 with Monarch, vLLM, and torchtitan. (Note that the basic install script uses DNF, but it could easily be extended to other Linux distributions.)
conda create -n forge python=3.12
conda activate forge
./scripts/install.sh
Optional: By default, package installation uses conda. If you want to install system packages on the target machine instead of using conda, pass the --use-sudo flag to the installation script: ./scripts/install.sh --use-sudo
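As a quick sanity check after installation, a sketch like the following can confirm that the active environment has the expected Python and PyTorch versions. The helper name `check_version` is hypothetical and not part of torchforge. It compares version-string prefixes only, on the assumption that local builds may carry suffixes such as `+cu128`, and it assumes `python` resolves to the interpreter inside the `forge` conda environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of torchforge): succeeds when the actual
# version string starts with the expected prefix, fails otherwise.
check_version() {
    local name="$1" expected="$2" actual="$3"
    case "$actual" in
        "$expected"*)
            echo "ok: $name $actual"
            ;;
        *)
            echo "mismatch: $name is '$actual', expected $expected" >&2
            return 1
            ;;
    esac
}

# Query the active environment. Skipped entirely if `python` is not on PATH.
if command -v python >/dev/null 2>&1; then
    py_version="$(python -c 'import platform; print(platform.python_version())' 2>/dev/null || echo unknown)"
    check_version "Python" "3.12" "$py_version" || true

    torch_version="$(python -c 'import torch; print(torch.__version__)' 2>/dev/null || echo 'not installed')"
    check_version "PyTorch" "2.9.0" "$torch_version" || true
fi
```

A mismatch is reported on stderr but does not abort the script, so both checks always run.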
After installation, run the following command. You should see output confirming that GRPO training is running (at least 3 GPUs are required):
python -m apps.grpo.main --config apps/grpo/qwen3_1_7b.yaml
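Because the GRPO example needs at least 3 GPUs, it can help to check the device count before launching. The following is a minimal sketch, not part of torchforge: the function names `count_gpus` and `has_enough_gpus` are hypothetical, and it assumes NVIDIA GPUs with `nvidia-smi` on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of torchforge).
MIN_GPUS=3  # minimum GPU count stated for the GRPO example

# Succeeds when the available GPU count meets the required minimum.
has_enough_gpus() {
    local available="$1" required="$2"
    [ "$available" -ge "$required" ]
}

# Count the NVIDIA GPUs on this machine. `nvidia-smi -L` prints one line
# starting with "GPU " per device. Prints 0 when nvidia-smi is missing or
# fails. Note: this does not account for CUDA_VISIBLE_DEVICES.
count_gpus() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true
    else
        echo 0
    fi
}

gpu_count="$(count_gpus)"
if has_enough_gpus "$gpu_count" "$MIN_GPUS"; then
    echo "GPU check passed: found $gpu_count GPUs."
else
    echo "GPU check failed: found $gpu_count GPUs, need at least $MIN_GPUS." >&2
fi
```

If the check passes, the GRPO command above can be launched as usual.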
To run SFT on a Llama3 8B model, run:
python -m apps.sft.main --config apps/sft/llama3_8b.yaml
Source code is made available under a BSD 3 license. However, you may have other legal obligations that govern your use of other content linked in this repository, such as the license or terms of service for third-party data and models.