generated from fastai/nbdev_template
Pull requests: huggingface/trl
[GRPO] Add metrics for low and high clipped token probabilities
#3289
opened Apr 14, 2025 by
lewtun
5 tasks
Modified GRPOTrainer to accumulate gradient within a single training batch
#3288
opened Apr 13, 2025 by
jarrelscy
3 of 5 tasks
Add OpenAI-compatible vLLM server with weight synchronization
#3285
opened Apr 12, 2025 by
BjarniHaukur
Draft
5 tasks
☝️ [GRPO] Generate once per effective batch
#3283
opened Apr 12, 2025 by
qgallouedec
5 tasks
[NOT MEANT TO BE MERGED] Log correct/incorrect lengths
#3263
opened Apr 8, 2025 by
qgallouedec
Draft
[🐯+GRPO] Support FSDP + Fix bug when using LigerGRPO with DDP
#3260
opened Apr 8, 2025 by
shivam15s
1 of 5 tasks
feat(trainer): Support multi-role & consecutive turns in DataCollatorForCompletionOnlyLM (#3223)
#3224
opened Apr 3, 2025 by
Kirili4ik
4 tasks done
Support for Models With Pre-Finetuned LoRA Adapters in GRPO: Add use_peft_as_reference Flag
#3196
opened Mar 31, 2025 by
LoganVegnaSHOP
5 tasks done
GRPO: Scalable training with one LLM/node
#3186
opened Mar 31, 2025 by
jglaser
3 of 5 tasks
🚀 Enhance GRPO VLLM server from sync to async and accelerate training
#3182
opened Mar 30, 2025 by
binary-husky
Co-Locating vLLM w/ training to achieve higher throughput and GPU utilization
#3162
opened Mar 26, 2025 by
toslali-ibm
2 of 5 tasks
Extend BCO Trainer dataset format support
#3134
opened Mar 22, 2025 by
reihig-ut
1 of 5 tasks
Add GRPO/ Online DPO support for quantitative models when use vllm as infer backbone.
#3133
opened Mar 22, 2025 by
maoulee
improvement(utils.py): simplify repeating completion string
#3122
opened Mar 20, 2025 by
tpoisonooo