It did stabilize a bit at around 39 GB with use_unsloth_gc. There is still the question of why the validation pass uses much more memory than training (both 80 GB cards are nearly full), even with per_device_train_batch_size: 2, gradient_accumulation_steps: 2, per_device_eval_batch_size: 2.
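To narrow down whether the spike really comes from evaluation, a minimal sketch (not part of the original report) is a `TrainerCallback` that logs per-device peak memory after training steps and after each evaluation; the callback class name and the 50-step logging interval below are my own choices, and `MemoryLogCallback` would have to be passed to the trainer yourself:

```python
# Sketch only: log peak GPU memory per process to compare train vs. eval usage.
import torch
from transformers import TrainerCallback


class MemoryLogCallback(TrainerCallback):
    def _log(self, tag):
        dev = torch.cuda.current_device()  # with DDP, one process per GPU
        peak_gb = torch.cuda.max_memory_allocated(dev) / 1024**3
        print(f"[cuda:{dev}] {tag}: peak allocated {peak_gb:.1f} GB")
        torch.cuda.reset_peak_memory_stats(dev)

    def on_step_end(self, args, state, control, **kwargs):
        if state.global_step % 50 == 0:  # assumed logging interval
            self._log(f"train step {state.global_step}")

    def on_evaluate(self, args, state, control, **kwargs):
        self._log("after evaluation")
```

If the peak jumps only at evaluation, a common cause is prediction outputs accumulating on the GPU; in that case the standard HF Trainer options `eval_accumulation_steps` or `prediction_loss_only` may help, assuming they are exposed by your training config.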
Reminder
System Info
environment:
yaml:
GPU memory usage:
One GPU's memory usage is always especially high while the other card only uses about half; after training for a while it OOMs automatically. How can I fix this?
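As a minimal sketch (my assumption, not something from this report), a standalone monitor run next to training can show how the imbalance between the two cards grows over time before the OOM; the 30-second polling interval is arbitrary:

```python
# Sketch only: periodically report used/total memory on every visible GPU.
import time
import torch

def report():
    for i in range(torch.cuda.device_count()):
        free, total = torch.cuda.mem_get_info(i)
        used_gb = (total - free) / 1024**3
        print(f"cuda:{i} used {used_gb:.1f} / {total / 1024**3:.0f} GB")

while True:
    report()
    time.sleep(30)
```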
Reproduction
Others
No response