Hey, thank you very much for sharing this great work.
I want to ask a question about GPU memory usage when full fine-tuning Llama-7B.
Normally I run into OOM issues when tuning 7B models with a batch size of 2 or 4.
Here is my A100 GPU memory status when training LLaMA-2 with LongLoRA at batch size 1:
[0] NVIDIA A100-SXM4-40GB | 40°C, 70 % | 6999 / 40960 MB |
[1] NVIDIA A100-SXM4-40GB | 39°C, 71 % | 7085 / 40960 MB |
[2] NVIDIA A100-SXM4-40GB | 38°C, 68 % | 7317 / 40960 MB |
[3] NVIDIA A100-SXM4-40GB | 40°C, 70 % | 7293 / 40960 MB |
[4] NVIDIA A100-SXM4-40GB | 40°C, 69 % | 7321 / 40960 MB |
[5] NVIDIA A100-SXM4-40GB | 37°C, 64 % | 7291 / 40960 MB |
[6] NVIDIA A100-SXM4-40GB | 36°C, 65 % | 7435 / 40960 MB |
[7] NVIDIA A100-SXM4-40GB | 39°C, 67 % | 7015 / 40960 MB |
The memory usage turns out to be surprisingly low, even given that the batch size is 1.
The max length was set to 4096, but my data doesn't reach that length.
I also set low_rank_training to False when passing the parameters.
Do you think the memory usage is too low, or is what I'm seeing normal?
Was any special technique applied for memory efficiency?
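In case it helps clarify what I'm measuring: nvidia-smi reports what the CUDA caching allocator has reserved, which can differ from what tensors actually occupy. Below is a minimal sketch of how the real allocations could be logged from inside the training process, using plain PyTorch CUDA memory APIs; where exactly to call it in the LongLoRA training loop is just my assumption, not part of the repo's scripts.

```python
import torch

def log_cuda_memory(tag: str = "") -> None:
    """Print allocator stats for every visible GPU.

    nvidia-smi shows memory reserved by the caching allocator,
    which can be larger than what live tensors actually use.
    """
    for device in range(torch.cuda.device_count()):
        allocated = torch.cuda.memory_allocated(device) / 2**20      # live tensors (MiB)
        reserved = torch.cuda.memory_reserved(device) / 2**20        # allocator pool (MiB)
        peak = torch.cuda.max_memory_allocated(device) / 2**20       # high-water mark (MiB)
        print(f"[{tag}] cuda:{device} "
              f"allocated={allocated:.0f} MiB "
              f"reserved={reserved:.0f} MiB "
              f"peak={peak:.0f} MiB")

# Hypothetical usage: call once right after an optimizer step, e.g.
#   loss.backward(); optimizer.step(); log_cuda_memory("after step")
```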