Hey, thank you very much for sharing this great work.
I want to ask a question about GPU memory usage when full fine-tuning Llama-7B.
Normally I run into OOM issues when tuning 7B models with a batch size of 2 or 4.
Here is my A100 GPU memory status when training LLaMA-2 with LongLoRA at batch size 1:
[0] NVIDIA A100-SXM4-40GB | 40°C, 70 % | 6999 / 40960 MB |
[1] NVIDIA A100-SXM4-40GB | 39°C, 71 % | 7085 / 40960 MB |
[2] NVIDIA A100-SXM4-40GB | 38°C, 68 % | 7317 / 40960 MB |
[3] NVIDIA A100-SXM4-40GB | 40°C, 70 % | 7293 / 40960 MB |
[4] NVIDIA A100-SXM4-40GB | 40°C, 69 % | 7321 / 40960 MB |
[5] NVIDIA A100-SXM4-40GB | 37°C, 64 % | 7291 / 40960 MB |
[6] NVIDIA A100-SXM4-40GB | 36°C, 65 % | 7435 / 40960 MB |
[7] NVIDIA A100-SXM4-40GB | 39°C, 67 % | 7015 / 40960 MB |
The memory usage turns out to be surprisingly low, even given that the batch size is 1.
The max length was set to 4096, but my data doesn't reach that length.
I also set low_rank_training to False when passing the parameters.
Do you think the memory usage is too low, or is what I'm seeing normal?
Was any special technique applied for memory efficiency?
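In case it helps clarify what I'm measuring: nvidia-smi reports what the CUDA caching allocator has reserved, which can differ from what tensors actually occupy. Below is a minimal sketch of how the real allocations could be logged from inside the training process, using plain PyTorch CUDA memory APIs; where exactly to call it in the LongLoRA training loop is just my assumption, not part of the repo's scripts.

```python
import torch

def log_cuda_memory(tag: str = "") -> None:
    """Print allocator stats for every visible GPU.

    nvidia-smi shows memory reserved by the caching allocator,
    which can be larger than what live tensors actually use.
    """
    for device in range(torch.cuda.device_count()):
        allocated = torch.cuda.memory_allocated(device) / 2**20      # live tensors (MiB)
        reserved = torch.cuda.memory_reserved(device) / 2**20        # allocator pool (MiB)
        peak = torch.cuda.max_memory_allocated(device) / 2**20       # high-water mark (MiB)
        print(f"[{tag}] cuda:{device} "
              f"allocated={allocated:.0f} MiB "
              f"reserved={reserved:.0f} MiB "
              f"peak={peak:.0f} MiB")

# Hypothetical usage: call once right after an optimizer step, e.g.
#   loss.backward(); optimizer.step(); log_cuda_memory("after step")
```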