
PT5_LoRA_Finetuning_per_prot.ipynb - memory accumulation during validation #153

@Fredditeddy

Description


Hi all,

I am currently experimenting with the code you provided. Your plot of memory usage for different batch sizes and max_length values matches our training setup perfectly. However, when monitoring memory usage, two things stand out:

  1. Memory does not seem to be freed after training.
  2. Memory seems to accumulate during validation.

I could not find a solution for issue 1.
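
For context, the standard PyTorch cleanup pattern would be something like the minimal sketch below (the tensor is just a stand-in; after training one would `del` the model and Trainer objects analogously). Note that `empty_cache()` can only release memory that no Python object still references, so a lingering reference would keep the allocation alive:

```python
import gc
import torch

# Stand-in for the fine-tuned model / Trainer objects (~1 GiB of float32).
big = torch.empty(1024, 1024, 256, device="cuda")
print(f"allocated: {torch.cuda.memory_allocated() / 2**30:.2f} GiB")

del big                    # drop the last Python reference
gc.collect()               # collect any reference cycles
torch.cuda.empty_cache()   # return cached blocks to the CUDA driver

print(f"allocated: {torch.cuda.memory_allocated() / 2**30:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 2**30:.2f} GiB")
```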

For issue 2, setting eval_accumulation_steps seems to work; it transfers the model outputs to the CPU instead of keeping them all on the GPU.
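
In the Hugging Face Trainer this is just a TrainingArguments field; without it, the eval outputs of every batch stay on the GPU until the whole validation pass finishes, which matches the accumulation I am seeing. A minimal sketch (all other values are placeholders, not the notebook's actual settings):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./pt5_lora",       # hypothetical output path
    per_device_eval_batch_size=4,  # placeholder
    eval_accumulation_steps=8,     # offload eval outputs to CPU every 8 steps
)
```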

Do you have any ideas?

Keep up the great work.

Best wishes,
Frederik
