
Improved memory management. #5450

Open

wants to merge 6 commits into master
Conversation

comfyanonymous
Owner

These changes make the memory management less fragile (much less chance of custom nodes/extensions/future code changes breaking it) and should remove the noticeable delay when changing workflows with large models.

The reason I'm making a PR with these changes is so people can test it and make sure there are no obvious bugs before I merge it.

@asagi4
Contributor

asagi4 commented Nov 3, 2024

I've been testing this and it seems to work mostly fine.

However, my prompt control nodes seem to have some problems with LoRA switching since patch_model() in ModelPatcher doesn't appear to modify the model weights anymore. I fixed that by explicitly executing load_models_gpu() after doing LoRA swapping, but it's kind of slow.
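Roughly, the workaround looks like this (just a sketch, not the actual code from my nodes; the function and variable names are made up, and `model` stands in for whatever ModelPatcher the workflow is using):

```python
import comfy.model_management

# Rough sketch of the workaround; names are made up, `model` stands in for
# the ModelPatcher being sampled with.
def swap_loras(model, lora_patches, strength=1.0):
    # add_patches() only records the patches on the ModelPatcher,
    # it doesn't touch the actual weights
    model.add_patches(lora_patches, strength)
    # with this PR, calling patch_model() alone no longer updated the weights
    # for me, so explicitly (re)load the model to force the patches to apply
    comfy.model_management.load_models_gpu([model])
```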

Should LoadedModel.model_load pass force_patch_weights to model_use_more_vram? It's currently just ignoring the parameter apparently.
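Something like this is what I mean (just a sketch; I haven't checked the actual signatures in this branch, so the arguments below are assumptions):

```python
# Sketch of the idea only; the signatures here are assumptions, not the
# actual code from this branch.
class LoadedModel:
    def model_load(self, lowvram_model_memory=0, force_patch_weights=False):
        # ... existing loading logic ...
        # forward force_patch_weights instead of dropping it, so callers that
        # ask for the weights to be re-patched actually get that behaviour
        return self.model_use_more_vram(
            lowvram_model_memory, force_patch_weights=force_patch_weights
        )

    def model_use_more_vram(self, extra_memory, force_patch_weights=False):
        # would then need to pass this through to the underlying model load
        ...
```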

I also have another problem where some model reference becomes None if I switch models, but I haven't figured out why, or whether that's also a bug in one of the nodes I use. I'll try to see if I can actually reproduce that problem with a simpler workflow.

It's likely these are just bugs in my custom nodes, but I thought I'd let you know anyway.

@comfyanonymous
Owner Author

> Should LoadedModel.model_load pass force_patch_weights to model_use_more_vram? It's currently just ignoring the parameter apparently.

Good catch.

> However, my prompt control nodes seem to have some problems with LoRA switching since patch_model() in ModelPatcher doesn't appear to modify the model weights anymore. I fixed that by explicitly executing load_models_gpu() after doing LoRA swapping, but it's kind of slow.

The code of the patch_model function hasn't changed; how are you using it?

@asagi4
Contributor

asagi4 commented Nov 4, 2024

> The code of the patch_model function hasn't changed; how are you using it?

I install a monkey patch that hijacks the callback during sampling to add and remove LoRA patches, and then calls patch_model to update the weights in memory before the next step. For whatever reason that stopped working with this PR's changes until I forced the model to be loaded onto the GPU; I'm not sure what exactly is going wrong with it.
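In very rough terms the hack looks like this (heavily simplified sketch; the names are made up and the real hook is wired up differently in my nodes):

```python
# Heavily simplified sketch of the hack; names are made up and the real hook
# is installed elsewhere in the prompt control nodes.
def wrap_sampling_callback(model, original_callback, lora_schedule):
    # `model` is the ModelPatcher being sampled with; `lora_schedule` maps a
    # step index to the LoRA patch dict that should be active from that step on
    def callback(step, x0, x, total_steps):
        patches = lora_schedule.get(step + 1)
        if patches is not None:
            model.unpatch_model()       # drop the currently applied patches
            model.add_patches(patches)  # register patches for the next steps
            model.patch_model()         # used to re-apply the weights in place
        if original_callback is not None:
            original_callback(step, x0, x, total_steps)
    return callback
```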

The whole thing is honestly a huge pile of hacks so it's entirely possible it worked merely by accident before and this change is just exposing some bugs.

@comfyanonymous
Owner Author

This will be merged in #5583, so please go test that one.
