PagedLion8bit crashes NVIDIA GPU when saving state? #1986
ChiNoel-osu
started this conversation in
General
Replies: 0 comments
This is probably specific to my setup, since I can't find any information about this issue anywhere. I'm starting a discussion anyway so that there is at least some record of it on the internet.
I have a 4060 laptop GPU (8 GiB VRAM) in MSHybrid mode (the dGPU does the rendering and the iGPU drives the display), with Studio driver 572.60. During training, memory use always exceeds the physical VRAM and spills into shared VRAM (offloaded to system DRAM), which makes training very slow and keeps GPU utilization low.
Speed is not the problem, though. The issue is that when training finishes an epoch and tries to save the state, the save never happens and the script just hangs. Event Viewer shows an nvlddmkm event ID 14 that reads:
It does that every single time. However, the normal Lion8bit optimizer works just fine.
I assume using the shared VRAM is what causes the issue? I have no idea how PagedLion8bit works, so if anyone could shed some light on this I would really appreciate it.
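For anyone trying to reproduce this, here is a minimal sketch for checking how close a run is to the physical VRAM limit right before the save step. It assumes a PyTorch-based training script. The helper name vram_report is hypothetical and is not part of any library. The only PyTorch calls used are torch.cuda.mem_get_info and torch.cuda.memory_reserved. A used fraction close to 1.0 would be consistent with spilling into shared memory, but it does not prove it.

```python
def vram_report(free_bytes: int, total_bytes: int, reserved_bytes: int) -> dict:
    """Summarize dedicated VRAM usage from raw byte counts (hypothetical helper)."""
    gib = 1024 ** 3
    used_bytes = total_bytes - free_bytes
    return {
        "total_gib": round(total_bytes / gib, 2),
        "used_gib": round(used_bytes / gib, 2),
        # Memory held by PyTorch's caching allocator in this process only.
        "reserved_by_torch_gib": round(reserved_bytes / gib, 2),
        "used_fraction": round(used_bytes / total_bytes, 3),
    }


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # mem_get_info returns (free, total) in bytes for the current device.
        free, total = torch.cuda.mem_get_info()
        print(vram_report(free, total, torch.cuda.memory_reserved()))
    else:
        print("CUDA not available; nothing to report.")
```

Calling this once per epoch, just before the state is saved, with both PagedLion8bit and Lion8bit should show whether the two runs differ in memory pressure at the point where the hang occurs.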