[bug]: Segmentation fault on image generation start (AMD) #3967
Comments
Same thing happening to me with an AMD Radeon 5500 XT with 8GB of VRAM. Something similar also happened to me pre-3.0, but that issue has been closed since the open issues were reset with the 3.0 release: #2894 (comment)
I'm also having this issue. When you click the Invoke button, about 5-10 seconds later the console shows the Seg Fault. Freshly installed using the install script on Linux and using the Analog-Diffusion model. System Specs are below. Here is potentially relevant dmesg output:
I appear to have been experiencing this issue too: Linux, Radeon 6900 XT. A hopefully relevant detail is that I was able to work around it by using torch 1.13.1+rocm5.2 and the corresponding torchvision 0.14.1+rocm5.2 that I still had from my working Invoke 2.3.5 install. After replacing torch 2.0 and torchvision with those older versions, Invoke 3.0 now seems to work as expected for me.
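A rough sketch of that downgrade, assuming the default InvokeAI venv location (~/invokeai/.venv, adjust to your install) and the standard PyTorch wheel index for ROCm 5.2:

    # Activate the InvokeAI virtualenv (path assumed; adjust to your install)
    source ~/invokeai/.venv/bin/activate
    # Replace the torch 2.0 packages with the older ROCm 5.2 builds
    pip uninstall -y torch torchvision
    pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 \
        --extra-index-url https://download.pytorch.org/whl/rocm5.2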
I have the same problem. OS: Artix Linux x86_64
Was experiencing this issue on my Ryzen 7950X / Radeon 6900 XT desktop system running Arch Linux. I seem to have worked around it by disabling the 7950X's iGPU in BIOS. The GPU device reported by invokeai-web at startup both with and without the iGPU enabled is 'cuda AMD Radeon RX 6900 XT', but for whatever reason having the iGPU enabled seems to have been causing an issue. This issue has been present for me in all versions of Invoke since the update to torch 2.0. Tested on a fresh InvokeAI 3.0.1post3 install.
Yep, same issue for me: the 2.3 version worked perfectly, but 3.0.1post3 (fresh install) fails with a segfault.
Hey! Another person had similar issues with torch, and a fix seems to be rebuilding the Python environment with a lower torch version (similar to what @arvenig said!):
Can someone explain in simple words how to achieve that? BTW, I use Python 3.10, as was suggested for the previous InvokeAI version.
I have the same issue: invoke.sh: line 51: 8792 Segmentation fault (core dumped) invokeai-web $PARAMS
I have the same issue
Made it work with ROCm 5.4.2, an RX 6600, and kernel 5.19.
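For reference, a minimal sketch of installing the ROCm 5.4.2 builds of torch/torchvision from the PyTorch wheel index, assuming the same venv layout as above:

    # Install torch/torchvision built against ROCm 5.4.2 (venv path assumed)
    source ~/invokeai/.venv/bin/activate
    pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.4.2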
Thank you!
Unfortunately none of the posted solutions work to resolve the segfault. What I have tried:
ROCm version 5.4.3. Edit: this appears to be an issue with ROCm support for the 7000 series of AMD GPUs. Not sure why these are still unsupported 9 months after they came out. Guess I'll just return this card and get an NVIDIA GPU :(.
I just installed InvokeAI 3.3.0 in a python3.11 venv with ROCm for an AMD 6600 XT and encountered the same issue when pressing the "Invoke" button in the web UI: segfault at 20 ip 00007fd2142b40a7 sp 00007fcecfe91470 error 4 in libamdhip64.so[7fd214200000+3f3000] (pytorch-triton-rocm 2.0.2, .../InvokeAI/.venv/lib/python3.11/site-packages/triton/third_party/rocm/lib/libamdhip64.so). Last gdb traces:
Setting the gfx override made InvokeAI run for my 6600 XT, but generating an image bugs out and returns an invalid image.
I'm seeing this with my 7900 XTX.
So I figured it out. When using ROCm, it tries to select your first GPU, which is your integrated graphics. There's not enough VRAM, so you get a segmentation fault. There's an environment variable you can use to disable the visibility of the iGPU (see the sketch after this comment).
I found the best place to put it is in invokeai.sh.
This fixed my issue. I've found other programs that have the same issue: Autogen and Text-gen-webui both have the same problem and solution. Hope this has helped! It's a lot easier than phazertech's guide, imo.
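A minimal sketch of that workaround, assuming the discrete GPU enumerates as device 0 (check with rocminfo if unsure):

    # Hide the integrated GPU from ROCm so the discrete card is picked
    # (add near the top of the launcher script, or export it before starting)
    export HIP_VISIBLE_DEVICES=0
    ./invoke.sh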
Very based.
After almost half a year, I've decided to give it another try and was able to find my issue after writing this: https://gist.github.com/adeliktas/669812e64fd356afc4648ba847c61133
I did print all env vars with the
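One way to inspect the ROCm-related environment the process actually sees; this is a hypothetical debugging aid, not necessarily the command the commenter used (that part of the comment is cut off above):

    # Print the GPU-related environment variables before launching
    env | grep -E 'HIP|HSA|ROCR|ROCM'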
Sadly, this doesn't work for me with my AMD Radeon RX 7800 XT. Also, the file name is invoke.sh, not invokeai.sh.
Have you specified HSA_OVERRIDE_GFX_VERSION=11.0.0, since your GPU is a 7XXX-series card?
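A hedged sketch for RDNA3 (RX 7XXX) cards, combining the two variables discussed in this thread; the device index is an assumption, so adjust it if your iGPU enumerates differently:

    # Spoof the gfx target for RDNA3 cards and hide the iGPU if one is present
    export HSA_OVERRIDE_GFX_VERSION=11.0.0
    export HIP_VISIBLE_DEVICES=0
    ./invoke.sh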
I finally got around to trying export HIP_VISIBLE_DEVICES="0" ... and nothing happened. Just as before: ::[uvicorn.access]::INFO --> 127.0.0.1:38998 - "GET /api/v1/queue/default/list HTTP/1.1" 200
@Alex9001 This error message makes me think it might not be a ROCm issue. Nevertheless, it might be worth double-checking that your ROCm HIP runtime is up to date. I'm assuming the ROCm runtime is in your
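A couple of quick checks one might run to confirm the ROCm runtime sees the card and that the installed torch is a ROCm build (assumes rocminfo is installed; package names vary by distro):

    # List the GPU agents the ROCm runtime can see
    rocminfo | grep -E 'Marketing Name|gfx'
    # Confirm torch is a ROCm (HIP) build and can see a device
    python -c "import torch; print(torch.cuda.is_available(), torch.version.hip)"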
Placing
Is there an existing issue for this?
OS
Linux
GPU
AMD
VRAM
8GB
What version did you experience this issue on?
3.0.0
What happened?
I tried installing via the automated installer and via the manual installation. No matter what I try, when I click the "Invoke" button in the web GUI, I get a segmentation fault:
Screenshots
No response
Additional context
Using ROCm 5.4.2, as recommended by the official PyTorch website.
GPU: AMD Radeon 6700 XT
Contact Details
No response