Add TheRock ROCm tarball support for Stable Diffusion on Linux #1124
This approach is also intended to scale to llamacpp later on. The idea is to pull down ROCm separately from the artifacts for the tools.
gfx1152 is missing from the TheRock list. Some series (e.g. RDNA 4) have -all- in the tarball name. It wasn't clear to me from reading the code whether that's handled.
Woah thanks! 1152 has problems as I understand, so I excluded it. There being some that are combined is unfortunate.
- Add TheRock configuration to backend_versions.json with version 7.11.0
and support for 13 architectures (gfx908, gfx90a, gfx942, gfx1030,
gfx1031, gfx1032, gfx1100, gfx1101, gfx1102, gfx1151, gfx1150,
gfx1200, gfx1201)
- Implement system ROCm detection in backend_utils.cpp that checks
LD_LIBRARY_PATH and standard system paths (/opt/rocm, /usr/lib*)
for libamdhip64.so
- Add TheRock download and installation logic that fetches
architecture-specific tarballs from repo.amd.com/rocm/tarball/
and extracts to ~/.cache/lemonade/bin/therock/{arch}-{version}/
- Implement automatic cleanup of old TheRock versions to save disk space
- Update SD ROCm artifact version from master-506-1f30df9 to
master-505-e212912 and change filename format to include ROCm
version suffix (-rocm-7.11.0)
- Modify sd_server.cpp to prepend TheRock lib directory to
LD_LIBRARY_PATH when system ROCm is not available, enabling
stable diffusion to use downloaded ROCm libraries
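
The detection and LD_LIBRARY_PATH steps listed above could look roughly like the following. This is a minimal sketch, not the code in backend_utils.cpp or sd_server.cpp: the helper names (`split_path_list`, `has_hip_runtime`, `system_rocm_available`, `prepend_lib_dir`) are hypothetical, and the exact system directories probed are an assumption based on the "/opt/rocm, /usr/lib*" description.

```cpp
#include <cstdlib>
#include <filesystem>
#include <sstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: split a colon-separated path list, skipping empty entries.
std::vector<std::string> split_path_list(const std::string& value) {
    std::vector<std::string> dirs;
    std::stringstream ss(value);
    std::string item;
    while (std::getline(ss, item, ':')) {
        if (!item.empty()) dirs.push_back(item);
    }
    return dirs;
}

// Hypothetical helper: true if any of the directories contains libamdhip64.so.
bool has_hip_runtime(const std::vector<std::string>& dirs) {
    for (const auto& dir : dirs) {
        std::error_code ec;  // non-throwing overload: unreadable paths count as "not found"
        if (fs::exists(fs::path(dir) / "libamdhip64.so", ec)) return true;
    }
    return false;
}

// Assumed search order: LD_LIBRARY_PATH entries first, then a few common
// system locations. The real list of system paths may differ.
bool system_rocm_available() {
    std::vector<std::string> dirs;
    if (const char* env = std::getenv("LD_LIBRARY_PATH")) {
        dirs = split_path_list(env);
    }
    for (const char* d : {"/opt/rocm/lib", "/usr/lib", "/usr/lib64"}) {
        dirs.emplace_back(d);
    }
    return has_hip_runtime(dirs);
}

// Build the LD_LIBRARY_PATH value for the child process so that the
// downloaded TheRock lib directory is searched before anything else.
std::string prepend_lib_dir(const std::string& lib_dir, const std::string& current) {
    return current.empty() ? lib_dir : lib_dir + ":" + current;
}
```

In this sketch the server would call `prepend_lib_dir` only when `system_rocm_available()` returns false, then pass the result to the SD subprocess environment, so a working system ROCm install is left untouched.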
Tested successfully on Strix Point (Radeon 890M, gfx1150) where
TheRock was automatically downloaded (2.3GB compressed, 13GB extracted)
and SD-Turbo generated images in ~12 seconds using ROCm/HIP acceleration.
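
For the install layout and old-version cleanup described in the list above, a minimal sketch follows. The function names (`therock_install_dir`, `remove_old_versions`) are hypothetical, and the cleanup policy shown here (remove only directories for the same architecture whose version differs from the current one) is an assumption; the PR may apply a different rule.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Directory name follows the {arch}-{version} layout from the description.
// `root` stands in for ~/.cache/lemonade/bin/therock.
fs::path therock_install_dir(const fs::path& root, const std::string& arch,
                             const std::string& version) {
    return root / (arch + "-" + version);
}

// Assumed policy: delete sibling directories named "<arch>-<other version>"
// and keep "<arch>-<version>". Returns the names that were removed.
std::vector<std::string> remove_old_versions(const fs::path& root, const std::string& arch,
                                             const std::string& version) {
    std::vector<std::string> removed;
    std::error_code ec;
    if (!fs::is_directory(root, ec)) return removed;

    const std::string prefix = arch + "-";
    const std::string keep = prefix + version;

    // Collect first, delete afterwards, so the directory is not modified
    // while it is being iterated.
    std::vector<fs::path> stale;
    for (const auto& entry : fs::directory_iterator(root, ec)) {
        const std::string name = entry.path().filename().string();
        if (entry.is_directory(ec) && name.rfind(prefix, 0) == 0 && name != keep) {
            stale.push_back(entry.path());
        }
    }
    for (const auto& path : stale) {
        fs::remove_all(path, ec);
        if (!ec) removed.push_back(path.filename().string());
    }
    return removed;
}
```

Given the 13GB extracted size reported above, running a cleanup like this right after a successful extraction keeps at most one copy per architecture on disk.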
I updated Windows as well to 7.1.1, which I expect enables GFX1150 too now.
What's going on with this one? It's been open for 3 weeks.
This is what happened: https://youtu.be/AbSehcT19u0?si=S2Q8ikVpukD4U1wf Once system llamacpp support has landed, this is my next step.
Lollll and we’ll be breaking bad in no time |