Device option sep by genrtul · Pull Request #1397 · lemonade-sdk/lemonade

genrtul · 2026-03-18T13:34:27Z

First off, I'm inexperienced with contributing code, so sorry if there are any mistakes, and I appreciate tips!

This simple PR allows passing llama.cpp's --device option directly to lemonade-server. The option selects which accelerator devices lemonade uses. On the lemonade side it is named --llamacpp-device.

The motivation for this change is that currently when using JSON recipes to pass particular parameters to models, the --llamacpp_args option is used, which overrides this option if it gets passed on the command line. Now, choosing which particular devices to use for acceleration strikes me as a runtime option and not appropriate for including in a JSON recipe. This is a quick fix in lieu of somehow allowing --llamacpp_args to merge arguments from both the command line and recipes, which might be the better fix.
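To illustrate the idea, here is a minimal sketch of how a device value supplied separately from the recipe arguments could be appended to the llama-server command line. The function name and parameters are hypothetical and are not taken from the PR diff; only llama.cpp's --device option itself is real.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (illustrative only, not the code in this PR).
// Combines arguments that came from a recipe with a device selection
// that was supplied separately at runtime.
std::vector<std::string> build_llama_server_args(
    const std::vector<std::string>& recipe_args,
    const std::string& device) {
    std::vector<std::string> args = recipe_args;
    // An empty value means no device flag was given, so llama-server
    // keeps its default device selection.
    if (!device.empty()) {
        args.push_back("--device");  // llama.cpp's device-selection option
        args.push_back(device);      // for example "Vulkan0,Vulkan2"
    }
    return args;
}
```

Because the device value travels separately, whatever the recipe puts in its arguments does not replace it.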

I have only implemented this for the llamacpp backend for now, as that's the only one I'm familiar with. However, in principle it seems like it could be generic over backends and useful so long as a backend allows choosing accelerator devices. I don't know if the option would be relevant for the other ones. SD.cpp at least doesn't seem to expose such an option currently (https://github.com/leejet/stable-diffusion.cpp/blob/master/examples/cli/README.md). I'm certainly willing to try to add it for other backends, though I won't be able to test it.

The new option --llamacpp-device is relevant for systems with two or more GPUs (for example, an integrated GPU and a discrete one, or one connected over Thunderbolt/USB4).

Example usage for a system with three devices:

(no flag): Default behavior, which appears to be to use all available devices. This PR does not change it.

lemonade-server serve --llamacpp rocm --llamacpp-device Rocm0: Only the first rocm device (usually a GPU) will be used. Which physical device this refers to is system-dependent.

lemonade-server serve --llamacpp vulkan --llamacpp-device Vulkan0,Vulkan2: The two vulkan devices Vulkan0 and Vulkan2 will be used.

Note that on my system, attempting to use two devices causes llama-server to repeatedly crash, but this is a separate bug.
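As a rough sketch, the comma-separated value from the examples above could be split into individual device names, for instance to reject an empty list before forwarding it to llama-server. The helper below is hypothetical and is not part of this PR, which may simply pass the string through unchanged.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (illustrative only): splits a value such as
// "Vulkan0,Vulkan2" into its device names.
std::vector<std::string> split_device_list(const std::string& value) {
    std::vector<std::string> devices;
    std::stringstream stream(value);
    std::string name;
    while (std::getline(stream, name, ',')) {
        // Skip empty entries produced by inputs like "Vulkan0,,Vulkan2".
        if (!name.empty()) {
            devices.push_back(name);
        }
    }
    return devices;
}
```

An empty result would indicate that no usable device name was given.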

src/cpp/server/backends/llamacpp_server.cpp

bitgamma · 2026-03-18T13:57:43Z

Thanks for the contribution! Left a couple of comments. This parameter also needs to be listed in the docs.

superm1 · 2026-03-18T16:37:00Z

@jeremyfowers is about to overhaul the config system and this will get caught in the crosshairs. Can you hold off on doing more work and revisit after that's done? I don't want to see your work thrown away.

jeremyfowers

@genrtul can you help me understand the need for this feature better? if you want CPU, you can do --llamacpp cpu already. If you want GPU, you can do --llamacpp vulkan/rocm.

Why do we need a new flag?

genrtul · 2026-03-19T03:08:29Z

> @genrtul can you help me understand the need for this feature better? if you want CPU, you can do --llamacpp cpu already. If you want GPU, you can do --llamacpp vulkan/rocm.
>
> Why do we need a new flag?

This is for the case where someone has multiple GPUs available on the system, but only wants to use a subset of them, or a different set than is used by default. My personal case is I have a Strix Halo with an eGPU, and I use --device to prevent the eGPU from being used.

jeremyfowers · 2026-03-19T18:56:35Z

> > @genrtul can you help me understand the need for this feature better? if you want CPU, you can do --llamacpp cpu already. If you want GPU, you can do --llamacpp vulkan/rocm.
> > Why do we need a new flag?
>
> This is for the case where someone has multiple GPUs available on the system, but only wants to use a subset of them, or a different set than is used by default. My personal case is I have a Strix Halo with an eGPU, and I use --device to prevent the eGPU from being used.

Thanks, that makes sense! So this is the GPU device ID number?

Would you kindly update your PR description to include a couple examples of usage, such as

default: llamacpp does X
--device Y: now you see llamacpp running on the iGPU
--device Z: now you see llamacpp running on the eGPU

genrtul · 2026-03-25T06:10:13Z

I've made the changes suggested by bitgamma and added the option in the docs where it seemed relevant. Taking into account superm1's comment, I won't work on this further for now.

I've also added some examples to the OP.

bitgamma

looks good to me!

Adam added 3 commits March 18, 2026 14:10

Separated out --device option for llamacpp backend

7dae5f4

Checked for empty

ea1f372

Change help doc to acknowledge commas

b566a6e

bitgamma requested changes Mar 18, 2026

jeremyfowers requested changes Mar 18, 2026

jaeiclee mentioned this pull request Mar 20, 2026

Unable to download rocm backend in v10 via lemonade-server recipes --install #1363

Open

Adam and others added 4 commits March 20, 2026 13:58

Docs changes

5b80656

Implement suggestions: always reserve flag and change flag name

2891d42

Merge branch 'lemonade-sdk:main' into device-option-sep

e192ec6

Fix flag name

0c1208d

bitgamma approved these changes Mar 25, 2026

genrtul wants to merge 7 commits into lemonade-sdk:main from genrtul:device-option-sep

4 participants
