8 changes: 8 additions & 0 deletions docs/flm_npu_linux.html
@@ -56,6 +56,14 @@ <h2>Background</h2>
NOTE: This is a beta feature right now, and to use it you will need to
run lemonade-server with the environment variable
<code>LEMONADE_FLM_LINUX_BETA=1</code> set.
<br><br>
DOCKER USERS: The beta currently uses <code>which</code> to locate your flm binary.
Member
@bluefalcon13 can you confirm that FLM works in the docker at all? That would be a pleasant surprise.

Author
Here is the Dockerfile for the lemonade-server docker I am using: https://github.com/bluefalcon13/local_ai_stack/blob/main/configs/lemonade/Dockerfile

The docker compose is at the project root.

Can 100% confirm: after a bunch of fighting, I have a functional lemonade docker with a custom llama-cpp and flm built. I need to bump my max LLMs so I can run them concurrently, then it's more fighting to try to get FLM to act as a drafter. :D


Member
What about the official docker released from this repo?

If you see where I am going with this: if we add a docker note to the website, people will think the built-in docker works with the NPU if they just do the one tip.

Any chance you want to update the mainline docker definition here to work with the NPU?

Author
I might be able to. I ran into an issue with my Ubuntu docker (I am more familiar with Debian-based distros) because I moved up to Arch's mainline kernel. Ubuntu did NOT play nice with that, and building XRT (and its plugin) from source requires the kernel headers. Shortly after that, I moved the container to Arch.

Member
> What about the official docker released from this repo?

How about we bundle FLM in that once it releases?

Author
FWIW, I'm running natively in Arch myself using the xrt and xdna-plugin packages that I uploaded MRs for.

> You just need to build FLM, and there is an AUR for that too: https://aur.archlinux.org/packages/fastflowlm-git

Yeah, there is, but in a docker, it's almost the same as pulling source and adding some tweaks :P

I did pull in XRT and the plugin though from extra-testing. Those are super annoying to build.

Member
Can you help push those out of testing? I'm new to Arch packaging, and I'm not sure what is needed for that to happen.

Author
I have no idea how to do that either. Just looking over Arch's docs: the core-testing process is pretty clear, but the rules don't seem to be as strict for extra-testing → extra. https://wiki.archlinux.org/title/Official_repositories#extra-testing

Member
They're migrated now.

Author
> What about the official docker released from this repo?
>
> If you see where I am going with this: if we add a docker note to the website, people will think the built-in docker works with the NPU if they just do the one tip.
>
> Any chance you want to update the mainline docker definition here to work with the NPU?

After much pain, I can confirm, yes it does:

```
root@30c2954fe628:/opt/lemonade# flm validate
[Linux]  Kernel: 7.0.0-rc2-1-mainline
[Linux]  NPU: /dev/accel/accel0 with 8 columns
[Linux]  NPU FW Version: 1.1.2.65
[Linux]  amdxdna version: 0.6
[Linux]  Memlock Limit: infinity
root@30c2954fe628:/opt/lemonade#
```

I inserted the following at line 67 of the Dockerfile. I've never built a .deb before, but in theory you could do that in a separate stage, then pull the .deb in and install it.

```dockerfile
RUN apt update && apt install -y --no-install-recommends \
    software-properties-common && add-apt-repository ppa:amd-team/xrt && \
    apt update && apt install -y --no-install-recommends \
    amdxdna-dkms build-essential cmake git g++ libavcodec-dev libavdevice-dev libavformat-dev \
    libavutil-dev libboost-dev libboost-program-options-dev libcurl4-openssl-dev \
    libdrm-dev libfftw3-dev libswscale-dev libxrt-dev libxrt-npu2 ninja-build \
    uuid-dev && rm -fr /var/lib/apt/lists/*

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y

RUN cd /opt && git clone --recursive https://github.com/FastFlowLM/FastFlowLM.git && \
    cd /opt/FastFlowLM/src && cmake --preset linux-default -G Ninja \
        -DCMAKE_BUILD_TYPE=Release && \
    cmake --build build -j$(nproc) && \
    cmake --install build
```
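
The separate-stage .deb idea mentioned above could look roughly like this. This is a minimal sketch, not a tested Dockerfile: the `flm-builder` stage name is mine, and the `cpack -G DEB` step assumes CPack is wired up in the FastFlowLM CMake project (if it isn't, you'd package with `dpkg-deb` instead):

```dockerfile
# HYPOTHETICAL multi-stage sketch: build FLM in a throwaway stage, then
# install only the resulting .deb in the final image to keep it slim.
FROM ubuntu:24.04 AS flm-builder
RUN apt update && apt install -y --no-install-recommends \
    build-essential ca-certificates cmake git ninja-build
RUN git clone --recursive https://github.com/FastFlowLM/FastFlowLM.git /opt/FastFlowLM && \
    cd /opt/FastFlowLM/src && \
    cmake --preset linux-default -G Ninja -DCMAKE_BUILD_TYPE=Release && \
    cmake --build build -j$(nproc) && \
    cd build && cpack -G DEB   # ASSUMPTION: CPack configured to emit a .deb

FROM ubuntu:24.04
# Pull only the package across; build toolchain stays out of the final image.
COPY --from=flm-builder /opt/FastFlowLM/src/build/*.deb /tmp/
RUN apt update && apt install -y /tmp/*.deb && rm -fr /tmp/*.deb /var/lib/apt/lists/*
```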

Additional verification:

```
root@30c2954fe628:/opt/lemonade# LEMONADE_FLM_LINUX_BETA=1 ./lemonade-server recipes
Recipe              Backend     Status          Message/Version                               Action
----------------------------------------------------------------------------------------------------------------------------------------------------
flm                 npu         update_required Backend update is required before use.         lemonade-server recipes --install flm:npu
kokoro              cpu         installable     Backend is supported but not installed.        lemonade-server recipes --install kokoro:cpu
llamacpp            system      unsupported     llama-server not found in PATH                 -
                    metal       unsupported     Requires macOS                                 -
                    vulkan      installable     Backend is supported but not installed.        lemonade-server recipes --install llamacpp:vulkan
                    rocm        installable     Backend is supported but not installed.        lemonade-server recipes --install llamacpp:rocm
                    cpu         installable     Backend is supported but not installed.        lemonade-server recipes --install llamacpp:cpu
ryzenai-llm         npu         unsupported     Requires Windows                               -
sd-cpp              rocm        installable     Backend is supported but not installed.        lemonade-server recipes --install sd-cpp:rocm
                    cpu         installable     Backend is supported but not installed.        lemonade-server recipes --install sd-cpp:cpu
whispercpp          npu         unsupported     Requires Windows                               -
                    vulkan      installable     Backend is supported but not installed.        lemonade-server recipes --install whispercpp:vulkan
                    cpu         installable     Backend is supported but not installed.        lemonade-server recipes --install whispercpp:cpu
----------------------------------------------------------------------------------------------------------------------------------------------------
root@30c2954fe628:/opt/lemonade#
```

docker run command used:

```shell
docker run -it --rm --device /dev/kfd --device /dev/dri --device /dev/accel/accel0 \
  --ulimit memlock=-1:-1 \
  --group-add $(getent group render | cut -d: -f3) \
  --group-add $(getent group video | cut -d: -f3) \
  --security-opt seccomp=unconfined --ipc=host lemonade:test bash
```
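
The two `--group-add` flags resolve group names to numeric GIDs on the host, since names may not exist inside the container. A small sketch of that lookup (the `gid_of` helper name is mine, not from lemonade or FLM):

```shell
# Hypothetical helper: extract the numeric GID for a group name, the same
# way the --group-add $(getent group render | cut -d: -f3) flags do.
# getent prints "name:passwd:gid:members"; field 3 is the GID.
gid_of() {
  getent group "$1" | cut -d: -f3
}

# The parsing itself, shown on a literal getent-style line (GID made up):
echo 'render:x:989:' | cut -d: -f3   # prints 989
```

Passing the GID rather than the name means the container needs no matching `/etc/group` entry to gain access to the `/dev/dri` and `/dev/accel` nodes.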

I did not run it myself, but that's because I am currently already running it in my Arch container, and I am not sure I want to find out how graceful that handoff is!

If you do not have <code>which</code> in your docker, it will fail to pull models with
an error stating FLM cannot be installed automatically on Linux.
<br><br>
Additionally, if you volume-mount your container so that <code>~/.cache/lemonade/hardware_info.json</code>
persists across docker builds, you will need to delete the file on your host
so that a new one is rebuilt; a stale file can prevent the FLM beta flag from being recognized.
</p>
</div>
