
Conversation

@BrewTestBot
Contributor

Created by `brew bump`

Created with `brew bump-formula-pr`.

Release notes
# 🎉 LocalAI 3.10.0 Release! 🚀

LocalAI 3.10.0 is big on agent capabilities, multi-modal support, and cross-platform reliability.

We've added native Anthropic API support, launched a new Video Generation UI, introduced Open Responses API compatibility, and enhanced performance with a unified GPU backend system.

For a full tour, see below!


📌 TL;DR

| Feature | Summary |
| --- | --- |
| Anthropic API Support | Fully compatible /v1/messages endpoint for seamless drop-in replacement of Claude. |
| Open Responses API | Native support for stateful agents with tool calling, streaming, background mode, and multi-turn conversations; passes all official acceptance tests. |
| Video & Image Generation Suite | New video generation UI + LTX-2 support for text-to-video and image-to-video. |
| Unified GPU Backends | GPU libraries (CUDA, ROCm, Vulkan) are packaged inside backend containers and work out of the box on Nvidia, AMD, and ARM64 (experimental). |
| Tool Streaming & XML Parsing | Full support for streaming tool calls and XML-formatted tool outputs. |
| System-Aware Backend Gallery | Only see backends your system can run (e.g., MLX is hidden on Linux). |
| Crash Fixes | Prevents crashes on AVX-only CPUs (Intel Sandy/Ivy Bridge) and fixes VRAM reporting on AMD GPUs. |
| Request Tracing | Debug agents and fine-tuning with in-memory request/response logging. |
| Moonshine Backend | Ultra-fast transcription engine for low-end devices. |
| Pocket-TTS | Lightweight, high-fidelity text-to-speech with voice cloning. |
| Vulkan arm64 Builds | Backends and images are now built for Vulkan on arm64 as well. |

🚀 New Features & Major Enhancements

🤖 Open Responses API: Build Smarter, Autonomous Agents

LocalAI now supports the OpenAI Responses API, enabling powerful agentic workflows locally.

  • Stateful conversations via response_id — resume and manage long-running agent sessions.
  • Background mode: Run agents asynchronously and fetch results later.
  • Streaming support for tools, images, and audio.
  • Built-in tools: Web search, file search, and computer use (via MCP integrations).
  • Multi-turn interaction with dynamic context and tool use.

✅ Ideal for developers building agents that can browse, analyze files, or interact with systems — all on your local machine.

🔧 How to Use:

  • Set response_id in your request to maintain session state across calls.
  • Use background: true to run agents asynchronously.
  • Retrieve results via GET /api/v1/responses/{response_id}.
  • Enable streaming with stream: true to receive partial responses and tool calls in real time.

📌 Tip: Use response_id to build agent orchestration systems that persist context and avoid redundant computation.
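
For a concrete picture, here is a minimal Python sketch of that flow. It assumes a LocalAI instance at http://localhost:8080, that creation mirrors OpenAI's /v1/responses route, and a placeholder model name; the retrieval endpoint and status values are assumptions, apart from the path documented above.

```python
import time

import requests

BASE = "http://localhost:8080"  # assumption: local LocalAI instance

# Kick off a background run; the "background" field is per the notes above.
created = requests.post(
    f"{BASE}/v1/responses",  # assumption: creation mirrors OpenAI's /v1/responses
    json={
        "model": "your-model",  # placeholder model name
        "input": "Research the three most recent LocalAI releases.",
        "background": True,
    },
)
created.raise_for_status()
response_id = created.json()["id"]

# Poll the documented retrieval endpoint until the run leaves the queue.
while True:
    result = requests.get(f"{BASE}/api/v1/responses/{response_id}").json()
    if result.get("status") not in ("queued", "in_progress"):  # assumed status values
        break
    time.sleep(1)

print(result)
```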

Our support passes all the official acceptance tests:

[Screenshot: Open Responses API acceptance tests]

🧠 Anthropic Messages API: Clone Claude Locally

LocalAI now fully supports the Anthropic Messages API.

  • Use https://api.localai.host/v1/messages as a drop-in replacement for Claude.
  • Full tool/function calling support, just like OpenAI.
  • Streaming and non-streaming responses.
  • Compatible with anthropic-sdk-go, LangChain, and other tooling.

🔥 Perfect for teams migrating from Anthropic to local inference with full feature parity.
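
As an illustration, pointing the official anthropic Python SDK at a local instance is all it takes; the base URL and model name below are placeholders for your own setup.

```python
from anthropic import Anthropic

# Point the official SDK at LocalAI instead of api.anthropic.com.
client = Anthropic(
    base_url="http://localhost:8080",  # assumption: local LocalAI instance
    api_key="not-needed",              # assumption: no API key configured locally
)

message = client.messages.create(
    model="your-model",  # placeholder model name
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello from a local Claude-compatible API!"}],
)
print(message.content[0].text)
```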


🎥 Video Generation: From Text to Video in the Web UI

  • New dedicated video generation page with intuitive controls.
  • LTX-2 is supported.
  • Supports text-to-video and image-to-video workflows.
  • Built on top of diffusers for full compatibility.

📌 How to Use:

  • Go to /video in the web UI.
  • Enter a prompt (e.g., "A cat walking on a moonlit rooftop").
  • Optionally upload an image for image-to-video generation.
  • Adjust parameters like fps, num_frames, and guidance_scale.
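
If you prefer scripting over the UI, something along these lines should translate. These notes don't document an API route for video generation, so the endpoint and field names below are hypothetical placeholders; check the LocalAI docs for the real ones.

```python
import requests

# Hypothetical endpoint and fields: illustration only; consult the LocalAI
# docs for the actual video-generation API route.
resp = requests.post(
    "http://localhost:8080/video",  # assumption: mirrors the web UI's /video page
    json={
        "model": "ltx-2",  # placeholder model name
        "prompt": "A cat walking on a moonlit rooftop",
        "fps": 24,              # parameters named in the notes above
        "num_frames": 121,
        "guidance_scale": 3.0,
    },
)
resp.raise_for_status()
with open("output.mp4", "wb") as f:
    f.write(resp.content)
```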

⚙️ Unified GPU Backends: Acceleration Works Out of the Box

A major architectural upgrade: GPU libraries (CUDA, ROCm, Vulkan) are now packaged inside backend containers.

  • Single image: you no longer need to pull a GPU-specific image; the same image works whether or not you have a GPU.
  • No more manual GPU driver setup: just run the image and get acceleration.
  • Works on Nvidia (CUDA), AMD (ROCm), and ARM64 (Vulkan).
  • Vulkan arm64 builds enabled.
  • Reduced image complexity, faster builds, and consistent performance.

🚀 This means latest/master images now support GPU acceleration on all platforms — no extra config!

Note: this is experimental; please help us by filing an issue if something doesn't work!


🧩 Tool Streaming & Advanced Parsing

Enhance your agent workflows with richer tool interaction.

  • Streaming tool calls: Receive partial tool arguments in real time (e.g., input_json_delta).
  • XML-style tool call parsing: Models that return tools in XML format (<function>...</function>) are now properly parsed alongside text.
  • Works across all backends (llama.cpp, vLLM, diffusers, etc.).

💡 Enables more natural, real-time interaction with agents that use structured tool outputs.
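
Here is a minimal sketch of consuming streamed tool-call deltas with the openai Python SDK against a local instance; the model name and tool definition are placeholders.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="your-model",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # placeholder tool definition
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    stream=True,
)

# Partial tool arguments stream in as JSON fragments; accumulate client-side.
tool_name, tool_args = None, ""
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    for call in delta.tool_calls or []:
        if call.function.name:
            tool_name = call.function.name
        if call.function.arguments:
            tool_args += call.function.arguments

print(tool_name, tool_args)
```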


🌐 System-Aware Backend Gallery: Only Compatible Backends Show

The backend gallery now shows only backends your system can run.

  • Auto-detects system capabilities (CPU, GPU, MLX, etc.).
  • Hides unsupported backends (e.g., MLX on Linux, CUDA on AMD).
  • Shows detected capabilities in the hero section.

🎤 New TTS Backends: Pocket-TTS

Add expressive voice generation to your apps with Pocket-TTS.

  • Real-time text-to-speech with voice cloning support (requires HF login).
  • Lightweight, fast, and open-source.
  • Available in the model gallery.

🗣️ Perfect for voice agents, narrators, or interactive assistants.
Note: Voice cloning requires HF authentication and a registered voice model.
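
A minimal sketch, assuming Pocket-TTS is served through the OpenAI-compatible speech endpoint; the model name is a placeholder, so check the gallery entry for the exact one.

```python
import requests

# Assumption: LocalAI exposes the OpenAI-compatible speech endpoint.
resp = requests.post(
    "http://localhost:8080/v1/audio/speech",
    json={
        "model": "pocket-tts",  # placeholder: use the gallery's actual model name
        "input": "Hello from a fully local text-to-speech pipeline!",
    },
)
resp.raise_for_status()
with open("speech.wav", "wb") as f:
    f.write(resp.content)
```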


🔍 Request Tracing: Debug Your Agents

Trace requests and responses in memory — great for fine-tuning and agent debugging.

  • Enable it via a runtime setting or the API.
  • Logs are stored in memory and dropped once they exceed a maximum size.
  • Fetch logs via GET /api/v1/trace.
  • Export to JSON for analysis.
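
For example, dumping the in-memory trace to a JSON file for offline analysis, using the endpoint documented above:

```python
import json

import requests

# Fetch the in-memory request/response trace documented above.
resp = requests.get("http://localhost:8080/api/v1/trace")
resp.raise_for_status()

with open("trace.json", "w") as f:
    json.dump(resp.json(), f, indent=2)
```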

🪄 New 'Reasoning' Field: Extract Thinking Steps

LocalAI now automatically detects and extracts thinking tags from model output.

  • Supports both SSE and non-SSE modes.
  • Displays reasoning steps in the chat UI (under "Thinking" tab).
  • Fixes an issue where thinking content appeared as part of the final answer.
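
On the API side, the extracted reasoning arrives separately from the answer. A quick sketch; the exact field name is an assumption based on the feature name, and the model name is a placeholder.

```python
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "your-model",  # placeholder model name
        "messages": [{"role": "user", "content": "Think step by step: what is 17 * 23?"}],
    },
)
resp.raise_for_status()
message = resp.json()["choices"][0]["message"]

# Assumption: extracted thinking lands in a "reasoning" field on the message.
print("Reasoning:", message.get("reasoning"))
print("Answer:", message["content"])
```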

🚀 Moonshine Backend: Faster Transcription for Low-End Devices

Adds Moonshine, an ONNX-based transcription engine, for fast, lightweight speech-to-text.

  • Optimized for low-end devices (Raspberry Pi, older laptops).
  • One of the fastest transcription engines available.
  • Supports live transcription.
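
A quick sketch, assuming Moonshine is exposed through LocalAI's OpenAI-compatible transcription endpoint once installed from the gallery; the model name is a placeholder.

```python
import requests

# Assumption: Moonshine is served through the OpenAI-compatible
# transcription endpoint after installation from the backend gallery.
with open("sample.wav", "rb") as audio:
    resp = requests.post(
        "http://localhost:8080/v1/audio/transcriptions",
        files={"file": audio},
        data={"model": "moonshine"},  # placeholder model name
    )
resp.raise_for_status()
print(resp.json()["text"])
```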

🛠️ Fixes & Stability Improvements

🔧 Prevent BMI2 Crashes on AVX-Only CPUs

Fixed crashes on older Intel CPUs (Ivy Bridge, Sandy Bridge) that lack BMI2 instructions.

  • Now safely falls back to llama-cpp-fallback (SSE2 only).
  • No more EOF errors during model warmup.

✅ Ensures LocalAI runs smoothly on older hardware.


📊 Fix Swapped VRAM Usage on AMD GPUs

The rocm-smi output is now parsed correctly: used and total VRAM are no longer swapped.

  • Fixes misreported memory usage on dual-Radeon setups.
  • Handles HIP_VISIBLE_DEVICES properly (e.g., when using only discrete GPU).

🚀 The Complete Local Stack for Privacy-First AI


LocalAI

The free, Open Source OpenAI alternative. Drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.

Link: https://github.com/mudler/LocalAI


LocalAGI

Local AI agent management platform. Drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.

Link: https://github.com/mudler/LocalAGI


LocalRecall

RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Works alongside LocalAI and LocalAGI.

Link: https://github.com/mudler/LocalRecall


❤️ Thank You

LocalAI is a true FOSS movement — built by contributors, powered by community.

If you believe in privacy-first AI:

  • Star the repo
  • 💬 Contribute code, docs, or feedback
  • 📣 Share with others

Your support keeps this stack alive.


✅ Full Changelog


Full Changelog: mudler/LocalAI@v3.9.0...v3.10.0

View the full release notes at https://github.com/mudler/LocalAI/releases/tag/v3.10.0.


@github-actions github-actions bot added the `go` (Go use is a significant feature of the PR or issue) and `bump-formula-pr` (PR was created using `brew bump-formula-pr`) labels on Jan 18, 2026
@github-actions
Contributor

🤖 An automated task has requested bottles to be published to this PR.

Caution

Please do not push to this PR branch before the bottle commits have been pushed, as this results in a state that is difficult to recover from. If you need to resolve a merge conflict, please use a merge commit. Do not force-push to this PR branch.

@github-actions github-actions bot added the `CI-published-bottle-commits` (The commits for the built bottles have been pushed to the PR branch) label on Jan 18, 2026
@BrewTestBot BrewTestBot enabled auto-merge January 18, 2026 23:33
@BrewTestBot BrewTestBot added this pull request to the merge queue Jan 18, 2026
Merged via the queue into main with commit 44abfe9 Jan 18, 2026
22 checks passed
@BrewTestBot BrewTestBot deleted the bump-localai-3.10.0 branch January 18, 2026 23:42