Expose GGUF audio support in gguf_lib#6
Conversation
Wire the llama audio bridge through gguf_lib JNI/Kotlin surfaces and make the Android packaging/build configuration work for both arm64-v8a and x86_64. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR extends the gguf_lib Android/JNI bridge to support VLM (vision/audio) projector loading and VLM streaming generation, and adds a runtime toggle for “thinking” blocks in chat templates. It also updates the native build to use a configurable llama.cpp-android checkout, compiles VLM engine sources, and enables x86_64 builds.
Changes:
- Add JNI + Kotlin APIs for VLM projector lifecycle (load/release/info/default marker) and VLM streaming generation with media payloads.
- Add a “thinking enabled” flag that is passed into chat template application.
- Update native build configuration (configurable
LLAMA_DIR, VLM sources, and x86_64 ABI support with KleidiAI disabled on non-arm64).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| gguf_lib/src/main/java/com/dark/gguf_lib/GGUFNativeLib.kt | Adds JNI declarations for thinking toggle + VLM projector and VLM generation APIs. |
| gguf_lib/src/main/java/com/dark/gguf_lib/GGMLEngine.kt | Exposes higher-level Kotlin wrappers for thinking toggle, VLM projector management, and a VLM Flow generator. |
| gguf_lib/src/main/cpp/gguf_lib.cpp | Implements native thinking flag wiring + VLM projector loading, media ingestion, and VLM streaming generation. |
| gguf_lib/src/main/cpp/CMakeLists.txt | Makes LLAMA_DIR configurable, conditionally configures KleidiAI, and builds/links VLM engine sources. |
| gguf_lib/build.gradle.kts | Expands ABI filters to include x86_64. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 25c50c10d3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Prefix only missing media markers, lock projector state reads behind the generation mutex, and drop the unused VLM bridge sync helper. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Hey man closing this PR, as the STT and TTS support is added via onnx not gguf, as gguf is very memory heavy |
Summary
gguf_libJNI/Kotlin APIsarm64-v8aandx86_64Dependency