
Expose GGUF audio support in gguf_lib#6

Closed
Godzilla675 wants to merge 2 commits into Siddhesh2377:master from Godzilla675:feature/gguf-audio-sdk

Conversation


Godzilla675 commented Mar 12, 2026

Summary

  • Wire the new llama audio bridge through gguf_lib JNI/Kotlin APIs
  • Rebuild packaging so the shipped AAR works for both arm64-v8a and x86_64
  • Unblock ToolNeuron's end-to-end GGUF audio flow

Dependency

Wire the llama audio bridge through gguf_lib JNI/Kotlin surfaces and make the Android packaging/build configuration work for both arm64-v8a and x86_64.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment


Pull request overview

This PR extends the gguf_lib Android/JNI bridge to support VLM (vision/audio) projector loading and VLM streaming generation, and adds a runtime toggle for “thinking” blocks in chat templates. It also updates the native build to use a configurable llama.cpp-android checkout, compiles VLM engine sources, and enables x86_64 builds.

Changes:

  • Add JNI + Kotlin APIs for VLM projector lifecycle (load/release/info/default marker) and VLM streaming generation with media payloads.
  • Add a “thinking enabled” flag that is passed into chat template application.
  • Update native build configuration (configurable LLAMA_DIR, VLM sources, and x86_64 ABI support with KleidiAI disabled on non-arm64).
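The Kotlin-facing surface listed above might look roughly like the sketch below. Every name here (`VlmNativeBridge`, `EngineSketch`, the marker string) is an illustrative assumption, not the PR's actual signatures, and the native layer is stubbed so the sketch runs on a plain JVM without a compiled `.so`:

```kotlin
// Hypothetical sketch of the VLM surface this PR adds to gguf_lib.
// In the real library these would be `external fun` JNI declarations
// backed by gguf_lib.cpp; here an interface + stub stands in for them.
interface VlmNativeBridge {
    fun loadProjector(path: String): Boolean          // load an mmproj GGUF
    fun projectorInfo(): String                       // describe the loaded projector
    fun defaultMediaMarker(): String                  // marker spliced into prompts
    fun generateVlm(prompt: String, media: List<ByteArray>, onToken: (String) -> Unit)
    fun releaseProjector()                            // free native resources
}

// Stand-in for the JNI-backed implementation, so the sketch is runnable.
object FakeBridge : VlmNativeBridge {
    override fun loadProjector(path: String) = true
    override fun projectorInfo() = "stub-projector"
    override fun defaultMediaMarker() = "<media>"
    override fun generateVlm(prompt: String, media: List<ByteArray>, onToken: (String) -> Unit) {
        // A real bridge would stream tokens from llama.cpp; we emit a fixed phrase.
        "hello from vlm".split(" ").forEach(onToken)
    }
    override fun releaseProjector() {}
}

// Higher-level wrapper in the spirit of GGMLEngine: collect streamed tokens
// into a Sequence the caller can consume lazily (the PR uses a Flow instead).
class EngineSketch(private val native: VlmNativeBridge) {
    fun generate(prompt: String, media: List<ByteArray>): Sequence<String> = sequence {
        val tokens = mutableListOf<String>()
        native.generateVlm(prompt, media) { tokens.add(it) }
        yieldAll(tokens)
    }
}

fun main() {
    check(FakeBridge.loadProjector("/models/mmproj.gguf"))
    val out = EngineSketch(FakeBridge)
        .generate("Describe ${FakeBridge.defaultMediaMarker()}", emptyList())
        .joinToString(" ")
    println(out)  // prints "hello from vlm"
    FakeBridge.releaseProjector()
}
```

The callback-to-Sequence shape is only a stand-in for the PR's Flow-based generator, which would additionally need kotlinx.coroutines.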

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Summary per file:

  • gguf_lib/src/main/java/com/dark/gguf_lib/GGUFNativeLib.kt: Adds JNI declarations for the thinking toggle plus the VLM projector and VLM generation APIs.
  • gguf_lib/src/main/java/com/dark/gguf_lib/GGMLEngine.kt: Exposes higher-level Kotlin wrappers for the thinking toggle, VLM projector management, and a VLM Flow generator.
  • gguf_lib/src/main/cpp/gguf_lib.cpp: Implements native thinking-flag wiring plus VLM projector loading, media ingestion, and VLM streaming generation.
  • gguf_lib/src/main/cpp/CMakeLists.txt: Makes LLAMA_DIR configurable, conditionally configures KleidiAI, and builds/links the VLM engine sources.
  • gguf_lib/build.gradle.kts: Expands ABI filters to include x86_64.
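The build changes above can be sketched roughly as follows. `LLAMA_DIR` matches the PR description, but the default path and the `GGML_CPU_KLEIDIAI` option name are assumptions based on common llama.cpp conventions, not the PR's actual code:

```cmake
# Illustrative sketch only; paths and option names are assumptions.

# Let the llama.cpp-android checkout be overridden at configure time.
set(LLAMA_DIR "${CMAKE_SOURCE_DIR}/../llama.cpp-android" CACHE PATH
    "Path to the llama.cpp-android checkout")

# KleidiAI ships arm64 micro-kernels; disable it for other ABIs (e.g. x86_64).
if(NOT ANDROID_ABI STREQUAL "arm64-v8a")
    set(GGML_CPU_KLEIDIAI OFF CACHE BOOL "" FORCE)
endif()
```

On the Gradle side, the ABI expansion presumably amounts to adding `"x86_64"` alongside `"arm64-v8a"` in the `ndk { abiFilters }` block of build.gradle.kts.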



chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 25c50c10d3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Prefix only missing media markers, lock projector state reads behind the generation mutex, and drop the unused VLM bridge sync helper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Siddhesh2377 (Owner) commented:

Hey man, closing this PR: STT and TTS support is being added via ONNX rather than GGUF, since GGUF is very memory-heavy.



3 participants