Add GGUF audio and microphone transcription support #88

Open
Godzilla675 wants to merge 4 commits into Siddhesh2377:re-write from Godzilla675:Fix-whisper-initial-download-issue

@Godzilla675 (Contributor) commented Mar 10, 2026

Summary

  • keep the GGUF filtering and quant parsing fixes from the original PR and add the trailing-descriptor regression test
  • add projector sidecar download, pairing, load, unload, and delete handling for multimodal/audio GGUF models
  • surface both file-based and in-app microphone transcription in chat through the rebuilt dual-ABI gguf_lib AAR
  • add staged microphone UX: tap to start, tap to stop, review/edit the prompt, then send through the existing GGUF audio path
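The staged microphone flow in the last bullet (tap to start, tap to stop, review/edit the prompt, then send) can be read as a small state machine. The sketch below is only an illustration of that reading: `MicStage`, `onTap`, `editPrompt`, `send`, and the default prompt text are hypothetical names and values, not taken from the PR's code.

```kotlin
// Hypothetical model of the staged mic UX; none of these names come from the PR.
sealed class MicStage {
    object Idle : MicStage()
    object Recording : MicStage()
    data class Review(val clipPath: String, val prompt: String) : MicStage()
}

// A tap toggles recording: Idle -> Recording, Recording -> Review (clip staged, prompt editable).
fun onTap(
    stage: MicStage,
    clipPath: String,
    defaultPrompt: String = "Transcribe this audio." // assumed placeholder text
): MicStage = when (stage) {
    MicStage.Idle -> MicStage.Recording
    MicStage.Recording -> MicStage.Review(clipPath, defaultPrompt)
    is MicStage.Review -> stage // taps are ignored while a clip is awaiting review
}

// Prompt edits only apply while a recorded clip is staged for review.
fun editPrompt(stage: MicStage, newPrompt: String): MicStage =
    if (stage is MicStage.Review) stage.copy(prompt = newPrompt) else stage

// Sending hands (clipPath, prompt) to the audio path and resets to Idle.
// Outside the Review stage there is nothing to send, so the payload is null.
fun send(stage: MicStage): Pair<MicStage, Pair<String, String>?> =
    if (stage is MicStage.Review) MicStage.Idle to (stage.clipPath to stage.prompt)
    else stage to null
```

Keeping the stages as pure values means the transitions can be unit-tested without a device; the actual recorder and GGUF call would sit behind `onTap` and `send`.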

Dependencies

Validation

  • ./gradlew --no-daemon --no-configuration-cache --max-workers=1 -Dorg.gradle.jvmargs='-Xmx2g -XX:MaxMetaspaceSize=512m -Dfile.encoding=UTF-8' -Pksp.incremental=false :app:testDebugUnitTest --tests com.dark.tool_neuron.repo.ModelStoreRepositoryTest
  • ./gradlew --no-daemon --no-configuration-cache --max-workers=1 -Dorg.gradle.jvmargs='-Xmx2g -XX:MaxMetaspaceSize=512m -Dfile.encoding=UTF-8' -Pksp.incremental=false :app:assembleDebug
  • Closes #57 (Download of Whisper-EN-Small fails)

Copilot AI review requested due to automatic review settings March 10, 2026 19:33
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copilot AI left a comment

Pull request overview

This PR refactors GGUF filename handling in ModelStoreRepository to improve model file filtering, quantization parsing, and model ID generation, and adds unit tests to validate the new helper logic.

Changes:

  • Added centralized GGUF helpers (isSupportedGgufFile, stripGgufSuffix, extractQuantType) and updated GGUF listing logic to use them.
  • Bumped model store cache version to invalidate stale cached listings after filtering/parsing changes.
  • Added ModelStoreRepositoryTest to cover GGUF extension handling, projection artifact filtering, suffix stripping, and quant parsing.
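For readers without the diff open, here is a rough sketch of what helpers with these names might look like. Only the three function names come from the review summary; the bodies, the marker list, and the quant regex are assumptions made for illustration and may differ from `ModelStoreRepository.kt`.

```kotlin
// Assumed marker list for projection artifacts; the real list is not shown in this thread.
private val PROJECTION_MARKERS = listOf("mmproj", "projector")

// Assumed quant pattern: Q4_K_M, Q8_0, IQ2_XXS, F16, BF16, F32, delimited by '-', '.' or '_'
// in front and by '-', '.' or end of name behind, so trailing descriptors are tolerated.
private val QUANT_REGEX = Regex(
    """(?:^|[-._])(I?Q\d+(?:_[A-Z0-9]+)*|BF16|F16|F32)(?=$|[-.])""",
    RegexOption.IGNORE_CASE
)

// True for .gguf files (any letter case) that are not projection artifacts.
fun isSupportedGgufFile(name: String): Boolean {
    if (!name.endsWith(".gguf", ignoreCase = true)) return false
    val lower = name.lowercase()
    return PROJECTION_MARKERS.none { it in lower }
}

// Removes a trailing ".gguf" regardless of case; other names are returned unchanged.
fun stripGgufSuffix(name: String): String =
    if (name.endsWith(".gguf", ignoreCase = true)) name.dropLast(".gguf".length) else name

// Returns the last quant-looking token in upper case, or null if none is found.
fun extractQuantType(name: String): String? =
    QUANT_REGEX.findAll(stripGgufSuffix(name)).lastOrNull()?.groupValues?.get(1)?.uppercase()
```

Under these assumptions, a name such as `model-Q4_K_M-imat.gguf` still yields `Q4_K_M`, which is the kind of trailing-descriptor case the PR summary mentions. A descriptor joined with an underscore instead of a hyphen would be absorbed into the token, so a production regex would likely need to be stricter.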

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| app/src/main/java/com/dark/tool_neuron/repo/ModelStoreRepository.kt | Introduces helper methods for GGUF filtering/quant parsing, updates model listing logic, bumps cache version. |
| app/src/test/java/com/dark/tool_neuron/repo/ModelStoreRepositoryTest.kt | Adds unit tests for the new GGUF helper behaviors and edge cases. |

@Siddhesh2377 (Owner)

Hey @Godzilla675, so far we don't support audio GGUF.

@Godzilla675 (Contributor, Author)

Hmm, OK, I'll look for another way to fix the issue.

@Godzilla675 (Contributor, Author)

@Siddhesh2377 Can I implement GGUF audio model support?

@Godzilla675 marked this pull request as draft March 10, 2026 20:17
@Siddhesh2377 (Owner)

Yes @Godzilla675
First add it in the custom llama.cpp repo.
Then call it in gguf_lib inside the ai systems repo.
Then call it in tool neuron, okay?
Please follow this pattern.

@Godzilla675 (Contributor, Author)

OK

@Godzilla675 (Contributor, Author)

@Siddhesh2377 While working on the audio GGUF support, I noticed that the app has no microphone transcription support. Shall I add it while I'm at it?

@Godzilla675 changed the title from "Fix whisper initial download issue" to "Add GGUF audio model support and fix Whisper download flow" Mar 12, 2026
@Godzilla675 marked this pull request as ready for review March 12, 2026 20:04
@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55262cc964

@Siddhesh2377 (Owner)

Hey @Godzilla675, yes, you can add it for sure!

- add RECORD_AUDIO permission and a MediaRecorder-based chat recorder
- keep file import as a fallback while staging recorded clips before send
- route microphone audio through the existing GGUF transcription path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Godzilla675 changed the title from "Add GGUF audio model support and fix Whisper download flow" to "Add GGUF audio and microphone transcription support" Mar 13, 2026
@Godzilla675 (Contributor, Author) commented Mar 13, 2026

@Siddhesh2377 I finished the implementation, if you want to review it.

@Godzilla675 (Contributor, Author)

@Siddhesh2377 Should I fix the merge conflicts now, or wait until you finish the changes you are currently making?

@Siddhesh2377 (Owner)

Hey @Godzilla675
I would say resolve the conflicts, and can you send me a working APK file?
Also, if you have Discord, please join the group or DM me, as it is an easy platform for communication.

@Godzilla675 (Contributor, Author)

@Siddhesh2377 OK, where is the Discord, though? Can you send me the link?

@Siddhesh2377 (Owner)

Yes, make a working APK release on your fork and send me the link on Discord.
https://discord.gg/V9vm9cwnw

@Godzilla675 (Contributor, Author)

OK.

Godzilla675 and others added 3 commits March 14, 2026 23:55
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore GGUF projector/audio integration after the re-write merge

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Recognize mmjproj as a projector marker alongside mmproj, vision-adapter, projector
- Score mmjproj candidates in sidecar auto-download selection
- Broaden user-facing projector readiness message to mmproj/mmjproj
- Add unit tests for mmjproj filtering and case-insensitive detection
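
The marker handling described in this commit message could look roughly like the sketch below. The four marker strings are the ones listed above; the function names and the scoring weights are hypothetical and only illustrate case-insensitive detection plus a simple candidate preference, not the PR's actual sidecar selection logic.

```kotlin
// Marker strings as listed in the commit message.
private val SIDECAR_MARKERS = listOf("mmproj", "mmjproj", "vision-adapter", "projector")

// Case-insensitive check: a .gguf file whose name contains any projector marker.
fun isProjectorFile(name: String): Boolean {
    val lower = name.lowercase()
    return lower.endsWith(".gguf") && SIDECAR_MARKERS.any { it in lower }
}

// Hypothetical scoring: mmproj/mmjproj names rank above the generic markers,
// and anything that is not a projector file scores 0.
fun scoreSidecar(name: String): Int {
    val lower = name.lowercase()
    return when {
        !isProjectorFile(name) -> 0
        "mmproj" in lower || "mmjproj" in lower -> 2
        else -> 1
    }
}

// Picks the best-scoring projector candidate from a repository file listing, if any.
fun pickSidecar(files: List<String>): String? =
    files.filter { scoreSidecar(it) > 0 }.maxByOrNull(::scoreSidecar)
```

Lower-casing the name once before matching covers upper-case and mixed-case file names, which is the behavior the new unit tests are said to check.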
@Godzilla675 (Contributor, Author)

@Siddhesh2377 I added mmproj support. You can download the APK here: https://github.com/Godzilla675/ToolNeuron/releases/tag/toolneuron-fix-whisper-test-apk
