feat(windows): add Push-to-Talk support with compound hotkeys #153
Merged
gabrielste1n merged 14 commits into main on Jan 27, 2026
Adds full Push-to-Talk functionality on Windows with support for compound hotkeys like Ctrl+F11, Alt+F8, etc.

## Windows Push-to-Talk
- Add native Windows keyboard hook (windows-key-listener.c) using Low-Level Keyboard Hook (WH_KEYBOARD_LL) for reliable KEY_DOWN/KEY_UP detection
- Support compound hotkeys with modifiers (Ctrl, Alt, Shift) by tracking modifier key states with GetAsyncKeyState()
- Add WindowsKeyManager.js to manage the native binary lifecycle
- Emit KEY_UP when modifier keys are released while main key is held

## Hotkey Capture Fix
- Stop Windows key listener when entering hotkey capture mode so it doesn't interfere with recording new hotkeys
- Temporarily unregister globalShortcut during capture to prevent key event consumption
- Restart listener when exiting capture mode if in push mode

## ELECTRON_RUN_AS_NODE Fix
- Add run-electron.js wrapper script to unset ELECTRON_RUN_AS_NODE environment variable (inherited from Claude Code and similar tools)
- Update dev:main script to use the wrapper

## Additional Improvements
- Add auto-start at login feature with UI in Settings
- Add debug logging for audio recording and whisper-server
- Fix whisper-server PATH to find companion DLLs on Windows
- Fix hotkeyManager to check isRegistered before returning success
- Change dev server port from 5174 to 5080

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
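For reviewers unfamiliar with the ELECTRON_RUN_AS_NODE issue: when that variable is set, Electron starts as a plain Node.js process instead of opening the app, so a dev launcher has to drop it from the child environment. The following is a minimal sketch of that idea, not the contents of the PR's run-electron.js; `buildElectronEnv` and `launchElectron` are hypothetical names chosen for illustration.

```javascript
// Illustrative sketch only; the real run-electron.js may differ.
const { spawn } = require("node:child_process");

// Return a copy of `env` without ELECTRON_RUN_AS_NODE, leaving the input untouched.
function buildElectronEnv(env) {
  const cleaned = { ...env };
  delete cleaned.ELECTRON_RUN_AS_NODE;
  return cleaned;
}

// Launch Electron with the cleaned environment (hypothetical helper, not called here).
function launchElectron(args = ["."]) {
  // When required from plain Node, the `electron` package exports the path to its binary.
  const electronPath = require("electron");
  const child = spawn(electronPath, args, {
    env: buildElectronEnv(process.env),
    stdio: "inherit",
  });
  // Mirror the child's exit code so npm scripts see failures.
  child.on("exit", (code) => process.exit(code ?? 0));
  return child;
}
```

Copying the environment rather than mutating `process.env` keeps the wrapper free of side effects on the parent process.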
- Add Custom Dictionary feature for improving transcription accuracy
  - New "Custom Dictionary" section in Settings sidebar
  - Words are passed as context hints to Whisper via initialPrompt
  - Supports adding phrases like "The word is Synty" for difficult words
- Fix windows-key-listener.exe not bundled in production builds
  - Add windows-key-listener* to electron-builder extraResources filter

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
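As a rough illustration of how dictionary entries could become an initialPrompt hint, the sketch below trims entries, drops blanks and case-insensitive duplicates, and joins the rest. The function name `buildInitialPrompt` and the comma-separated format are assumptions; the commit does not say how the prompt string is actually assembled.

```javascript
// Build a prompt hint string from user-supplied dictionary entries.
// Assumption: entries are joined with ", "; the PR may format them differently.
function buildInitialPrompt(entries) {
  const seen = new Set();
  const cleaned = [];
  for (const entry of entries) {
    const text = String(entry).trim();
    const key = text.toLowerCase();
    // Skip empty strings and entries already seen (ignoring case).
    if (text && !seen.has(key)) {
      seen.add(key);
      cleaned.push(text);
    }
  }
  return cleaned.join(", ");
}
```

A caller would then pass the result as the initialPrompt value, for example `{ initialPrompt: buildInitialPrompt(words) }`.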
…to-talk-compound-hotkeys

# Conflicts:
#	src/helpers/audioManager.js
Resolved conflicts:
- CLAUDE.md: Keep both Windows Push-to-Talk and GNOME Wayland sections
- README.md: Keep all new features (Push-to-Talk, Custom Dictionary, GNOME Wayland)
- windowManager.js: Keep DEV_SERVER_PORT constant over hardcoded port

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…yland, and hotkey capture
…inject into reasoning system prompts
…d tar.gz format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
## Summary
- … `Ctrl+Shift+F11`) for PTT activation
- … `--force`/version pinning

## Windows Push-to-Talk
Enables hold-to-talk dictation on Windows via a native keyboard listener that detects key-up/down events (which Electron's `globalShortcut` cannot do).

- … `windows-key-listener.exe` …

Fallback: If the native listener is unavailable, Windows falls back to tap-to-toggle mode.
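The compound-hotkey behaviour (start on the main key while all modifiers are held, stop as soon as the main key or any required modifier is released) can be modelled in a few lines. This is a simplified JavaScript sketch of the logic only; the PR implements it natively in C with WH_KEYBOARD_LL and GetAsyncKeyState(), and the names `parseHotkey` and `createPushToTalkTracker` are hypothetical.

```javascript
const MODIFIERS = new Set(["Ctrl", "Alt", "Shift"]);

// Split a string such as "Ctrl+Shift+F11" into required modifiers and the main key.
function parseHotkey(hotkey) {
  const parts = hotkey.split("+").map((p) => p.trim()).filter(Boolean);
  const modifiers = parts.filter((p) => MODIFIERS.has(p));
  const keys = parts.filter((p) => !MODIFIERS.has(p));
  if (keys.length !== 1) {
    throw new Error(`Expected exactly one main key in "${hotkey}"`);
  }
  return { modifiers, key: keys[0] };
}

// Track pressed keys and report when push-to-talk should start or stop.
function createPushToTalkTracker(hotkey) {
  const { modifiers, key } = parseHotkey(hotkey);
  const pressed = new Set();
  let active = false;

  // `type` is "down" or "up"; returns "KEY_DOWN", "KEY_UP", or null.
  return function handle(type, name) {
    if (type === "down") pressed.add(name);
    else pressed.delete(name);

    const satisfied = pressed.has(key) && modifiers.every((m) => pressed.has(m));
    // Start only on the main key going down while every modifier is held.
    if (!active && satisfied && type === "down" && name === key) {
      active = true;
      return "KEY_DOWN";
    }
    // Stop when the main key or any required modifier is released.
    if (active && !satisfied) {
      active = false;
      return "KEY_UP";
    }
    return null;
  };
}
```

Because the tracker stays active until the combination is broken, auto-repeated key-down events for the main key produce no extra KEY_DOWN.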
## Download Script Improvements
- … `LLAMA_CPP_VERSION`, `WHISPER_CPP_VERSION`, `WINDOWS_KEY_LISTENER_VERSION` env vars
- … `--force` flag to update existing binaries
- … `download-utils.js` (removed ~110 lines of duplicate code)
- … `fetchJson()` to prevent infinite loops

## Files Changed
New:
- `src/helpers/windowsKeyManager.js` - Native listener process manager
- `resources/windows-key-listener.c` - Native C keyboard hook
- `scripts/build-windows-key-listener.js` - Local compilation fallback
- `scripts/download-windows-key-listener.js` - Prebuilt binary download
- `scripts/lib/download-utils.js` - Shared download utilities

Modified:
- `main.js` - WindowsKeyManager integration
- `src/helpers/windowManager.js` - PTT availability tracking
- `src/vite.config.mjs` - Port alignment (5174)

## Test Plan
- `npm run dev` - verify app loads at localhost:5174
- `npm run download:whisper-cpp -- --current --force`

Co-Authored-By: Craig von Chamier <craig@cvcwebsolutions.com>
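For the version-pinning environment variables and `--force` flag mentioned under Download Script Improvements, a download script might read its options roughly as follows. This is a sketch under assumptions: `resolveDownloadOptions` is a hypothetical helper, and how the real scripts treat an unset version variable is not stated in this PR.

```javascript
// Resolve download settings from CLI arguments and environment variables.
// `versionVar` is the name of the env var to consult, e.g. "WHISPER_CPP_VERSION".
function resolveDownloadOptions(argv, env, versionVar) {
  const pinned = (env[versionVar] || "").trim();
  return {
    // True when the caller asked to replace binaries that already exist.
    force: argv.includes("--force"),
    // null means no pin was requested; the fallback behaviour is left to the caller.
    version: pinned || null,
  };
}
```

In a script this would typically be called as `resolveDownloadOptions(process.argv.slice(2), process.env, "WHISPER_CPP_VERSION")`.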