feat(windows): add Push-to-Talk support with compound hotkeys #153

Merged
gabrielste1n merged 14 commits into main from feature/windows-push-to-talk-compound-hotkeys on Jan 27, 2026

@gabrielste1n
Collaborator

Summary

  • Add Push-to-Talk (hold-to-talk) mode for Windows using a native keyboard hook
  • Support compound hotkeys (e.g., Ctrl+Shift+F11) for PTT activation
  • Consolidate download scripts with shared utilities and add --force/version pinning

Windows Push-to-Talk

Enables hold-to-talk dictation on Windows via a native keyboard listener that detects key-up/down events (which Electron's globalShortcut cannot do).

| Platform | PTT Mode | Implementation |
|---|---|---|
| Windows | Hold-to-talk | Native windows-key-listener.exe |
| macOS/Linux | Tap-to-toggle | Electron globalShortcut (unchanged) |

Fallback: If the native listener is unavailable, Windows falls back to tap-to-toggle mode.
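
The platform split and the fallback rule can be summarized as a small decision function. This is a minimal sketch, not the code in this PR: `resolveActivationMode` and its parameters are hypothetical names, and the mode strings `"push"` and `"tap"` are assumed from the wording used elsewhere in this PR.

```javascript
// Hypothetical helper (not from the PR): pick the activation mode.
// platform: a value such as process.platform ("win32", "darwin", "linux").
// listenerAvailable: whether the native key listener binary could be started.
function resolveActivationMode(platform, listenerAvailable) {
  if (platform === "win32" && listenerAvailable) {
    return "push"; // hold-to-talk via the native listener
  }
  // macOS/Linux always land here, as does Windows without the binary.
  return "tap"; // tap-to-toggle via Electron globalShortcut
}
```

With this shape, the Windows fallback needs no special case: a missing binary simply yields the same mode as macOS/Linux.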

Download Script Improvements

  • Dynamic releases: Fetch latest release from GitHub API instead of hardcoded versions
  • Version pinning: LLAMA_CPP_VERSION, WHISPER_CPP_VERSION, WINDOWS_KEY_LISTENER_VERSION env vars
  • Force re-download: --force flag to update existing binaries
  • Shared utilities: Consolidated download-utils.js (removed ~110 lines of duplicate code)
  • Bug fix: Added redirect limit to fetchJson() to prevent infinite loops
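
For reviewers, the three behaviors above (redirect limit, version pinning, `--force`) might look roughly like the following. This is a sketch under stated assumptions rather than the contents of download-utils.js: the function names, the limit of 5, the `"latest"` sentinel, and the injected `request` function are all illustrative.

```javascript
// Assumed limit; the PR does not state the actual value.
const MAX_REDIRECTS = 5;

// Follow redirects up to a fixed limit, then parse the body as JSON.
// `request` is an injected function (url) => Promise<{ statusCode, headers, body }>
// so the logic can be exercised without network access.
async function fetchJsonSketch(url, request, redirectsLeft = MAX_REDIRECTS) {
  const res = await request(url);
  const isRedirect =
    res.statusCode >= 300 && res.statusCode < 400 && Boolean(res.headers.location);
  if (isRedirect) {
    if (redirectsLeft <= 0) {
      throw new Error(`Too many redirects while fetching ${url}`);
    }
    // Resolve relative Location headers against the current URL.
    const next = new URL(res.headers.location, url).toString();
    return fetchJsonSketch(next, request, redirectsLeft - 1);
  }
  if (res.statusCode !== 200) {
    throw new Error(`HTTP ${res.statusCode} for ${url}`);
  }
  return JSON.parse(res.body);
}

// A pinned version from the environment wins; otherwise use the latest release.
function resolveVersion(envName, env = process.env) {
  const pinned = env[envName];
  return pinned && pinned.trim() ? pinned.trim() : "latest";
}

// Skip the download when the binary already exists, unless --force is passed.
function shouldDownload(binaryExists, argv = process.argv.slice(2)) {
  return !binaryExists || argv.includes("--force");
}
```

Without the `redirectsLeft` counter, a server that redirects a URL to itself would recurse indefinitely, which is the failure mode the bug fix describes.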

Files Changed

New:

  • src/helpers/windowsKeyManager.js - Native listener process manager
  • resources/windows-key-listener.c - Native C keyboard hook
  • scripts/build-windows-key-listener.js - Local compilation fallback
  • scripts/download-windows-key-listener.js - Prebuilt binary download
  • scripts/lib/download-utils.js - Shared download utilities

Modified:

  • main.js - WindowsKeyManager integration
  • src/helpers/windowManager.js - PTT availability tracking
  • src/vite.config.mjs - Port alignment (5174)
  • Download scripts refactored to use shared utils

Test Plan

  • Windows: Test hold-to-talk with various hotkeys
  • Windows: Verify fallback to tap mode when binary unavailable
  • macOS/Linux: Confirm no regression in tap-to-toggle mode
  • Run npm run dev - verify app loads at localhost:5174
  • Run npm run download:whisper-cpp -- --current --force

Co-Authored-By: Craig von Chamier <craig@cvcwebsolutions.com>

craigvc and others added 6 commits January 25, 2026 13:37
Adds full Push-to-Talk functionality on Windows with support for compound
hotkeys like Ctrl+F11, Alt+F8, etc.

## Windows Push-to-Talk

- Add native Windows keyboard hook (windows-key-listener.c) using
  Low-Level Keyboard Hook (WH_KEYBOARD_LL) for reliable KEY_DOWN/KEY_UP
  detection
- Support compound hotkeys with modifiers (Ctrl, Alt, Shift) by tracking
  modifier key states with GetAsyncKeyState()
- Add WindowsKeyManager.js to manage the native binary lifecycle
- Emit KEY_UP when modifier keys are released while main key is held
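
The compound-hotkey behavior described above can be modeled in JavaScript for illustration. The real logic lives in the C hook and reads modifier state with GetAsyncKeyState(); this sketch instead tracks held keys in a set, and `parseHotkey`, `createHotkeyTracker`, and the key names are hypothetical.

```javascript
const MODIFIERS = new Set(["Ctrl", "Alt", "Shift"]);

// Split "Ctrl+Shift+F11" into one main key and its modifiers.
function parseHotkey(hotkey) {
  const parts = hotkey.split("+").map((p) => p.trim()).filter(Boolean);
  const modifiers = parts.filter((p) => MODIFIERS.has(p));
  const keys = parts.filter((p) => !MODIFIERS.has(p));
  if (keys.length !== 1) {
    throw new Error(`Expected exactly one main key in "${hotkey}"`);
  }
  return { key: keys[0], modifiers };
}

// Returns a function that consumes raw key events and yields
// "KEY_DOWN", "KEY_UP", or null (no change in hotkey state).
function createHotkeyTracker(hotkey) {
  const { key, modifiers } = parseHotkey(hotkey);
  const held = new Set();
  let active = false;
  return function onKeyEvent(name, isDown) {
    if (isDown) held.add(name);
    else held.delete(name);
    const modsHeld = modifiers.every((m) => held.has(m));
    // Activate only when the main key goes down with all modifiers held.
    if (!active && isDown && name === key && modsHeld) {
      active = true;
      return "KEY_DOWN";
    }
    // Deactivate when the main key OR any required modifier is released.
    if (active && (!held.has(key) || !modsHeld)) {
      active = false;
      return "KEY_UP";
    }
    return null; // includes auto-repeat of the main key while active
  };
}
```

The second branch is the case called out in the last bullet: releasing Shift while F11 is still down ends the hold, so recording stops even though the main key has not been released.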

## Hotkey Capture Fix

- Stop Windows key listener when entering hotkey capture mode so it
  doesn't interfere with recording new hotkeys
- Temporarily unregister globalShortcut during capture to prevent
  key event consumption
- Restart listener when exiting capture mode if in push mode
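
The capture-mode sequence above amounts to a pair of symmetric steps. A minimal sketch, assuming injected collaborators: `listener`, `shortcuts`, and `getActivationMode` are hypothetical stand-ins for the key listener manager, the globalShortcut registrations, and the stored mode setting.

```javascript
// Hypothetical controller (not from the PR) for entering and leaving
// hotkey capture mode without the listeners swallowing key events.
function createCaptureController({ listener, shortcuts, getActivationMode }) {
  let capturing = false;
  return {
    enterCapture() {
      if (capturing) return; // ignore repeated calls
      capturing = true;
      listener.stop(); // native hook must not see the new hotkey
      shortcuts.unregisterAll(); // globalShortcut must not consume it either
    },
    exitCapture() {
      if (!capturing) return;
      capturing = false;
      shortcuts.registerAll();
      // Only push mode relies on the native listener.
      if (getActivationMode() === "push") listener.start();
    },
  };
}
```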

## ELECTRON_RUN_AS_NODE Fix

- Add run-electron.js wrapper script to unset ELECTRON_RUN_AS_NODE
  environment variable (inherited from Claude Code and similar tools)
- Update dev:main script to use the wrapper
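
The core of such a wrapper is building a child environment without the inherited variable. The sketch below shows only that step; `buildElectronEnv` is a hypothetical name and the actual run-electron.js may be structured differently.

```javascript
// Return a copy of the environment with ELECTRON_RUN_AS_NODE removed,
// so Electron starts as an app rather than as a plain Node process.
function buildElectronEnv(env = process.env) {
  const childEnv = { ...env }; // copy, so the caller's env is untouched
  delete childEnv.ELECTRON_RUN_AS_NODE;
  return childEnv;
}

// Possible usage inside a wrapper script (left as a comment so this
// snippet has no side effects when loaded):
//
//   const { spawn } = require("child_process");
//   const electronPath = require("electron"); // resolves to the binary path under Node
//   spawn(electronPath, ["."], { stdio: "inherit", env: buildElectronEnv() });
```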

## Additional Improvements

- Add auto-start at login feature with UI in Settings
- Add debug logging for audio recording and whisper-server
- Fix whisper-server PATH to find companion DLLs on Windows
- Fix hotkeyManager to check isRegistered before returning success
- Change dev server port from 5174 to 5080
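
As an illustration of the whisper-server PATH fix, one common approach is to prepend the binary's directory to PATH in the environment passed to the child process, so Windows can resolve DLLs that sit next to the executable. This is a sketch of that approach, not necessarily what the commit does; `withBinaryDirOnPath` is a hypothetical name.

```javascript
const path = require("path");

// Return a copy of `env` whose PATH starts with the directory of `binaryPath`.
// Windows commonly spells the variable "Path", so the key is matched
// case-insensitively and the existing spelling is preserved.
function withBinaryDirOnPath(binaryPath, env = process.env, delimiter = path.delimiter) {
  const dir = path.dirname(binaryPath);
  const pathKey = Object.keys(env).find((k) => k.toLowerCase() === "path") || "PATH";
  const current = env[pathKey];
  return { ...env, [pathKey]: current ? `${dir}${delimiter}${current}` : dir };
}
```

The returned object would be passed as the `env` option when spawning whisper-server.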

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Custom Dictionary feature for improving transcription accuracy
  - New "Custom Dictionary" section in Settings sidebar
  - Words are passed as context hints to Whisper via initialPrompt
  - Supports adding phrases like "The word is Synty" for difficult words
- Fix windows-key-listener.exe not bundled in production builds
  - Add windows-key-listener* to electron-builder extraResources filter
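
To make the dictionary-to-prompt step concrete, here is a minimal sketch of turning dictionary entries into an initialPrompt string. The function name, the de-duplication, and the comma-separated format are assumptions for illustration; the commit does not specify how entries are joined.

```javascript
// Build a context hint for Whisper from user-supplied dictionary entries.
// Returns undefined when there is nothing to pass, so callers can omit
// the initialPrompt option entirely.
function buildInitialPrompt(dictionary) {
  const entries = dictionary.map((entry) => entry.trim()).filter(Boolean);
  const unique = [...new Set(entries)]; // drop exact duplicates, keep order
  return unique.length > 0 ? unique.join(", ") : undefined;
}
```

For example, `buildInitialPrompt(["Synty", "The word is Synty"])` yields `"Synty, The word is Synty"`.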

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…to-talk-compound-hotkeys

# Conflicts:
#	src/helpers/audioManager.js
gabrielste1n and others added 8 commits January 26, 2026 18:20
Resolved conflicts:
- CLAUDE.md: Keep both Windows Push-to-Talk and GNOME Wayland sections
- README.md: Keep all new features (Push-to-Talk, Custom Dictionary, GNOME Wayland)
- windowManager.js: Keep DEV_SERVER_PORT constant over hardcoded port

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…d tar.gz format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@gabrielste1n gabrielste1n merged commit e3969d0 into main Jan 27, 2026
0 of 4 checks passed
@gabrielste1n gabrielste1n deleted the feature/windows-push-to-talk-compound-hotkeys branch January 27, 2026 05:14