feat: add NVIDIA Parakeet model support via sherpa-onnx #146
Closed
gabrielste1n wants to merge 13 commits into main
Implement cross-platform support for NVIDIA's Parakeet TDT ASR models using the sherpa-onnx runtime. Parakeet provides 50x faster transcription than Whisper with comparable accuracy.

New features:
- Add parakeet-tdt-0.6b-v2 (English) and v3 (25 languages) models
- Create ParakeetServerManager for CLI-based transcription
- Add sherpa-onnx binary download script for all platforms
- Enable NVIDIA Parakeet tab in TranscriptionModelPicker UI
- Support model download, delete, and selection

Architecture follows existing patterns:
- parakeet.js mirrors whisper.js for model management
- parakeetServer.js uses sherpa-onnx-offline CLI
- IPC handlers follow whisper handler patterns
- useModelDownload hook extended for 'parakeet' type
…urbo model lookup failures
…t-cross-platform-l5JKr
…injection escaping
…d listener that never fired after awaited loadMainWindow()
Resolve conflicts in CHANGELOG.md, CLAUDE.md, electron-builder.json, and package.json by combining Parakeet/sherpa-onnx additions with Windows push-to-talk, custom dictionary, and shared download utilities from main.
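The listener problem named in the commits above (a listener that never fired after the awaited loadMainWindow()) is easiest to see in isolation. The following is a minimal sketch, not the PR's code: it assumes a plain Node EventEmitter as a stand-in for the window's webContents, and a hypothetical loadMainWindow() that resolves only after the page has finished loading. The helper names createFakeWindow, registerAfterLoad, and registerBeforeLoad are invented for illustration.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical stand-in for a window: emits 'did-finish-load' exactly once,
// and loadMainWindow() resolves right after that event has been emitted.
function createFakeWindow() {
  const webContents = new EventEmitter();
  const loadMainWindow = () =>
    new Promise((resolve) => {
      setImmediate(() => {
        webContents.emit("did-finish-load");
        resolve();
      });
    });
  return { webContents, loadMainWindow };
}

// Problematic ordering: the listener is attached after the awaited load,
// so the one-time event has already been emitted and the callback never runs.
async function registerAfterLoad() {
  const { webContents, loadMainWindow } = createFakeWindow();
  let fired = false;
  await loadMainWindow();
  webContents.once("did-finish-load", () => {
    fired = true;
  });
  // Give any late event one more turn of the event loop before checking.
  await new Promise((resolve) => setImmediate(resolve));
  return fired;
}

// Safe ordering: attach the listener first, then start the load.
async function registerBeforeLoad() {
  const { webContents, loadMainWindow } = createFakeWindow();
  let fired = false;
  webContents.once("did-finish-load", () => {
    fired = true;
  });
  await loadMainWindow();
  return fired;
}
```

In this sketch registerAfterLoad() resolves to false and registerBeforeLoad() resolves to true. An alternative with the same effect is to skip the listener entirely and run the initialization code directly after the awaited load.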
Summary
Add NVIDIA Parakeet speech-to-text support via sherpa-onnx as an alternative to whisper.cpp, along with several bug fixes for hotkey initialization and model management.
Changes
NVIDIA Parakeet Integration
- parakeet.js for model management: downloading, verifying, and resolving bundled sherpa-onnx binaries per platform/arch
- parakeetServer.js as a sherpa-onnx CLI wrapper that spawns the transcription process
- parakeetWsServer.js implementing the sherpa-onnx WebSocket protocol for streaming audio to the server
- ffmpegUtils.js for audio format conversion (WebM/Opus → 16kHz mono PCM WAV) required by sherpa-onnx
- serverUtils.js with port-finding and server lifecycle utilities
- scripts/download-sherpa-onnx.js build script to fetch platform-specific sherpa-onnx binaries
- modelDirUtils.js for shared model directory path resolution
- Parakeet model (parakeet-tdt-0.6b-v3, multilingual, ~680MB) in the centralized model registry
UI & Settings
- TranscriptionModelPicker.tsx to support selecting between Whisper and Parakeet models with separate download/status tracking
- Separate model settings (parakeetModel vs whisperModel) so each engine retains its own selection
- useSettings.ts and useModelDownload.ts hooks for Parakeet model state
- electron.ts for all new IPC APIs
Hotkey Fixes
- did-finish-load listener was registered after loadMainWindow() had already resolved, so it never fired
Model Management Fixes
- fileName field for model paths instead of deriving filenames from model names
- modelRegistryData.json as single source of truth, removing hardcoded model lists from multiple files
Audio Pipeline
- audioManager.js to route transcription to either Whisper or Parakeet based on the selected engine
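The engine routing described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of audioManager.js: the settings key engine, the helper names resolveEngine and transcribe, and the transcribers map are all hypothetical. Only parakeetModel and whisperModel come from the PR description.

```javascript
// Hypothetical settings shape: `engine` is an assumed key; `parakeetModel`
// and `whisperModel` are the per-engine selections named in the PR.
function resolveEngine(settings) {
  // Anything other than "parakeet" falls back to Whisper (an assumed default).
  const engine = settings.engine === "parakeet" ? "parakeet" : "whisper";
  const model =
    engine === "parakeet" ? settings.parakeetModel : settings.whisperModel;
  if (!model) {
    throw new Error(`No model selected for engine "${engine}"`);
  }
  return { engine, model };
}

// Routes one recording to the transcriber registered for the chosen engine.
// `transcribers` is a hypothetical map such as
// { whisper: async (audio, model) => text, parakeet: async (audio, model) => text }.
async function transcribe(audioBuffer, settings, transcribers) {
  const { engine, model } = resolveEngine(settings);
  const run = transcribers[engine];
  if (typeof run !== "function") {
    throw new Error(`No transcriber registered for engine "${engine}"`);
  }
  return run(audioBuffer, model);
}
```

Keeping the two model settings separate means switching engines back and forth never overwrites the other engine's selection, which is the behavior the UI & Settings section describes.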