Privacy-First Voice-to-Text with AI Enhancement for macOS
- Two speech recognition engines:
- Parakeet (default) - Fast and accurate, runs on the Apple Neural Engine, supports 25 European languages
- WhisperKit - Multiple model sizes (216MB to 955MB) for different accuracy/speed trade-offs
- Support for multiple languages with auto-detection
- Real-time waveform visualization during recording
- Audio file transcription - Import MP3, WAV, M4A, FLAC, and more
- Local LLM processing for grammar correction and text improvement
- Multiple AI models including Gemma, Llama, Qwen, and Mistral
- Custom prompts for different use cases:
- Grammar fixing and email formatting
- Language translation
- Custom text processing workflows
- 100% local processing - your voice never leaves your device
- No cloud services, no data collection
- Open source - audit the code yourself
- Secure sandboxed environment
- Live meeting transcription with dual-channel audio (microphone + system audio)
- Automatic speaker separation into "Me" vs "Others" with accurate timestamps
- AI-generated summaries, action items, decisions, and follow-ups
- Post-meeting Q&A: ask questions about any meeting and get AI-powered answers
- Auto-detection of Zoom, Teams, Google Meet, Webex, Slack, Discord, and FaceTime
- Configurable meeting hotkey (Control+M by default) to start/stop recording
- Export meetings as Markdown or copy transcripts and summaries
- Beautiful detail view with Summary, Transcript, Actions, and Q&A tabs
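The Markdown export above can be approximated from outside the app. A minimal sketch, assuming a hypothetical `speaker<TAB>text` transcript layout (WhisperClip's real export format is not documented here):

```shell
# Illustrative only: convert a tab-separated "speaker<TAB>text" transcript
# into a Markdown bullet list, one line per utterance.
transcript_to_markdown() {
  echo "# Meeting Transcript"
  echo ""
  while IFS="$(printf '\t')" read -r speaker text; do
    echo "- **$speaker**: $text"
  done
}

printf 'Me\tLet us ship on Friday.\nOthers\tAgreed.\n' | transcript_to_markdown
```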
- Global hotkey support (Option+Space by default)
- Hold to Talk mode - hold hotkey to record, release to stop
- Auto-copy to clipboard
- Auto-paste functionality
- Auto-enter for instant message sending
- Menu bar integration with background operation (runs without a visible window)
- Start minimized option
- Auto-stop recording after 10 minutes
- Transcription history - Browse and search past transcriptions
- Beautiful dark-themed interface with modern sidebar navigation
- Real-time recording visualization with animated effects
- Recording overlay with customizable position
- Comprehensive onboarding guide
- Easy model management and downloads
- Customizable shortcuts and prompts
- Drag-and-drop audio file support
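Auto-paste and auto-enter work by sending synthetic keystrokes, which is why the app needs Accessibility permission (see Requirements). A hedged sketch of the equivalent AppleScript one-liner; this is illustrative, not WhisperClip's actual implementation:

```shell
# "Auto-paste" amounts to synthesizing a Cmd+V keystroke. On macOS you
# would run the osascript command below (it requires Accessibility
# permission for the calling terminal).
paste_script='tell application "System Events" to keystroke "v" using command down'

# Printed here so the sketch is runnable anywhere; on a Mac, execute:
#   osascript -e "$paste_script"
echo "osascript -e '$paste_script'"
```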
- Apple Silicon Mac (M1, M2, M3, or later)
- macOS 14.0 or later
- 20GB free disk space (for AI models)
- Microphone access permission
- Accessibility permissions (for global hotkeys)
- Apple Events permissions (for clipboard operations)
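A quick way to sanity-check the first two requirements before installing. The helper names are illustrative; the thresholds come from the list above:

```shell
# Hedged pre-install check. check_arch / check_macos_version are
# illustrative helpers, not part of WhisperClip.
check_arch() {
  case "$1" in
    arm64) echo "Apple Silicon: OK" ;;
    *)     echo "unsupported architecture: $1" ;;
  esac
}

check_macos_version() {
  # Requires macOS 14.0 or later; compare the major version only.
  major="${1%%.*}"
  if [ "$major" -ge 14 ]; then
    echo "macOS $1: OK"
  else
    echo "macOS $1 is too old (need 14.0+)"
  fi
}

# On a real Mac you would call:
#   check_arch "$(uname -m)"
#   check_macos_version "$(sw_vers -productVersion)"
check_arch arm64
check_macos_version 14.5
```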
- Visit whisperclip.com
- Download the latest release
- Drag WhisperClip.app to your Applications folder
- Follow the setup guide for permissions
# Clone the repository
git clone https://github.com/cydanix/whisperclip.git
cd whisperclip
# Build the app
./build.sh
# For development
./local_build.sh Debug
./local_run.sh Debug

- Launch WhisperClip from Applications or menu bar
- Grant permissions when prompted (microphone, accessibility)
- Download AI models through the setup guide
- Press Option+Space (or click Record) to start recording
- Press again to stop - text will be automatically copied to clipboard
- Change hotkey: Settings → Hotkey preferences
- Add custom prompts: Settings → Prompts → Add new prompt
- Switch AI models: Setup Guide → Download different models
- Configure auto-actions: Settings → Enable auto-paste/auto-enter
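As an example of a custom prompt, here is what a grammar-fixing entry might look like. The wording is an assumption; WhisperClip's exact prompt format is whatever the Prompts editor accepts:

```shell
# Hypothetical custom prompt text for a "fix grammar" workflow.
grammar_prompt() {
  cat <<'EOF'
Fix grammar, punctuation, and capitalization in the following transcription.
Keep the original wording and tone; do not add or remove content.
EOF
}

grammar_prompt
```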
Parakeet (default, recommended)
- Fast transcription using Apple Neural Engine
- 25 European languages supported
- Optimized for Apple Silicon
WhisperKit (alternative)
- OpenAI Whisper Small (216MB) - Fast, good quality
- OpenAI Whisper Large v3 Turbo (632MB) - Best balance
- Distil Whisper Large v3 Turbo (600MB) - Optimized speed
- OpenAI Whisper Large v2 Turbo (955MB) - Maximum accuracy
- Gemma 2 (2B/9B) - Google's efficient models
- Llama 3/3.2 (3B/8B) - Meta's powerful models
- Qwen 2.5/3 (1.5B-8B) - Alibaba's multilingual models
- Mistral 7B - Mistral AI's high-quality model
- Phi 3.5 Mini - Microsoft's compact model
- DeepSeek R1 - Advanced reasoning model
All models run locally using MLX for Apple Silicon optimization.
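The WhisperKit sizes above map to a simple speed/accuracy choice. A hypothetical helper (not part of WhisperClip) that encodes that mapping:

```shell
# Illustrative: pick a WhisperKit model by priority, using the names and
# sizes listed in this README.
pick_whisper_model() {
  case "$1" in
    speed)    echo "OpenAI Whisper Small (216MB)" ;;
    balanced) echo "OpenAI Whisper Large v3 Turbo (632MB)" ;;
    accuracy) echo "OpenAI Whisper Large v2 Turbo (955MB)" ;;
    *)        echo "unknown priority: $1" ;;
  esac
}

pick_whisper_model balanced
```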
WhisperClip is designed with privacy as its cornerstone:
- Local Processing Only: All voice recognition and AI processing happens on your device
- No Network Requests: The only network traffic is downloading models from Hugging Face
- No Analytics: No usage tracking, no telemetry, no data collection
- Open Source: Full transparency - inspect the code yourself
- Sandboxed: Runs in Apple's secure app sandbox
- Encrypted Storage: AI models stored securely on device
Sources/
├── WhisperClip.swift            # Main app entry point
├── ContentView.swift            # Main UI with sidebar navigation
├── MicrophoneView.swift         # Voice recording interface
├── FileTranscriptionView.swift  # Audio file transcription
├── HistoryView.swift            # Transcription history browser
├── SharedViews.swift            # Shared UI components
├── AudioRecorder.swift          # Voice recording logic
├── VoiceToText*.swift           # Transcription engines (Parakeet, WhisperKit)
├── LLM*.swift                   # AI text enhancement
├── TranscriptionHistory.swift   # History data management
├── ModelStorage.swift           # Model management
├── SettingsStore.swift          # User preferences
├── HotkeyManager.swift          # Global shortcuts
├── MeetingSession.swift         # Meeting lifecycle orchestration
├── MeetingRecorder.swift        # Dual-channel audio capture & transcription
├── MeetingAI.swift              # AI summaries, Q&A, and analysis
├── MeetingDetector.swift        # Meeting app auto-detection
├── MeetingStorage.swift         # Meeting notes persistence
├── MeetingModels.swift          # Meeting data models
├── MeetingNotesView.swift       # Meeting list & live recording UI
├── MeetingDetailView.swift      # Meeting detail with tabs
└── MeetingWaveformView.swift    # Audio waveform visualization
- FluidAudio: Parakeet speech recognition with Apple Neural Engine
- WhisperKit: Argmax's Whisper implementation optimized for Apple Silicon
- MLX: Apple Silicon ML framework
- MLX-Swift-Examples: LLM implementations
- Hub: Hugging Face model downloads
# Debug build
./local_build.sh Debug
# Release build with code signing
./build.sh
# Notarization (requires Apple Developer account)
./notarize.sh

We welcome contributions! Please see our contributing guidelines:
- Fork the repository
- Create a feature branch:
git checkout -b feature/amazing-feature
- Make your changes and add tests
- Commit your changes:
git commit -m 'Add amazing feature'
- Push to branch:
git push origin feature/amazing-feature
- Open a Pull Request
- New AI model integrations
- UI/UX improvements
- Performance optimizations
- Language support
- Accessibility features
- Documentation improvements
WhisperClip is licensed under the MIT License - see the LICENSE file for details.
This means you can:
- ✅ Use commercially
- ✅ Modify and distribute
- ✅ Use privately
- ✅ Fork and create derivatives
Attribution required: Please include the original license notice.
WhisperClip is developed by Cydanix LLC.
- Website: whisperclip.com
- Support: support@cydanix.com
- Version: 1.0.50
- Apple - MLX framework
- Argmax - WhisperKit
- Senstella - FluidAudio and Parakeet models
- OpenAI - Original Whisper models
- Hugging Face - Model hosting and Hub library
- ML Community - Open source AI models (Gemma, Llama, Qwen, etc.)
Made with ❤️ for privacy-conscious users
⭐ Star this repo if you find it useful!