feat(conductor): add opt-in voice STT to Telegram bridge by Abeansits · Pull Request #309 · asheshgoplani/agent-deck

Abeansits · 2026-03-08T03:58:32Z

Summary

Opt-in voice STT: Telegram voice messages are transcribed locally using parakeet-mlx via a subprocess worker (stt_worker.py), crash-isolated from the bot event loop. No cloud API needed.
Disabled by default: Voice transcription activates only when the BRIDGE_STT_ENABLED=true env var is set. Without it, voice messages are silently ignored.
No hardcoded paths: stt_worker.py uses shutil.which('parakeet-mlx') to find the CLI on PATH, with PARAKEET_CLI_PATH env var as an explicit override.
File logging: Bridge now logs to ~/.agent-deck/conductor/bridge.log in addition to stdout.
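The lookup order described above (explicit override first, then PATH) could be sketched roughly as follows. This is not the code from stt_worker.py; the function name find_parakeet_cli and the injectable env parameter are hypothetical and exist only to make the sketch testable.

```python
import os
import shutil
from typing import Mapping, Optional


def find_parakeet_cli(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a path to the parakeet-mlx CLI, or None if it cannot be found.

    Hypothetical helper: the name and the `env` parameter are assumptions.
    """
    env = os.environ if env is None else env
    # An explicit override takes precedence over the PATH lookup.
    override = env.get("PARAKEET_CLI_PATH")
    if override:
        return override
    # shutil.which returns None when the executable is not on the search path.
    return shutil.which("parakeet-mlx", path=env.get("PATH"))
```

Returning None instead of raising lets the caller decide how to report a missing CLI.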

Changes

conductor/bridge.py: transcribe_voice() downloads voice files and calls stt_worker via async subprocess (60s timeout). handle_message() now handles message.voice when BRIDGE_STT_ENABLED=true.
conductor/stt_worker.py: New standalone STT worker that finds and invokes the parakeet-mlx CLI, reads output text files, and prints transcription to stdout.
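A minimal sketch of the async subprocess pattern with a timeout, assuming the worker prints its transcription to stdout as described. The names run_worker and demo are hypothetical, and the real transcribe_voice() may differ in how it handles stderr and exit codes.

```python
import asyncio
import sys
from typing import Optional, Sequence

STT_TIMEOUT_SECONDS = 60  # the timeout stated in the PR description


async def run_worker(argv: Sequence[str],
                     timeout: float = STT_TIMEOUT_SECONDS) -> Optional[str]:
    """Run a worker command; return its stripped stdout, or None on failure."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, _stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Kill and reap the child so a hung worker cannot leak processes.
        proc.kill()
        await proc.wait()
        return None
    if proc.returncode != 0:
        # A crash in the worker stays isolated from the caller's event loop.
        return None
    text = stdout.decode("utf-8", errors="replace").strip()
    return text or None


async def demo() -> Optional[str]:
    # Stand-in for the real worker: a child Python process that prints text.
    return await run_worker([sys.executable, "-c", "print('hello')"])
```

Because the worker runs in its own process, a crash or hang there surfaces as a None result rather than taking down the bot.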

Configuration

# Enable voice transcription
export BRIDGE_STT_ENABLED=true

# Optional: explicit path to parakeet-mlx CLI
export PARAKEET_CLI_PATH=/path/to/parakeet-mlx
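On the Python side, the opt-in flag might be read along these lines. The PR only states that the value true enables the feature; the helper name stt_enabled and the tolerance for surrounding whitespace and letter case are assumptions of this sketch.

```python
import os
from typing import Mapping, Optional


def stt_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when BRIDGE_STT_ENABLED is set to "true".

    Hypothetical helper; case-insensitive matching is an assumption.
    """
    env = os.environ if env is None else env
    # A missing variable yields "", so the feature stays off by default.
    return env.get("BRIDGE_STT_ENABLED", "").strip().lower() == "true"
```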

Test plan

With BRIDGE_STT_ENABLED=true, send a voice message via Telegram and verify transcription
Verify that a voice message produces a Transcribing... status and that the transcribed text is then forwarded to the conductor
Verify failed transcription returns [Could not transcribe voice message.]
With STT disabled (default), verify voice messages are silently ignored
Verify normal text message handling is unaffected

Replace Groq Whisper API with local parakeet-mlx (parakeet-tdt-0.6b-v3) for voice message transcription. Add TTS voice replies using macOS say + ffmpeg (OGG/Opus output), toggled via BRIDGE_TTS_ENABLED env var.

- stt_worker.py: standalone subprocess worker that normalizes audio to mono 16kHz WAV and runs parakeet-mlx inference, crash-isolated from the bot event loop
- bridge.py: transcribe_voice() calls stt_worker via async subprocess (60s timeout), generate_voice_reply() chains say + ffmpeg via async subprocesses with per-step timeouts and proper kill/cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Switch from importing parakeet-mlx as a Python library to invoking the parakeet-mlx CLI binary. This avoids import/dependency issues and is cleaner for subprocess-based isolation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Strip generate_voice_reply(), BRIDGE_TTS_ENABLED/BRIDGE_TTS_VOICE config, bot.send_voice() TTS response block, say+ffmpeg pipeline, and FSInputFile import. Voice-to-text (STT) remains intact.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

asheshgoplani

Interesting voice transcription feature! A few issues to address:

Hardcoded ffmpeg path: The code uses a hardcoded path for ffmpeg instead of using shutil.which('ffmpeg') as described in the PR description. Please use dynamic path resolution so it works across different systems and installation methods (e.g., Homebrew on macOS puts it in /opt/homebrew/bin/, Linux distros vary).
Error handling: The voice transcription path needs proper error handling for common failure cases: ffmpeg not installed, microphone not available, transcription API timeout, invalid audio format. Users should get clear error messages rather than cryptic tracebacks.
Needs rebase: This PR is based on the stale v0.27.0 base. Main was rolled back to v0.26.4 after the Go 1.25 incident, so this needs a rebase onto current main.

Please fix the hardcoded path issue and add error handling, then rebase onto current main.
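One way the two requested fixes could fit together is sketched below: resolve external tools dynamically and translate failures into short user-facing messages. The names TranscriptionError, require_tool, and user_message are hypothetical, and the detailed message wording is an assumption; only the fallback string comes from the test plan.

```python
import asyncio
import shutil
from typing import Callable, Optional


class TranscriptionError(Exception):
    """Hypothetical error whose message is safe to show to the user."""


def require_tool(name: str,
                 which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Resolve an executable on PATH, or raise a user-readable error."""
    path = which(name)
    if path is None:
        raise TranscriptionError(f"{name} is not installed or not on PATH.")
    return path


def user_message(exc: BaseException) -> str:
    """Map a failure to a short reply instead of exposing a traceback."""
    if isinstance(exc, TranscriptionError):
        return f"[Could not transcribe voice message: {exc}]"
    if isinstance(exc, (TimeoutError, asyncio.TimeoutError)):
        return "[Could not transcribe voice message: transcription timed out.]"
    # Fallback wording taken from the PR's test plan.
    return "[Could not transcribe voice message.]"
```

Calling require_tool("ffmpeg") before starting a transcription would report a missing install up front, independent of where Homebrew or a Linux distribution placed the binary.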

Abeansits and others added 3 commits March 7, 2026 16:10

asheshgoplani changed the title ~~feat(conductor): add local voice STT and TTS to Telegram bridge~~ feat(conductor): add opt-in voice STT to Telegram bridge Mar 17, 2026

asheshgoplani requested changes Apr 6, 2026

Abeansits wants to merge 3 commits into asheshgoplani:main from Abeansits:feat/voice-bridge-stt

Abeansits commented Mar 8, 2026 • edited by asheshgoplani

2 participants
