Open
Description
Problem
Running desktop hotkey dictation and API STT as separate daemons increases operational complexity and failure modes.
Proposal
Add a service mode to voxtype daemon that starts a local OpenAI-compatible transcription API in parallel with the existing daemon loop.
Goals
- Single daemon process for:
  - existing hotkey-driven dictation
  - local API transcription
- OpenAI-compatible endpoint: POST /v1/audio/transcriptions
- Health endpoint: GET /healthz
- Loopback-only by default (127.0.0.1)
- No built-in auth in v1 (assume trusted localhost)
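For illustration, a client could talk to the proposed endpoint roughly as follows. This is a minimal sketch, not part of the proposal: the port (8000) and the helper names are hypothetical, and the `file`, `model`, and `language` form fields and the `text` response key follow the OpenAI transcription API, on the assumption that voxtype would mirror them.

```python
import io
import json
import urllib.request
import uuid

# Assumption: the port is hypothetical; the issue only fixes the host to loopback.
BASE_URL = "http://127.0.0.1:8000"


def build_transcription_request(wav_bytes, filename="clip.wav",
                                model="whisper-1", language=None):
    """Build a multipart/form-data POST for /v1/audio/transcriptions.

    The model name is a placeholder borrowed from the OpenAI API; whether
    voxtype would require or ignore it is not stated in the issue.
    """
    boundary = uuid.uuid4().hex
    fields = {"model": model}
    if language:
        fields["language"] = language

    body = io.BytesIO()
    # Plain text form fields first.
    for name, value in fields.items():
        body.write(f"--{boundary}\r\n".encode())
        body.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        body.write(f"{value}\r\n".encode())
    # Then the audio file part.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: audio/wav\r\n\r\n")
    body.write(wav_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())

    return urllib.request.Request(
        BASE_URL + "/v1/audio/transcriptions",
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(wav_bytes, **kwargs):
    """Send the request and return the transcript (needs a running server)."""
    req = build_transcription_request(wav_bytes, **kwargs)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())["text"]
```

Because the request shape matches the OpenAI API, existing OpenAI client libraries could presumably be pointed at the local base URL instead of hand-building the multipart body.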
Language behavior
- General language support remains configurable.
- Default auto-detection set, constrained for local deployment and tests: de,en.
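One possible reading of the constrained auto set is that language detection still runs, but only candidates from the configured set are accepted. The sketch below illustrates that reading only; the names `parse_language_set` and `pick_language`, the score dictionary, and the fallback to the first configured language are all assumptions, not voxtype behavior.

```python
# Assumption: hypothetical names; the issue only states the default set "de,en".
DEFAULT_AUTO_LANGUAGES = ("de", "en")


def parse_language_set(value):
    """Parse a comma-separated setting such as "de,en" into a tuple of codes."""
    codes = [c.strip().lower() for c in value.split(",") if c.strip()]
    return tuple(dict.fromkeys(codes))  # drop duplicates, keep first-seen order


def pick_language(detected_scores, allowed=DEFAULT_AUTO_LANGUAGES):
    """Pick the highest-scoring detected language that is in the allowed set."""
    candidates = {lang: s for lang, s in detected_scores.items() if lang in allowed}
    if not candidates:
        return allowed[0]  # assumed fallback: first configured language
    return max(candidates, key=candidates.get)
```

With the default set, a clip scored as mostly Dutch but partly German would resolve to German, since Dutch is not among the allowed candidates.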
Non-goals (v1)
- Public network exposure
- Built-in auth/mTLS
- TTS or chat endpoints
Why
This enables a clean architecture where upstream voxtype is the single local speech daemon, while another service (for example Tabura) handles external auth/proxy and forwards requests to localhost.
Acceptance criteria
- voxtype daemon can run hotkey flow and API flow concurrently.
- API transcriptions work while daemon is active.
- Existing daemon behavior remains unchanged when service mode is disabled.
- Integration tests cover multipart transcription endpoint and concurrent operation.
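The concurrency and disabled-mode criteria could be exercised along these lines. This is a rough sketch using Python's standard library as a stand-in for the real daemon: `start_daemon`, `hotkey_loop`, and `check_health` are hypothetical names, the loop merely records timestamps instead of handling hotkeys, and only the health endpoint is modeled.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # silence per-request logging


def hotkey_loop(stop, ticks):
    """Stand-in for the existing daemon loop: records a timestamp per tick."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        time.sleep(0.01)


def start_daemon(service_mode, stop, ticks):
    """Start the loop, plus the API server only when service mode is enabled."""
    server = None
    if service_mode:
        # Loopback only; port 0 lets the OS pick a free port for the test.
        server = ThreadingHTTPServer(("127.0.0.1", 0), HealthHandler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
    loop = threading.Thread(target=hotkey_loop, args=(stop, ticks), daemon=True)
    loop.start()
    return server, loop


def check_health(port):
    """Return the HTTP status of GET /healthz on the loopback interface."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/healthz")
        return conn.getresponse().status
    finally:
        conn.close()
```

A test would start the daemon with service mode on, assert that `check_health` returns 200 while `ticks` keeps growing, then start it with service mode off and assert that no server was created.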