Prefer ogg/wav in external STT recording for backend compatibility #11528

Open
JumpLink wants to merge 1 commit into danny-avila:main from faktenforum:feat/stt
@JumpLink

When using an external STT backend (e.g. Scaleway’s OpenAI-compatible Audio Transcriptions API), the backend often accepts only formats such as ogg and wav, not webm or mp4. The client’s MediaRecorder preferred audio/webm and audio/mp4, so recordings were rejected by those backends with “Invalid file format” (400).

Change: In useSpeechToTextExternal.ts, getBestSupportedMimeType() now tries Scaleway-compatible formats first. The new order is audio/ogg;codecs=opus, audio/ogg, audio/wav, then audio/webm, audio/webm;codecs=opus, audio/mp4. Each candidate is still checked with MediaRecorder.isTypeSupported(), so only formats the browser supports are used.
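
For reviewers, here is a minimal sketch of the selection order described above. It is not the code in the diff: `pickMimeType` and `PREFERRED_MIME_TYPES` are hypothetical names, the support check is passed in as a predicate so the sketch runs outside a browser, and returning an empty string when nothing matches is an assumption rather than the hook's actual fallback.

```typescript
// Preference order described in this PR: ogg/wav first, then webm/mp4.
const PREFERRED_MIME_TYPES: readonly string[] = [
  'audio/ogg;codecs=opus',
  'audio/ogg',
  'audio/wav',
  'audio/webm',
  'audio/webm;codecs=opus',
  'audio/mp4',
];

// Hypothetical helper (not the real getBestSupportedMimeType):
// returns the first candidate the predicate accepts.
// Assumption: an empty string signals "no candidate supported".
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = PREFERRED_MIME_TYPES,
): string {
  for (const mimeType of candidates) {
    if (isSupported(mimeType)) {
      return mimeType;
    }
  }
  return '';
}
```

In a browser, the predicate would be `(t) => MediaRecorder.isTypeSupported(t)`. A browser that supports both ogg and webm would then record ogg, while one that supports only webm and mp4 would still fall through to audio/webm.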

Testing: Verified with Scaleway STT (model whisper-large-v3) in Firefox; the recording uses ogg and transcription succeeds. OpenAI Whisper also accepts ogg and wav, so transcription on other compatible backends should be unaffected.
