GH#14044: tighten voice-ai-models.md (147→126 lines) by alex-solovyev · Pull Request #14092 · marcusquinn/aidevops

alex-solovyev · 2026-03-30T10:36:42Z

Summary

- Deduplicate selection guidance: remove 4 redundant "Pick:" lines and the "Selection by Priority" table, both fully covered by the Decision Flow tree
- Move Decision Flow to top of doc (primacy effect: most actionable content first)
- Compress NVIDIA Riva table: remove redundant "Role" column, shorten headers
- Preserve Bark expressiveness note (laughter/music) in table cell

Details

Classification: Reference corpus (model selection reference). At 126 lines, well under the 300-line split threshold — tightening, not splitting.

What was removed (and why safe):

- 4 "Pick:" lines after each model table: every recommendation exists in the Decision Flow tree
- "Selection by Priority" table (6 rows × 4 cols): every cell's recommendation is in the Decision Flow tree
- "Role" column from Riva table: redundant with Component column (ASR=Speech-to-text, TTS=Text-to-speech)

Zero knowledge loss: All 28 models, all specs, all cross-references, GPU planning, decision paths, and the Bark expressiveness note retained.
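
A claim like "zero knowledge loss" can be spot-checked mechanically. The sketch below is one possible approach, not the check used for this PR: it collects every table cell from the old revision and reports the ones whose text no longer appears anywhere in the new revision. The function names and the sample tables are hypothetical, and a renamed header or an intentionally dropped column will also show up, so the output is a review aid rather than a pass/fail gate.

```python
import re


def table_cells(markdown: str) -> set[str]:
    """Collect the text of every cell in markdown pipe-table rows."""
    cells: set[str] = set()
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row
        for cell in line.strip("|").split("|"):
            cell = cell.strip()
            # Skip empty cells and separator cells such as ----- or :---:
            if not cell or re.fullmatch(r":?-+:?", cell):
                continue
            cells.add(cell)
    return cells


def dropped_cells(before: str, after: str) -> list[str]:
    """Return cells from `before` whose text appears nowhere in `after`."""
    return sorted(cell for cell in table_cells(before) if cell not in after)


# Hypothetical sample revisions, for illustration only.
OLD_DOC = """
| Model | VRAM |
|-------|------|
| NVIDIA Parakeet V3 | 2GB |
| Apple Speech | On-device |
"""

NEW_DOC = """
| Model | VRAM |
|-------|------|
| NVIDIA Parakeet V3 | 2GB |
"""

if __name__ == "__main__":
    for cell in dropped_cells(OLD_DOC, NEW_DOC):
        print("missing from new revision:", cell)
```

Because the check only looks at table cells, prose such as the removed "Pick:" lines is out of its reach and still needs a manual read.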

Runtime Testing

Risk: Low (docs/agent prompts only)
Level: self-assessed
Markdown lint: 0 errors

Closes #14044

aidevops.sh v3.5.455 plugin for OpenCode v1.3.7 with claude-opus-4-6 spent 2m and 7,249 tokens on this as a headless worker.

Summary by CodeRabbit

Documentation
- Introduced a decision framework for selecting between text-to-speech, speech-to-text, and conversational voice models based on cloning, latency, offline needs, accuracy, cost, and deployment requirements.
- Reorganized voice model comparison tables for improved clarity and navigation.
- Enhanced voice synthesis capability descriptions.

…ion guidance Remove 4 Pick: lines and Selection by Priority table, both fully redundant with the Decision Flow tree. Move Decision Flow to top (primacy effect). Compress Riva table (remove redundant Role column, shorten headers). Preserve Bark expressiveness note in table. Zero knowledge loss: all 28 models, all specs, all cross-references, GPU planning, and decision paths retained.

coderabbitai · 2026-03-30T10:36:49Z

Warning

Rate limit exceeded

@alex-solovyev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 22 minutes and 23 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 22 minutes and 23 seconds.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8306d9c9-cc7e-4282-b425-193555983dac

📥 Commits

Reviewing files that changed from the base of the PR and between b473f31 and 25eed29.

📒 Files selected for processing (1)

.agents/tools/voice/voice-ai-models.md

Walkthrough

Documentation restructuring of the Voice AI Models agent guide. A new Decision Flow section was added to route users to specific models based on use case (TTS, STT, or conversational S2S), while prior recommendation guidance was consolidated and the NVIDIA Riva Composable Pipelines section was simplified.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Voice AI Models Documentation `.agents/tools/voice/voice-ai-models.md` | Added Decision Flow routing section for TTS/STT/conversational S2S selection; removed prior "Pick:" recommendation sentences; simplified NVIDIA Riva Composable Pipelines table structure and condensed pipeline notes; clarified Bark (Suno) expressive capabilities (laughter/music). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

GH#13992: tighten voice-ai-models.md (154→147 lines) #14038 — Overlapping content and formatting changes to the same Voice AI Models documentation file, including decision flow restructuring and GPU/VRAM terminology adjustments.

Poem

📋 The voice doc takes flight,
Decision flows shining bright,
Riva trimmed, Bark sings clear—
Simpler guidance, no more fear! 🎵

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|------------|--------|-------------|------------|
| Linked Issues check | ❓ Inconclusive | The PR meets the core requirement from `#14044` for reference corpora: zero content loss across all 28 models, specs, decision paths, and Bark expressiveness note retained; however, the approach compresses content rather than extracting into chapter files as the issue guidance recommends. | Confirm whether content compression aligns with the 'reference corpora' strategy in `#14044`, which explicitly recommends chapter extraction and slim index (not inline compression) to preserve institutional knowledge structure. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: tightening documentation with specific metrics (147→126 lines), directly related to the changeset's line reduction and content restructuring. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope of `#14044` voice-ai-models.md simplification: removal of redundant 'Pick:' lines, Decision Flow relocation, NVIDIA Riva table compression, and Bark expressiveness note clarification. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

github-actions · 2026-03-30T10:37:17Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

SonarCloud: 0 bugs, 0 vulnerabilities, 1 code smells

Mon Mar 30 10:37:12 UTC 2026: Code review monitoring started
Mon Mar 30 10:37:13 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 1

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 1
VULNERABILITIES: 0

Generated on: Mon Mar 30 10:37:15 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

gemini-code-assist

Code Review

This pull request reorganizes the voice-ai-models.md documentation by moving the 'Decision Flow' section to the top for better visibility and removing several 'Pick' summary lines to reduce redundancy. While the reorganization improves the flow, feedback highlights that removing the summary for Local STT results in the loss of specific technical details—such as language support for Parakeet models and OS dependencies for Apple Speech—that are not currently reflected in the remaining tables.

gemini-code-assist · 2026-03-30T10:38:37Z

.agents/tools/voice/voice-ai-models.md

 | NVIDIA Parakeet V3 | 0.6B | 9.6 | Fastest | 2GB |
 | Apple Speech | Built-in | 9.0 | Fast | On-device |

-Pick: Large v3 Turbo → best balance. Parakeet V3 → multilingual speed (25 langs). Parakeet V2 → English-only. Apple Speech → zero-setup macOS 26+.


Removing this line causes a loss of important information about language support for Parakeet V2 (English-only) and Parakeet V3 (multilingual), as well as the OS dependency for Apple Speech. This information is not present in the local STT table above or in the new Decision Flow. To adhere to the goal of 'Zero knowledge loss' and the project's practice of maintaining detailed explanations for key technical components, please consider adding this information to the local STT table before removing this summary.

References

Restore detailed explanations for key concepts and technical details to ensure clarity and prevent knowledge loss.
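
The guidance carried by the removed "Pick:" line can be written down as a small lookup, which makes it easy to see exactly which facts have to survive somewhere in the doc. This is only a sketch: the function name, the flag names, and the order in which the conditions are checked are assumptions made for illustration; only the four model-to-reason pairings come from the removed line.

```python
def pick_local_stt(
    *,
    zero_setup_macos: bool = False,
    multilingual_speed: bool = False,
    english_only: bool = False,
) -> str:
    """Return a local STT suggestion based on the removed "Pick:" line.

    The precedence of the checks below is an assumption; the source line
    lists the four options without ranking them against each other.
    """
    if zero_setup_macos:
        # "Apple Speech -> zero-setup macOS 26+"
        return "Apple Speech"
    if multilingual_speed:
        # "Parakeet V3 -> multilingual speed (25 langs)"
        return "Parakeet V3"
    if english_only:
        # "Parakeet V2 -> English-only"
        return "Parakeet V2"
    # "Large v3 Turbo -> best balance"
    return "Large v3 Turbo"


if __name__ == "__main__":
    print(pick_local_stt())
    print(pick_local_stt(multilingual_speed=True))
```

Each branch depends on a detail (25 languages, English-only, macOS 26+) that the reviewer notes is absent from the local STT table, which is why the comment asks for those details to be moved into the table before the line is dropped.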

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

.agents/tools/voice/voice-ai-models.md (1)

63-63: Line 63: Replace ambiguous “stale” with a dated status note.

stale without a date/source is hard to operationalize. Prefer a dated qualifier (e.g., “VRAM estimate, last verified 2026-03”) or move it to a short “Notes” phrase.

Suggested doc tweak

-| Bark (Suno) | 1.0B | MIT | 13+ | Yes (prompt) | 6GB (stale, expressive: laughter/music) |
+| Bark (Suno) | 1.0B | MIT | 13+ | Yes (prompt) | ~6GB (estimate; last verified 2026-03, expressive: laughter/music) |

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.agents/tools/voice/voice-ai-models.md at line 63, Update the table row for
"Bark (Suno)" to remove the ambiguous "stale" qualifier and replace it with a
dated status note or move it to a Notes column; for example change "6GB (stale,
expressive: laughter/music)" to "6GB (VRAM estimate, last verified 2026-03;
expressive: laughter/music)" or relocate "VRAM estimate, last verified 2026-03"
into a Notes field—edit the line containing "Bark (Suno)" in voice-ai-models.md
accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5a518407-6aec-4260-9e47-e0f3fcab1d23

📥 Commits

Reviewing files that changed from the base of the PR and between 5eafefc and b473f31.

📒 Files selected for processing (1)

.agents/tools/voice/voice-ai-models.md

coderabbitai · 2026-03-30T10:40:36Z

.agents/tools/voice/voice-ai-models.md

+| Component | Model | Languages | NIM |
+|-----------|-------|-----------|-----|
+| ASR | Parakeet TDT 0.6B v2 | English | HF (research) |
+| ASR | Parakeet CTC 1.1B | English | Yes |
+| ASR | Parakeet RNNT 1.1B | 25 | Yes |
+| TTS | Magpie Multilingual | 17+ | Yes |
+| TTS | Magpie Zero-Shot | English+ | API |
+| Enhancement | StudioVoice | Any | Yes |
+| Translation | Riva Translate | 36 | Yes |
+
+Pipeline: `Audio -> [Parakeet ASR] -> [Any LLM] -> [Magpie TTS] -> Audio`. See `cloud-voice-agents.md`. Cascaded S2S (VAD+STT+LLM+TTS): see `speech-to-speech.md`.


⚠️ Potential issue | 🟠 Major

Line 110–120: This conflicts with the reference-corpus strategy from the linked issue.

The section is compressed in-place, but the issue objective for reference corpora asks for extraction into chapter files plus a slim index rather than content compression. Please align this section (and likely the doc structure) to that strategy before merge.

Proposed structural direction

-## NVIDIA Riva Composable Pipelines
-| Component | Model | Languages | NIM |
-...
-Pipeline: `Audio -> [Parakeet ASR] -> [Any LLM] -> [Magpie TTS] -> Audio`. See `cloud-voice-agents.md`. Cascaded S2S (VAD+STT+LLM+TTS): see `speech-to-speech.md`.
+## NVIDIA Riva Composable Pipelines
+High-level index only. Detailed matrix moved to `tools/voice/voice-ai-models-riva.md`.
+Pipeline overview: `Audio -> [Parakeet ASR] -> [Any LLM] -> [Magpie TTS] -> Audio`.
+See:
+- `tools/voice/voice-ai-models-riva.md` (full component matrix)
+- `tools/voice/cloud-voice-agents.md`
+- `tools/voice/speech-to-speech.md`

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.agents/tools/voice/voice-ai-models.md around lines 110 - 120, The Pipeline table and the inline "Pipeline: `Audio -> [Parakeet ASR] -> [Any LLM] -> [Magpie TTS] -> Audio`" / cascaded S2S note should not be compressed in-place; instead extract this section into separate chapter files for the reference-corpus strategy and replace it here with a slim index entry linking to those new files. Concretely, create new chapter docs (e.g., voice-models-parakeet.md, voice-pipeline-s2s.md) containing the full table and pipeline details, update this file's block (the table and the Pipeline/Cascaded S2S lines) to a short index summary pointing to those chapters, and ensure filenames/classes referenced in nav or TOC reflect the new chapter names so links resolve.
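
For readers unfamiliar with the "composable pipeline" wording, the `Audio -> [Parakeet ASR] -> [Any LLM] -> [Magpie TTS] -> Audio` line describes three stages chained so that each stage's output feeds the next. The sketch below shows only that chaining idea with placeholder stages. It does not call NVIDIA Riva, Parakeet, Magpie, or any LLM; every function name here is hypothetical.

```python
from typing import Any, Callable

Stage = Callable[[Any], Any]


def compose(*stages: Stage) -> Stage:
    """Chain stages left to right: the output of one is the input of the next."""
    def run(value: Any) -> Any:
        for stage in stages:
            value = stage(value)
        return value
    return run


# Placeholder stages standing in for real ASR, LLM, and TTS services.
def fake_asr(audio: bytes) -> str:
    """Pretend speech-to-text: treat the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    """Pretend language model: wrap the transcript in a canned reply."""
    return f"reply to: {text}"


def fake_tts(text: str) -> bytes:
    """Pretend text-to-speech: encode the reply back into bytes."""
    return text.encode("utf-8")


# Audio -> [ASR] -> [LLM] -> [TTS] -> Audio
voice_pipeline = compose(fake_asr, fake_llm, fake_tts)

if __name__ == "__main__":
    print(voice_pipeline(b"hello"))
```

Because each stage is just a callable, swapping one component (for example a different ASR model from the table) means replacing a single argument to `compose`, which is the property the "[Any LLM]" slot in the pipeline line points at.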

…T table Address Gemini review: the removed Pick line had details (Parakeet V3 25 langs, Parakeet V2 English-only, Apple Speech macOS 26+) not present in the table. Add these as parenthetical notes in the VRAM column.

github-actions · 2026-03-30T10:41:57Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

SonarCloud: 0 bugs, 0 vulnerabilities, 1 code smells

Mon Mar 30 10:41:53 UTC 2026: Code review monitoring started
Mon Mar 30 10:41:54 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 1

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 1
VULNERABILITIES: 0

Generated on: Mon Mar 30 10:41:56 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-03-30T10:42:16Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Nitpick about 'stale' wording: 'stale' refers to the project's unmaintained status, not to the VRAM estimate being out of date. Chapter file split: the file is 126 lines, well under the 300-line split threshold. Gemini's valid concern about lost Parakeet/Apple Speech details is addressed in a follow-up commit.

alex-solovyev added the origin:interactive label (Auto-created from TODO.md tag) Mar 30, 2026

gemini-code-assist bot reviewed Mar 30, 2026

coderabbitai bot previously requested changes Mar 30, 2026

alex-solovyev mentioned this pull request Mar 30, 2026

simplification: tighten agent doc Voice AI Models (.agents/tools/voice/voice-ai-models.md, 154 lines) #14044

Closed

marcusquinn mentioned this pull request Mar 30, 2026

[Supervisor:marcusquinn] 0 PRs, 23 assigned, 0 workers at 07:00 UTC #10944

Open

alex-solovyev mentioned this pull request Mar 30, 2026

[Supervisor:alex-solovyev] 20 PRs, 30 assigned, 3 workers at 14:31 UTC #14335

Open

marcusquinn merged commit 589ede9 into main Apr 1, 2026
16 checks passed

marcusquinn deleted the chore/GH-14044-tighten-voice-ai-models branch April 1, 2026 02:29

marcusquinn merged 2 commits into main from chore/GH-14044-tighten-voice-ai-models