
Update README for modular SDK architecture and remove Kokoro TTS#11

Merged
Blaizzy merged 2 commits into main from copilot/update-readme-sdk-architecture
Jan 31, 2026

Conversation

Contributor

Copilot AI commented Jan 31, 2026

The README documented the old monolithic example app structure and referenced Kokoro TTS, which is no longer part of the SDK. Updated to reflect the new modular architecture introduced in PR #1.

Changes

Removed

  • Kokoro TTS section and all references
  • Manual file copying instructions
  • nSpeak framework mentions
  • Marvis model documentation (superseded by modular architecture)

Added

  • Architecture: Documented 6 modules (Core, Codecs, TTS, STT, STS, UI)
  • Installation: Swift Package Manager integration with selective imports
  • Quick Start: Code examples for TTS, STT, and streaming generation
  • Supported Models: Table listing Soprano, Qwen3, LlamaTTS, GLMASR with HuggingFace repos
  • Advanced Usage: Custom parameters, codec usage, multi-voice selection
  • Requirements: Platform/SDK minimums and Apple Silicon recommendation

Example Usage

The new README guides developers to use the modular SDK:

```swift
// Import only what you need
import MLXAudioTTS
import MLXAudioCore

// Models download automatically from HuggingFace
let model = try await SopranoModel.fromPretrained("mlx-community/Soprano-80M-bf16")

// Generate with typed parameters
let audio = try await model.generate(
    text: "Hello from MLX Audio Swift!",
    parameters: GenerateParameters(
        maxTokens: 200,
        temperature: 0.7,
        topP: 0.95
    )
)
```
Original prompt

Summary

Update the README.md to match the new modular Swift SDK architecture introduced in PR #1 and remove all references to Kokoro TTS.

Changes Required

1. Remove Kokoro TTS References

  • Remove the entire "Kokoro" section (lines 14-22)
  • Remove Kokoro from the list of required files (line 8)
  • Remove the mention of "nSpeak framework is embedeed for Kokoro already" (line 12)

2. Replace with New SDK Documentation

Replace the old README content with comprehensive documentation for the new modular SDK:

Header Section

  • Change title to: # MLX Audio Swift
  • Add tagline: "A modular Swift SDK for audio processing with MLX on Apple Silicon"
  • Add badges for platforms (macOS 14+, iOS 17+)

Architecture Section

Add a section explaining the modular architecture:

```markdown
## Architecture

MLXAudio follows a modular design allowing you to import only what you need:

- **MLXAudioCore**: Base types, protocols, and utilities
- **MLXAudioCodecs**: Audio codec implementations (SNAC, Vocos, Mimi)
- **MLXAudioTTS**: Text-to-Speech models (Soprano, Qwen3, LlamaTTS)
- **MLXAudioSTT**: Speech-to-Text models (GLMASR, Whisper)
- **MLXAudioSTS**: Speech-to-Speech (future)
- **MLXAudioUI**: SwiftUI components for audio interfaces
```

Installation Section

Add Swift Package Manager installation instructions:

```swift
dependencies: [
    .package(url: "https://github.com/Blaizzy/mlx-audio-swift.git", branch: "main")
]

// Import only what you need
.product(name: "MLXAudioTTS", package: "mlx-audio-swift"),
.product(name: "MLXAudioCore", package: "mlx-audio-swift")
```
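For context, the two fragments above could be combined into a full manifest roughly as follows. This is a sketch, not the SDK's documented manifest: the package name, target name, and platform minimums here are assumptions based on the Requirements section, and only the package URL and product names come from the snippets above.

```swift
// swift-tools-version:5.9
// Hypothetical Package.swift combining the dependency and product fragments.
// "MyAudioApp" and the platform versions are illustrative assumptions.
import PackageDescription

let package = Package(
    name: "MyAudioApp",
    platforms: [.macOS(.v14), .iOS(.v17)],
    dependencies: [
        .package(url: "https://github.com/Blaizzy/mlx-audio-swift.git", branch: "main")
    ],
    targets: [
        .executableTarget(
            name: "MyAudioApp",
            dependencies: [
                // Import only what you need
                .product(name: "MLXAudioTTS", package: "mlx-audio-swift"),
                .product(name: "MLXAudioCore", package: "mlx-audio-swift")
            ]
        )
    ]
)
```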

Quick Start Section

Add example code for:

Text-to-Speech:

```swift
import MLXAudioTTS
import MLXAudioCore

// Load a TTS model from HuggingFace
let model = try await SopranoModel.fromPretrained("mlx-community/Soprano-80M-bf16")

// Generate audio
let audio = try await model.generate(
    text: "Hello from MLX Audio Swift!",
    parameters: GenerateParameters(
        maxTokens: 200,
        temperature: 0.7,
        topP: 0.95
    )
)

// Save to file
try saveAudioArray(audio, sampleRate: Double(model.sampleRate), to: outputURL)
```

Speech-to-Text:

```swift
import MLXAudioSTT
import MLXAudioCore

// Load audio file
let (sampleRate, audioData) = try loadAudioArray(from: audioURL)

// Load STT model
let model = try await GLMASRModel.fromPretrained("mlx-community/GLM-ASR-Nano-2512-4bit")

// Transcribe
let output = model.generate(audio: audioData)
print(output.text)
```

Streaming Generation:

```swift
for try await event in model.generateStream(text: text, parameters: parameters) {
    switch event {
    case .token(let token):
        print("Generated token: \(token)")
    case .audio(let audio):
        print("Final audio shape: \(audio.shape)")
    case .info(let info):
        print(info.summary)
    }
}
```

Supported Models Section

Add a table:

| Model | Type | HuggingFace Repo |
|-------|------|------------------|
| Soprano | TTS | mlx-community/Soprano-80M-bf16 |
| Qwen3 | TTS | mlx-community/VyvoTTS-EN-Beta-4bit |
| LlamaTTS (Orpheus) | TTS | mlx-community/orpheus-3b-0.1-ft-bf16 |
| GLMASR | STT | mlx-community/GLM-ASR-Nano-2512-4bit |

Features Section

  • Modular architecture for minimal app size
  • Automatic model downloading from HuggingFace
  • Native async/await support
  • Streaming audio generation
  • Type-safe Swift API
  • Optimized for Apple Silicon with MLX

Advanced Usage Section

Add examples for:

  • Custom generation parameters
  • Audio codec usage (SNAC encoding/decoding)
  • Voice selection for multi-voice models
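The three bullets above might be illustrated along these lines. Everything in this sketch is an assumption for illustration: `SNACCodec`, its `encode`/`decode` methods, the codec repo name, and the `voice:` parameter are hypothetical names, not confirmed API from the SDK.

```swift
import MLXAudioCodecs
import MLXAudioTTS

// Hypothetical sketch -- SNACCodec, the repo string, and the voice
// parameter are illustrative names, not confirmed SDK API.

// Audio codec usage: round-trip a waveform through SNAC
let codec = try await SNACCodec.fromPretrained("mlx-community/snac_24khz")
let codes = codec.encode(audioData)      // waveform -> discrete codes
let reconstructed = codec.decode(codes)  // codes -> waveform

// Voice selection for a multi-voice model (parameter name assumed)
let audio = try await model.generate(
    text: "Choosing a specific voice.",
    voice: "voice_1",
    parameters: GenerateParameters(maxTokens: 200, temperature: 0.7)
)
```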

Requirements

  • macOS 14+ or iOS 17+
  • Apple Silicon (M1 or later) recommended
  • Xcode 15+
  • Swift 5.9+

Examples Section

  • Reference the Examples/VoicesApp directory
  • Link to test files for more usage examples

Credits Section

Add credits:

License

Add license information if available

Implementation Notes

  • Use proper markdown formatting
  • Include code blocks with syntax highlighting
  • Keep the tone professional and developer-friendly
  • Ensure all code examples are accurate based on the actual API from PR #1 (MLX-Audio Swift SDK v1)
  • Remove ALL mentions of Kokoro TTS

This pull request was created from Copilot chat.



Co-authored-by: Blaizzy <23445657+Blaizzy@users.noreply.github.com>
Copilot AI changed the title [WIP] Update README.md for new modular Swift SDK architecture Update README for modular SDK architecture and remove Kokoro TTS Jan 31, 2026
Copilot AI requested a review from Blaizzy January 31, 2026 19:44
Owner

@Blaizzy Blaizzy left a comment


LGTM!

@Blaizzy Blaizzy marked this pull request as ready for review January 31, 2026 19:46
@Blaizzy Blaizzy merged commit 9b4b304 into main Jan 31, 2026
Contributor

ekryski commented Feb 22, 2026

@Blaizzy any reason why you wanted to remove Kokoro? I've been finding it to still be one of the fastest and most accurate TTS models, and I've been testing everything bleeding edge I can get my hands on.

I was planning on putting up a PR for it because I brought in a native implementation into one of my apps. But if you guys don't want to include it, then I won't bother creating a PR for it. I know there are some concerns about licensing because of some of the underlying libraries that it uses, but there is a way around that.

@Blaizzy Blaizzy deleted the copilot/update-readme-sdk-architecture branch March 5, 2026 00:35
