2 changes: 1 addition & 1 deletion .githooks/post-checkout
@@ -5,4 +5,4 @@ repo_root="$({
cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd
})"

-"$repo_root/scripts/ensure_agent_hardlinks.sh" >/dev/null
+"$repo_root/scripts/ensure_agent_hardlinks.sh" --quiet
2 changes: 1 addition & 1 deletion .githooks/post-merge
@@ -5,4 +5,4 @@ repo_root="$({
cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd
})"

-"$repo_root/scripts/ensure_agent_hardlinks.sh" >/dev/null
+"$repo_root/scripts/ensure_agent_hardlinks.sh" --quiet
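
Both hooks swap a stdout redirect for a `--quiet` flag. `>/dev/null` discards only stdout and leaves the decision about what to hide outside the script; a flag lets the script itself choose which messages are progress noise and which are warnings. How `scripts/ensure_agent_hardlinks.sh` handles `--quiet` is not shown in this diff, so the following is only a sketch of one common pattern, with `parse_args`, `info`, and `warn` as hypothetical helper names:

```bash
#!/usr/bin/env bash
# Hypothetical sketch: NOT the contents of scripts/ensure_agent_hardlinks.sh.

quiet=0

# parse_args (illustrative name) sets quiet=1 when --quiet is among the arguments.
parse_args() {
  quiet=0
  for arg in "$@"; do
    case "$arg" in
      --quiet) quiet=1 ;;
    esac
  done
}

# info prints progress messages to stdout unless --quiet was given.
info() {
  if [ "$quiet" -eq 0 ]; then
    printf '%s\n' "$*"
  fi
}

# warn always prints to stderr, so problems stay visible in hook output.
warn() {
  printf 'warning: %s\n' "$*" >&2
}
```

With this shape, a hook that passes `--quiet` stays silent on success but still surfaces warnings.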
2 changes: 1 addition & 1 deletion .github/copilot-instructions.md
@@ -183,7 +183,7 @@ See [CI Architecture](docs/dev/CI/architecture.md) for full details.
|------|--------|-----|
| `cargo fmt`, `cargo clippy` | GitHub-hosted | Fast, parallel, free |
| `cargo audit`, `cargo deny` | GitHub-hosted | Security checks, no build needed |
-| `cargo build` | **Self-hosted** | Warm cache, THE build |
+| `cargo build`, `cargo test --workspace` | GitHub-hosted | Fast cores; no hardware needed |
| Hardware tests | **Self-hosted** | Requires display/audio/clipboard |

### DON'T (Common AI Mistakes)
12 changes: 10 additions & 2 deletions .github/workflows/docs-ci.yml
@@ -41,8 +41,16 @@ jobs:

- name: Validate agent instruction mirrors
run: |
-diff -u AGENTS.md .github/copilot-instructions.md
-diff -u AGENTS.md .kilocode/rules/agents.md
+if ! diff -u AGENTS.md .github/copilot-instructions.md; then
+  echo "::error::Agent instruction mirror drift: .github/copilot-instructions.md differs from AGENTS.md"
+  echo "hint: run ./scripts/ensure_agent_hardlinks.sh (or mise run prepare)" >&2
+  exit 1
+fi
+if ! diff -u AGENTS.md .kilocode/rules/agents.md; then
+  echo "::error::Agent instruction mirror drift: .kilocode/rules/agents.md differs from AGENTS.md"
+  echo "hint: run ./scripts/ensure_agent_hardlinks.sh (or mise run prepare)" >&2
+  exit 1
+fi

- name: Append revision log entries
run: |
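
The new `docs-ci.yml` step repeats the same check once per mirror and exits at the first mismatch. As a possible alternative (not what the PR does), the check could be folded into a loop that reports every drifting mirror before failing. `check_mirrors` is a hypothetical helper name; the messages are copied from the workflow step:

```bash
# check_mirrors (illustrative): compare each mirror against the canonical file.
# Usage: check_mirrors AGENTS.md .github/copilot-instructions.md .kilocode/rules/agents.md
check_mirrors() {
  local canonical="$1"
  shift
  local status=0 mirror
  for mirror in "$@"; do
    if ! diff -u "$canonical" "$mirror"; then
      # ::error:: is the GitHub Actions workflow command for an error annotation.
      echo "::error::Agent instruction mirror drift: $mirror differs from $canonical"
      echo "hint: run ./scripts/ensure_agent_hardlinks.sh (or mise run prepare)" >&2
      status=1
    fi
  done
  return "$status"
}
```

The trade-off is that the loop keeps going after the first mismatch, so one CI run lists all mirrors that need fixing; the step in the diff stops at the first one.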
2 changes: 1 addition & 1 deletion .kilocode/rules/agents.md
@@ -183,7 +183,7 @@ See [CI Architecture](docs/dev/CI/architecture.md) for full details.
|------|--------|-----|
| `cargo fmt`, `cargo clippy` | GitHub-hosted | Fast, parallel, free |
| `cargo audit`, `cargo deny` | GitHub-hosted | Security checks, no build needed |
-| `cargo build` | **Self-hosted** | Warm cache, THE build |
+| `cargo build`, `cargo test --workspace` | GitHub-hosted | Fast cores; no hardware needed |
| Hardware tests | **Self-hosted** | Requires display/audio/clipboard |

### DON'T (Common AI Mistakes)
2 changes: 1 addition & 1 deletion AGENTS.md
@@ -183,7 +183,7 @@ See [CI Architecture](docs/dev/CI/architecture.md) for full details.
|------|--------|-----|
| `cargo fmt`, `cargo clippy` | GitHub-hosted | Fast, parallel, free |
| `cargo audit`, `cargo deny` | GitHub-hosted | Security checks, no build needed |
-| `cargo build` | **Self-hosted** | Warm cache, THE build |
+| `cargo build`, `cargo test --workspace` | GitHub-hosted | Fast cores; no hardware needed |
| Hardware tests | **Self-hosted** | Requires display/audio/clipboard |

### DON'T (Common AI Mistakes)
107 changes: 10 additions & 97 deletions README.md
@@ -1,116 +1,29 @@
# ColdVox
> ⚠️ **Internal Alpha** - This project is in early development and not ready for production use.

> **⚠️ CRITICAL**: Documentation is out of sync with code. Whisper STT has been removed; Parakeet doesn't compile. See [`docs/plans/critical-action-plan.md`](docs/plans/critical-action-plan.md) for current status. **Only Moonshine STT works** (requires `uv sync` first).
## Development
- Install Rust (stable) and required system dependencies for your platform.
- Use the provided scripts in `scripts/` to help with local environment setup.

### Developer Git Hooks

This project uses a "Zero-Latency" git hook standard powered by **[mise](https://mise.jdx.dev)** and **lint-staged**.

### Setup
1. **Install mise**: `curl https://mise.run | sh` (or see [docs](https://mise.jdx.dev/getting-started.html))
2. **Install dependencies**: `mise install`
3. **Activate hooks**: `mise run prepare` (runs automatically on `npm install`)

Hooks will now run automatically on `git commit`. To run manually:
```bash
mise run pre-commit
```

# ColdVox

> ⚠️ **Internal Alpha** - This project is in early development and not ready for production use.

Minimal root README. Full developer & architecture guide: see [`CLAUDE.md`](CLAUDE.md).

## Overview
ColdVox is a modular Rust workspace providing real‑time audio capture, VAD, STT (Faster-Whisper), and cross‑platform text injection.
> **⚠️ CRITICAL**: Documentation and feature status changes quickly. See [`docs/plans/critical-action-plan.md`](docs/plans/critical-action-plan.md) for what currently works.

## Quick Start
Minimal root README. Full developer & architecture guide: see [`CLAUDE.md`](CLAUDE.md). Assistants should read [`AGENTS.md`](AGENTS.md).

**For Voice Dictation (Recommended):**
```bash
# Run with default Faster-Whisper STT and text injection (model auto-discovered)
cargo run --features text-injection

# With specific microphone device
cargo run --features text-injection -- --device "HyperX QuadCast"

# TUI Dashboard with controls
cargo run --bin tui_dashboard --features tui
```

**Other Usage:**
```bash
# VAD-only mode (no speech recognition)
cargo run

# Test microphone setup
cargo run --bin mic_probe -- list-devices
```

> Audio dumps: The TUI dashboard now records raw audio to `logs/audio_dumps/` by default. Pass `--dump-audio=false` to disable persistent capture.

**Note on Defaults**: Faster-Whisper STT is the default feature (enabled automatically), ensuring real speech recognition in the app and tests. This prevents fallback to the mock plugin, which skips transcription. Override with `--stt-preferred mock` or env `COLDVOX_STT_PREFERRED=mock` if needed for testing. For other STT backends, enable their features and set preferred accordingly.

### Configuration (Canonical Path)
- Canonical STT selection config lives at `config/plugins.json`.
- Any legacy duplicates like `./plugins.json` or `crates/app/plugins.json` are deprecated and ignored at runtime. A warning is logged on startup if they exist. Please migrate changes into `config/plugins.json` only.
- Some defaults can also be set in `config/default.toml`, but `config/plugins.json` is the source of truth for STT plugin selection.

### Whisper Model Setup
- **Python Package**: Install the `faster-whisper` Python package via pip
- **Models**: Whisper models are automatically downloaded on first use
- **Model Identifiers**: Use standard Whisper model names (e.g., "tiny.en", "base.en", "small.en", "medium.en")
- **Manual Path**: Set `WHISPER_MODEL_PATH` to specify a model identifier or custom model directory
- **Common Models**:
- "tiny.en" (~39MB) - Fastest, lower accuracy
- "base.en" (~142MB) - Good balance of speed and accuracy
- "small.en" (~466MB) - Better accuracy
- "medium.en" (~1.5GB) - High accuracy

## How It Works
1. **Always-on pipeline**: Audio capture, VAD, STT, and text-injection buffering run continuously by default. Raw 16 kHz mono audio is recorded to `logs/audio_dumps/` for later review.
2. **Voice activation (default)**: The Silero VAD segments speech automatically—no hotkey required.
3. **Push-to-talk (preview inject)**: Hold `Super+Ctrl` to stream buffered text into the preview/injection window when you need manual control. Release to stop feeding new text.

More detail: See [`CLAUDE.md`](CLAUDE.md) for full developer guide.

### Python 3.13 and PyO3
If your system default Python is 3.13, current `pyo3` versions may warn about unsupported Python version during build. Two options:
## Development

1) Prefer Python 3.12 for development tools, or
2) Build using the stable Python ABI by exporting:
### Developer Git Hooks

```bash
set -gx PYO3_USE_ABI3_FORWARD_COMPATIBILITY 1 # fish shell
cargo check
```
This project uses a git hook standard powered by **[mise](https://mise.jdx.dev)** and **lint-staged**.

We plan to upgrade `pyo3` in a follow-up to remove this requirement.
1. Install mise: `curl https://mise.run | sh` (or see [docs](https://mise.jdx.dev/getting-started.html))
2. Install toolchain: `mise install`
3. Activate hooks + agent mirrors: `mise run prepare`

### Future Vision (Experimental)
- We're actively exploring an **always-on intelligent listening** architecture that keeps a lightweight listener running continuously and spins up tiered STT engines on demand.
- This speculative work includes decoupled listening/processing threads, dynamic STT memory management, and context-aware activation.
- Read the full experimental plan in [`docs/architecture.md`](docs/architecture.md#coldvox-future-vision). Treat it as research guidance—not a committed roadmap.
To run the hook pipeline manually:

## Slow / Environment-Sensitive Tests
Some end‑to‑end tests exercise real injection & STT. Gate them locally by setting an env variable (planned):
```bash
export COLDVOX_SLOW_TESTS=1
cargo test -- --ignored
mise run pre-commit
```
Headless behavior notes: see [`docs/text_injection_headless.md`](docs/text_injection_headless.md).

## License
Dual-licensed under MIT or Apache-2.0. See `LICENSE-MIT` and `LICENSE-APACHE` if present, else crate-level manifests.

## Contributing

- Review the [Master Documentation Playbook](docs/MasterDocumentationPlaybook.md).
- Follow the repository [Documentation Standards](docs/standards.md).
- Coordinate work through the [Documentation Todo Backlog](docs/todo.md).
- Assistants should read [`AGENTS.md`](AGENTS.md).
6 changes: 3 additions & 3 deletions docs/MasterDocumentationPlaybook.md
@@ -93,13 +93,13 @@ The following files configure AI coding agents and MUST live at standard locatio

- `AGENTS.md` (root): Canonical agent instructions following the [AGENTS.md standard](https://agents.md/). This is the single source of truth for all AI agents.
- `CLAUDE.md` (root): Claude Code configuration. Should import from or reference `AGENTS.md`.
-- `.github/copilot-instructions.md`: GitHub Copilot instructions. Symlink to `AGENTS.md`.
-- `.kilocode/rules/agents.md`: Kilo Code rules. Symlink to `../../AGENTS.md`.
+- `.github/copilot-instructions.md`: GitHub Copilot instructions. Must match `AGENTS.md` (locally hardlinked where possible).
+- `.kilocode/rules/agents.md`: Kilo Code rules. Must match `AGENTS.md` (locally hardlinked where possible).
- `.gemini/settings.json`: Gemini CLI configuration. Set `"contextFileName": "AGENTS.md"` to use root AGENTS.md.
- `.cursorrules` (root, optional): Cursor-specific rules if needed beyond `AGENTS.md`.
- `.builderrules` (root, optional): Builder.io-specific rules if needed.

-**Hierarchy**: `AGENTS.md` is authoritative. Tool-specific files should either symlink to it or contain only tool-specific overrides that reference `AGENTS.md`.
+**Hierarchy**: `AGENTS.md` is authoritative. Tool-specific files should either be hardlinked mirrors (preferred) or contain only tool-specific overrides that reference `AGENTS.md`.

**Do NOT create** `docs/agents.md` - agent configuration lives at the root for tool discovery.

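
The playbook now asks for mirrors that match `AGENTS.md` and are hardlinked locally where possible. Git records file contents, not hard-link relationships, so a fresh checkout produces independent copies; that is presumably why the hooks re-run the link script after checkout and merge, and why CI compares contents with `diff` instead of checking inodes. A minimal sketch of the linking step, assuming a helper named `ensure_hardlink` (hypothetical; the real `scripts/ensure_agent_hardlinks.sh` is not shown in this diff):

```bash
# ensure_hardlink (illustrative): make MIRROR a hard link to CANONICAL.
# Usage: ensure_hardlink AGENTS.md .github/copilot-instructions.md
ensure_hardlink() {
  local canonical="$1" mirror="$2"
  # -ef is true when both paths refer to the same device and inode,
  # meaning the mirror is already a hard link to the canonical file.
  if [ "$canonical" -ef "$mirror" ]; then
    return 0
  fi
  mkdir -p "$(dirname "$mirror")"
  # ln -f replaces any existing (possibly stale) mirror with a hard link.
  ln -f "$canonical" "$mirror"
}
```

Because both names then point at the same inode, an edit through either path is immediately visible through the other, which a symlink-free mirror copy would not give you.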
73 changes: 34 additions & 39 deletions docs/dev/CI/architecture.md
@@ -6,52 +6,43 @@

ColdVox CI splits workloads between GitHub-hosted and self-hosted runners based on one question:

**Does this task require the physical laptop's hardware?**
**Does this task require the physical laptop's hardware (display, audio, clipboard)?**

| Requires Laptop? | Task | Runner |
|------------------|------|--------|
| No | `cargo fmt --check` | GitHub-hosted |
| No | `cargo clippy` | GitHub-hosted |
| No | `cargo audit`, `cargo deny` | GitHub-hosted |
| **Yes** | `cargo build` (warm cache) | Self-hosted |
| No | `cargo build` | GitHub-hosted |
| No | `cargo test --workspace` (unit tests) | GitHub-hosted |
| **Yes** | Hardware tests (display, audio, clipboard) | Self-hosted |

---

## Why Split?

### 1. CPU Dedication
### 1. Hardware Isolation

If the laptop runs lint, build, AND tests sequentially, they compete for CPU.
The self-hosted runner is a laptop with **weak hardware but a live display**. GitHub-hosted runners have **powerful hardware but no display**.

With the split:
- **Laptop**: 100% CPU on building + hardware tests
- **GitHub**: Handles lint/security on their infrastructure (free)
- **Laptop**: Only runs tests that need real display/audio/clipboard
- **GitHub**: Handles everything else (lint, security, build, unit tests)

### 2. No Redundant Builds

| Bad Pattern | Good Pattern |
|-------------|--------------|
| GitHub: `cargo build` (discarded) | GitHub: `cargo clippy` (type checks only) |
| Self-hosted: `cargo build` (again) | Self-hosted: `cargo build` (THE build) |

`clippy` does full type checking without generating binaries. Same error detection, no wasted compilation.

### 3. Parallelism
### 2. Parallelism

GitHub-hosted jobs run in parallel on separate machines. Self-hosted queues on one laptop.

```
Push PR:
GitHub: [lint] [security] [docs] ← All parallel, 2 min each
Self-hosted: [build + hardware tests] ← Starts immediately, 8-12 min
GitHub: [lint] [security] [docs] [build+unit-tests] ← All parallel
Self-hosted: [hardware tests] ← Only hardware-dependent tests

Total time: ~12 min (not 2 + 2 + 2 + 12 = 18 min)
Total time: max(GitHub jobs, hardware tests)
```
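
The "Total time" line rests on simple arithmetic: jobs that run in parallel cost the duration of the slowest one, while jobs queued on a single machine cost the sum. A small sketch of that comparison, using illustrative durations taken from the upper ends of the estimates in this diff (2 minutes each for lint, security, and docs; 15 for build plus unit tests; 10 for hardware tests). `max_of` and `sum_of` are hypothetical helper names:

```bash
# max_of prints the largest of its integer arguments (parallel wall time).
max_of() {
  local max="$1" n
  for n in "$@"; do
    if [ "$n" -gt "$max" ]; then
      max="$n"
    fi
  done
  printf '%s\n' "$max"
}

# sum_of prints the sum of its integer arguments (sequential wall time).
sum_of() {
  local total=0 n
  for n in "$@"; do
    total=$((total + n))
  done
  printf '%s\n' "$total"
}

# Durations in minutes: lint, security, docs, build+unit-tests, hardware tests.
max_of 2 2 2 15 10   # prints 15
sum_of 2 2 2 15 10   # prints 31
```

These numbers are estimates, not measurements; the point is only that the parallel layout is bounded by the slowest job.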

### 4. No Waiting
### 3. No Wasted Work

Self-hosted has **no `needs:` dependency**. It starts immediately in parallel with GitHub-hosted jobs.
The laptop does minimal work - just the tests that *require* hardware access.

---

@@ -74,8 +65,8 @@ Self-hosted has **no `needs:` dependency**. It starts immediately in parallel wi
| `GabrielBB/xvfb-action` | Internally calls `apt-get` (doesn't exist) |
| `sudo apt-get install` | Wrong package manager |
| `DISPLAY=:99` | Conflicts with real display (`:0`) |
| `needs: [lint, build]` | Delays self-hosted start by 5-10 min |
| `cargo build` on GitHub-hosted | Wasted work (can't share artifacts with Fedora) |
| Running builds on self-hosted | Weak hardware; GitHub-hosted is faster |
| Running unit tests on self-hosted | Wastes limited resources |
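
The `DISPLAY=:99` row can be guarded against with a preflight check before hardware tests start. The sketch below is an assumption about how such a guard might look, not something this PR adds; `check_display` is a hypothetical name, and `:99` is treated as suspicious only because it is the display number conventionally used for Xvfb:

```bash
# check_display (illustrative): fail if the given display looks unset or virtual.
# Usage: check_display "${DISPLAY:-}"
check_display() {
  local display="${1:-}"
  if [ -z "$display" ]; then
    echo "DISPLAY is unset; hardware tests need the live session" >&2
    return 1
  fi
  if [ "$display" = ":99" ]; then
    echo "DISPLAY=:99 suggests a virtual display; use the live display instead" >&2
    return 1
  fi
  return 0
}
```

Running this as the first step of a hardware job would turn a confusing injection failure into an immediate, readable error.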

---

@@ -84,41 +75,45 @@ Self-hosted has **no `needs:` dependency**. It starts immediately in parallel wi
```
┌─────────────────────────────────────────────────────────────────┐
│ GITHUB-HOSTED (ubuntu-latest) │
Parallel, free, NO BUILD artifacts
│ Parallel, powerful, handles most work
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ lint │ │ security │ │ docs │ │
│ │ │ │ │ │ (optional) │ │
│ │ │ │ │ │ │ │
│ │ fmt --check │ │ cargo audit │ │ cargo doc │ │
│ │ clippy │ │ cargo deny │ │ │ │
│ │ │ │ │ │ │ │
│ │ ~2 min │ │ ~2 min │ │ ~2 min │ │
│ │ NO BUILD │ │ NO BUILD │ │ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ build_and_unit_tests │ │
│ │ cargo check → cargo build → cargo test --workspace │ │
│ │ ~10-15 min │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
║ (parallel, no waiting)
║ (parallel, no waiting)
┌─────────────────────────────────────────────────────────────────┐
│ SELF-HOSTED (Fedora/Nobara) │
Live KDE Plasma - THE build, THE tests
Weak hardware BUT has live KDE Plasma display
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ hardware │ │
│ │ hardware_tests │ │
│ │ │ │
│ │ Environment: │ │
│ │ • DISPLAY=$DISPLAY (from session, NOT :99) │ │
│ │ • WAYLAND_DISPLAY=$WAYLAND_DISPLAY │ │
│ │ • DISPLAY=:0 (live session, NOT :99) │ │
│ │ • WAYLAND_DISPLAY=wayland-0 │ │
│ │ • Real audio, real clipboard │ │
│ │ │ │
│ │ Steps: │ │
│ │ 1. cargo build (incremental, sccache, mold) → 2-3 min │ │
│ │ 2. Hardware tests (injection, audio) → 5-8 min │ │
│ │ Tests: │ │
│ │ • real-injection-tests (xdotool, ydotool, clipboard) │ │
│ │ • hardware_check (audio capture, display access) │ │
│ │ │ │
│ │ Total: ~8-12 min │ │
│ │ Total: ~5-10 min (minimal work!) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │