Jitpomi/carbon

🎤 Carbon Voice Assistant

A premium, real-time voice assistant built with Rust, featuring advanced AI-powered speech recognition, natural conversation flow, and a stunning modern interface.

✨ Features

🎯 Core Voice Capabilities

  • 🎤 Advanced Voice Activity Detection - Professional-grade VAD with real-time probability scoring
  • 🗣️ Real-time Speech Transcription - Powered by Whisper AI with streaming text updates
  • ⚡ Pre-initialized Models - Zero-delay voice sessions with background model loading
  • 🔇 Smart Mute Controls - Granular mute/unmute without ending sessions
  • 🛑 Session Management - Clean start/stop controls with proper state handling
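
The README does not say how the VAD in src/vad.rs computes its probability score. As a rough illustration only, the sketch below assumes a simple energy-based approach: it takes the RMS energy of a frame and maps it to a score between 0 and 1 with a logistic curve. The function names, the threshold, and the steepness are all hypothetical and are not taken from carbon-lib.

```rust
/// Root-mean-square energy of one audio frame (samples expected in -1.0..=1.0).
pub fn frame_rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Map frame energy to a speech score in 0.0..=1.0 using a logistic curve.
/// `threshold` is the RMS level that scores 0.5 (an assumed tuning value);
/// `steepness` controls how quickly the score rises around that level.
pub fn speech_probability(frame: &[f32], threshold: f32, steepness: f32) -> f32 {
    let rms = frame_rms(frame);
    1.0 / (1.0 + (-steepness * (rms - threshold)).exp())
}
```

With a threshold of 0.02 and a steepness of 200.0, a silent frame scores close to 0 and a frame of constant amplitude 0.5 scores close to 1. A production VAD would typically use a trained model or spectral features rather than raw energy.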

🎨 Premium User Interface

  • 🌊 Dynamic Audio Visualization - 15-bar spectrum analyzer with realistic wave patterns
  • 💫 Responsive Voice Orb - Scales and glows based on voice intensity
  • 🔮 Pulsing Ring Effects - Elegant animations that respond to speech activity
  • 🎭 Natural State Transitions - Smooth "Ready" → "Listening" → "Processing" flow
  • 📱 Mobile-Optimized - Touch-friendly interface with haptic feedback prevention
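
To make the 15-bar spectrum analyzer more concrete, here is a minimal sketch of one way to reduce an audio frame to 15 bar heights. It assumes a naive discrete Fourier transform and equal-width frequency bands; the actual audio_visualizer.rs may work quite differently (for example with an FFT crate, the Web Audio API, or purely decorative wave patterns). `BAR_COUNT`, `dft_magnitude`, and `spectrum_bars` are hypothetical names.

```rust
use std::f32::consts::PI;

/// Number of bars in the visualizer, as stated in the feature list.
pub const BAR_COUNT: usize = 15;

/// Normalized magnitude of DFT bin `k` for `frame` (naive O(n) per bin).
fn dft_magnitude(frame: &[f32], k: usize) -> f32 {
    let n = frame.len() as f32;
    let (mut re, mut im) = (0.0f32, 0.0f32);
    for (i, &s) in frame.iter().enumerate() {
        let angle = 2.0 * PI * (k as f32) * (i as f32) / n;
        re += s * angle.cos();
        im -= s * angle.sin();
    }
    (re * re + im * im).sqrt() / n
}

/// Reduce a frame to 15 bar heights in 0.0..=1.0.
/// Each bar takes the peak magnitude over an equal-width slice of the
/// bins from 1 up to at most the Nyquist bin. Frames too short to give
/// every bar at least one bin produce all-zero bars.
pub fn spectrum_bars(frame: &[f32]) -> [f32; BAR_COUNT] {
    let mut bars = [0.0f32; BAR_COUNT];
    let usable = frame.len() / 2;
    if usable < BAR_COUNT {
        return bars;
    }
    let per_bar = usable / BAR_COUNT;
    for (b, bar) in bars.iter_mut().enumerate() {
        let start = 1 + b * per_bar;
        let peak = (start..start + per_bar)
            .map(|k| dft_magnitude(frame, k))
            .fold(0.0f32, f32::max);
        // A sine of amplitude A has normalized magnitude A / 2, so double it.
        *bar = (peak * 2.0).min(1.0);
    }
    bars
}
```

The naive transform is quadratic in the frame length, which is acceptable for a short sketch but would normally be replaced by an FFT for real-time use.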

💬 Conversation Experience

  • 📜 Chat History - Sliding conversation panel with message bubbles
  • 🔄 Multi-turn Conversations - Maintains context across voice sessions
  • ⏱️ Automatic Pause Detection - Intelligent processing triggers after natural pauses
  • 🎵 Audio Feedback System - Subtle earcons for button interactions (configurable)
  • 🌙 Dark Theme - Premium glassmorphism design with backdrop blur effects
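
As a rough picture of what a chat history that keeps context across voice sessions could look like in code, the sketch below stores timestamped messages and exposes the most recent ones. The types `Speaker`, `Message`, and `Conversation` are illustrative assumptions, not the structures used in conversation.rs.

```rust
use std::time::SystemTime;

/// Who produced a message (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Speaker {
    User,
    Assistant,
}

/// One chat bubble: speaker, text, and the time it was recorded.
#[derive(Debug, Clone)]
pub struct Message {
    pub speaker: Speaker,
    pub text: String,
    pub at: SystemTime,
}

/// Append-only history that outlives individual voice sessions.
#[derive(Debug, Default)]
pub struct Conversation {
    messages: Vec<Message>,
}

impl Conversation {
    /// Record a message, ignoring empty or whitespace-only transcriptions.
    pub fn push(&mut self, speaker: Speaker, text: impl Into<String>) {
        let text = text.into();
        if text.trim().is_empty() {
            return;
        }
        self.messages.push(Message { speaker, text, at: SystemTime::now() });
    }

    /// All messages, oldest first.
    pub fn messages(&self) -> &[Message] {
        &self.messages
    }

    /// The last `n` messages, oldest first, e.g. as context for the next turn.
    pub fn recent(&self, n: usize) -> &[Message] {
        let start = self.messages.len().saturating_sub(n);
        &self.messages[start..]
    }
}
```

Keeping the history in a single structure that is not reset when a session stops is one simple way to carry context from one voice session to the next.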

🔧 Technical Excellence

  • 🦀 Pure Rust Implementation - Memory-safe, high-performance voice processing
  • 🌐 Cross-Platform Support - Desktop, web, and mobile-ready architecture
  • 🍎 Apple Silicon Optimization - Metal acceleration for M1/M2 Macs
  • 🔒 Privacy-First - All processing happens locally, no cloud dependencies
  • ⚡ Async Architecture - Non-blocking voice processing with Tokio runtime

🏗️ Architecture

carbon/
├── 📦 carbon-lib/              # Core voice processing engine
│   ├── 🎯 src/hooks.rs         # Voice detection & transcription hooks
│   ├── 🎤 src/vad.rs           # Voice activity detection algorithms
│   └── 🧠 src/transcription.rs # Whisper AI integration
├── 🖥️ carbon-client/           # Modern web interface
│   ├── 🎨 src/components/      # Dioxus UI components
│   │   ├── voice_interface.rs  # Main voice orb & controls
│   │   ├── conversation.rs     # Chat history panel
│   │   └── audio_visualizer.rs # Sound wave spectrum
│   └── 🎭 assets/              # Styling & static resources
└── 📚 README.md

🚀 Quick Start

Prerequisites

  • Rust 1.70+ with Cargo
  • Microphone permissions (browser will prompt)
  • Modern browser (Chrome, Firefox, Safari, Edge)

Installation & Setup

# Clone the repository
git clone https://github.com/Jitpomi/carbon.git
cd carbon

# Build the workspace
cargo build --release

# Run the voice assistant
cd carbon-client
cargo run --release

First Launch

  1. 🌐 Open Browser - Navigate to http://localhost:8080
  2. ⏳ Wait for Initialization - Whisper model loads automatically ("Initializing...")
  3. 🎤 Grant Permissions - Allow microphone access when prompted
  4. ✅ Ready to Use - Interface shows "Ready to assist"

🎯 Usage Guide

Basic Voice Interaction

  1. 🎤 Start Listening - Click the microphone button

    • Interface changes to "Ready" (slate orb, minimal waves)
  2. 🗣️ Speak Naturally - Begin talking

    • Orb turns emerald and scales with voice intensity
    • 15-bar spectrum analyzer shows real-time audio
    • Pulsing rings appear during active speech
  3. ⏸️ Natural Pauses - Stop speaking for 2+ seconds

    • Automatically triggers "Processing..." state
    • Blue orb with gentle pulsing animation
  4. 📝 View Transcription - Check conversation history

    • Click chat bubble icon (bottom-right)
    • Sliding panel shows all transcribed text
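
The pause rule in step 3 (processing starts after 2+ seconds without speech) can be sketched as a small detector that accumulates silence once speech has been heard. This is a minimal illustration under that assumption; `PauseDetector` and its methods are hypothetical names, and the real hook in src/hooks.rs may use different timing logic.

```rust
use std::time::Duration;

/// Fires once when enough silence follows detected speech.
pub struct PauseDetector {
    pause_after: Duration,
    silence: Duration,
    heard_speech: bool,
}

impl PauseDetector {
    /// `pause_after` is the silence needed to trigger processing (2 s in the guide above).
    pub fn new(pause_after: Duration) -> Self {
        Self { pause_after, silence: Duration::ZERO, heard_speech: false }
    }

    /// Feed the VAD decision for one frame of length `frame_len`.
    /// Returns true exactly once per utterance, when the pause threshold is reached.
    pub fn update(&mut self, is_speech: bool, frame_len: Duration) -> bool {
        if is_speech {
            self.heard_speech = true;
            self.silence = Duration::ZERO;
            return false;
        }
        if !self.heard_speech {
            // Silence before any speech never triggers processing.
            return false;
        }
        self.silence += frame_len;
        if self.silence >= self.pause_after {
            // Reset so the next utterance starts from a clean state.
            self.heard_speech = false;
            self.silence = Duration::ZERO;
            return true;
        }
        false
    }
}
```

Counting silence in frame durations rather than wall-clock time keeps the detector deterministic and easy to test.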

Advanced Controls

  • 🔇 Mute/Unmute - Toggle microphone without ending session

    • Muted: Red button with slashed microphone icon
    • Unmuted: Slate button with normal microphone icon
  • 🛑 Stop Session - End voice monitoring completely

    • Red stop button returns to "Ready to assist" state
  • 💬 Conversation Panel - Toggle chat history visibility

    • Floating button with smooth slide-up animation
    • Chat bubbles with timestamps and proper alignment
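
The difference between muting and stopping can be captured in a small session type: muting keeps the session alive but drops incoming audio, while stopping ends it. The sketch below is an assumption about how such controls might be modeled; `Session` and its methods are hypothetical and not taken from the Carbon source.

```rust
/// Voice session state as seen by the control buttons (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Session {
    /// No session running ("Ready to assist").
    Idle,
    /// Session running; `muted` drops microphone input without ending it.
    Active { muted: bool },
}

impl Session {
    /// Microphone button: start a session if none is running.
    pub fn start(self) -> Session {
        match self {
            Session::Idle => Session::Active { muted: false },
            other => other,
        }
    }

    /// Mute button: flip the mute flag, leaving an idle session untouched.
    pub fn toggle_mute(self) -> Session {
        match self {
            Session::Active { muted } => Session::Active { muted: !muted },
            Session::Idle => Session::Idle,
        }
    }

    /// Stop button: end the session regardless of mute state.
    pub fn stop(self) -> Session {
        Session::Idle
    }

    /// Whether incoming audio should be passed on to VAD and transcription.
    pub fn accepts_audio(self) -> bool {
        matches!(self, Session::Active { muted: false })
    }
}
```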

🛠️ Development

Building Components

# Build entire workspace
cargo build

# Build with optimizations
cargo build --release

# Build specific component
cargo build -p carbon-lib
cargo build -p carbon-client

Running Tests

# Run all tests
cargo test

# Test specific component
cargo test -p carbon-lib

# Run with output
cargo test -- --nocapture

Development Mode

# Hot reload development server
cd carbon-client
cargo run

# With debug logging
RUST_LOG=debug cargo run

# Web target (experimental)
cargo run --features web

Platform-Specific Builds

# Desktop (default)
cargo run --features desktop

# Web assembly
cargo run --features web

# Mobile (iOS/Android)
cargo run --features mobile
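
The commands above imply that carbon-client declares Cargo features named desktop, web, and mobile. Assuming that is the case, platform-specific code can branch on the selected feature at compile time. The helper below is a hypothetical sketch of that pattern and does not come from the repository.

```rust
/// Name of the platform selected through Cargo features at compile time.
/// Feature names are assumed to match the `--features` flags shown above.
pub fn platform_name() -> &'static str {
    if cfg!(feature = "desktop") {
        "desktop"
    } else if cfg!(feature = "web") {
        "web"
    } else if cfg!(feature = "mobile") {
        "mobile"
    } else {
        // No platform feature enabled.
        "none"
    }
}
```

`cfg!` evaluates to a compile-time boolean, so every branch still has to compile on every platform. Code that only builds on one target would use the `#[cfg(feature = "...")]` attribute instead.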

🔧 Technical Stack

Core Technologies

  • 🦀 Rust 2021 - Systems programming language
  • ⚡ Tokio - Async runtime for concurrent processing
  • 🎤 Kalosm - AI toolkit with Whisper integration
  • 🧠 Candle - Machine learning framework
  • 🎵 Rodio - Cross-platform audio library

Frontend Framework

  • 🎨 Dioxus 0.7 - React-like UI framework for Rust
  • 💨 Tailwind CSS - Utility-first styling
  • 🌊 CSS Animations - Smooth transitions and effects
  • 📱 Responsive Design - Mobile-first approach

AI & Audio Processing

  • 🗣️ Whisper AI - OpenAI's speech recognition model
  • 🎯 Voice Activity Detection - Custom VAD algorithms
  • 📊 Real-time Audio Analysis - Frequency spectrum visualization
  • 🔊 Web Audio API - Browser audio integration

Platform Support

  • 🖥️ Desktop - Native window with system integration
  • 🌐 Web - Browser-based with WebAssembly
  • 📱 Mobile - iOS and Android support
  • 🍎 Apple Silicon - Metal GPU acceleration

🎨 Design Philosophy

Natural Conversation Flow

  • 👁️ Ready State - Like making eye contact, available but not intrusive
  • 👂 Listening State - Active attention when someone speaks
  • 🧠 Processing State - Thoughtful pause while understanding
  • 🗣️ Speaking State - Clear indication when assistant responds

Visual Communication

  • 🎭 Pure Visual Feedback - No text clutter, intuitive animations
  • 🌊 Organic Animations - Natural scaling, breathing effects
  • 🎨 Premium Aesthetics - Glassmorphism, gradients, subtle shadows
  • 📱 Mobile-First - Touch-optimized with proper feedback

Performance & Privacy

  • ⚡ Zero-Delay Startup - Pre-initialized models
  • 🔒 Local Processing - No cloud dependencies
  • 🎯 Efficient Resource Usage - Optimized for battery life
  • 🛡️ Privacy by Design - Voice data never leaves device

📊 Performance Metrics

  • 🚀 Startup Time: < 2 seconds (model pre-loading)
  • ⚡ Voice Detection Latency: < 50ms
  • 🎯 Transcription Accuracy: 95%+ (Whisper SmallEn)
  • 💾 Memory Usage: ~200MB (including models)
  • 🔋 CPU Usage: < 5% idle, < 25% active transcription

🤝 Contributing

We welcome contributions! Here's how to get started:

Development Setup

# Fork and clone
git clone https://github.com/your-username/carbon.git
cd carbon

# Create feature branch
git checkout -b feature/amazing-feature

# Make changes and test
cargo test
cargo clippy
cargo fmt

# Commit and push
git commit -m "Add amazing feature"
git push origin feature/amazing-feature

Code Style

  • Follow Rust conventions with cargo fmt
  • Run cargo clippy for linting
  • Add tests for new functionality
  • Update documentation as needed

Areas for Contribution

  • 🌐 Internationalization - Multi-language support
  • 🎨 Themes - Custom color schemes and animations
  • 🧠 AI Integration - LLM response generation
  • 📱 Mobile UX - Native mobile optimizations
  • 🔊 Audio Effects - Advanced audio processing

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • OpenAI for the incredible Whisper speech recognition model
  • Dioxus Labs for the amazing React-like Rust framework
  • Kalosm Team for the comprehensive AI toolkit
  • Rust Community for the robust ecosystem and support

Built with ❤️ and 🦀 Rust

⭐ Star this repo • 🐛 Report Bug • 💡 Request Feature
