OpenVoice - AI-Powered Audio Enhancement

OpenVoice is a web application that removes background noise and enhances human voice in audio files using state-of-the-art AI technology powered by DeepFilterNet.

Features

Core Functionality

AI-Powered Audio Enhancement: Uses DeepFilterNet3 for professional-grade noise removal and voice enhancement
Simple Upload Interface: No authentication required - just upload and enhance
Real-time Progress Tracking: Monitor upload and processing progress with live updates
Queue Management: Handles multiple simultaneous users with position tracking
Privacy-First: Automatic file deletion after 10 minutes, no data persistence

Security

File size limits (configurable, default 50MB)
MIME type validation
Optional virus/malware scanning with ClamAV
Secure file handling

User Experience

Responsive, modern UI built with React and Tailwind CSS
Upload progress indicator
Processing queue position display
Estimated wait times
One-click download of enhanced audio
Manual file deletion option

Technology Stack

Backend

Framework: FastAPI
Audio Processing: DeepFilterNet (AI-based noise suppression)
Task Queue: Celery with Redis
File Handling: Python libraries (soundfile, librosa, pydub)
Security: python-magic for MIME validation, optional ClamAV integration

Frontend

Framework: React 18
Build Tool: Vite
Styling: Tailwind CSS
Icons: Lucide React
HTTP Client: Axios

Infrastructure

Message Broker: Redis
Containerization: Docker & Docker Compose
Web Server: Nginx (for production)

Installation

Prerequisites

Python 3.10+
Node.js 18+
Redis
Docker & Docker Compose (for containerized deployment)

Method 1: Docker Compose (Recommended)

This is the easiest way to get started:

# Clone the repository
git clone https://github.com/yourusername/OpenVoice.git
cd OpenVoice

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

The application will be available at:

Frontend: http://localhost
Backend API: http://localhost:8000
API Documentation: http://localhost:8000/docs

Method 2: Local Development Setup

Backend Setup

Install Python dependencies:

cd backend
python -m venv venv

# Windows
venv\Scripts\activate

# Linux/Mac
source venv/bin/activate

pip install -r requirements.txt

Install system dependencies:

Windows:

Install FFmpeg from https://ffmpeg.org/download.html
Add to PATH

Linux (Ubuntu/Debian):

sudo apt-get update
sudo apt-get install libsndfile1 libmagic1 ffmpeg

macOS:

brew install libsndfile libmagic ffmpeg

Install Redis:

Windows:

Download from https://github.com/microsoftarchive/redis/releases
Or use WSL

Linux:

sudo apt-get install redis-server
sudo systemctl start redis

macOS:

brew install redis
brew services start redis

Configure environment:

cp .env.example .env
# Edit .env with your settings

Run the backend:

# Terminal 1: Start the API server
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# Terminal 2: Start Celery worker
celery -A celery_worker.celery_app worker --loglevel=info --concurrency=2

# Terminal 3: Start Celery beat (for file cleanup)
celery -A celery_worker.celery_app beat --loglevel=info

Frontend Setup

Install dependencies:

cd frontend
npm install

Configure environment (optional):

# Create .env file
echo "VITE_API_URL=http://localhost:8000" > .env

Run the development server:

npm run dev

The frontend will be available at http://localhost:3000

Build for production:

npm run build

Configuration

Environment Variables

Backend configuration (.env):

# Server Configuration
HOST=0.0.0.0
PORT=8000
DEBUG=False

# File Upload Settings
MAX_FILE_SIZE_MB=50
ALLOWED_EXTENSIONS=mp3,wav,ogg,m4a,flac,aac,wma
UPLOAD_DIR=./uploads
PROCESSED_DIR=./processed

# File Retention (in minutes)
FILE_RETENTION_MINUTES=10

# Redis Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0

# Celery Configuration
CELERY_BROKER_URL=redis://localhost:6379/0
CELERY_RESULT_BACKEND=redis://localhost:6379/0
CELERY_WORKERS=2
CELERY_CONCURRENCY=2

# Security
ENABLE_VIRUS_SCAN=False
CLAMAV_HOST=localhost
CLAMAV_PORT=3310

# CORS
CORS_ORIGINS=http://localhost:3000,http://localhost:5173

# Model Configuration
DEEPFILTERNET_MODEL=DeepFilterNet3
AUDIO_SAMPLE_RATE=48000

Optional: ClamAV Virus Scanning

To enable virus scanning:

Install ClamAV:

Linux:

sudo apt-get install clamav clamav-daemon
sudo freshclam  # Update virus definitions
sudo systemctl start clamav-daemon

macOS:

brew install clamav
freshclam
clamd

Windows: Download from https://www.clamav.net/downloads

Enable in configuration:

ENABLE_VIRUS_SCAN=True
CLAMAV_HOST=localhost
CLAMAV_PORT=3310

API Documentation

Endpoints

Upload Audio File

POST /api/upload
Content-Type: multipart/form-data

Request:
- file: Audio file (max 50MB)

Response:
{
  "task_id": "abc123...",
  "filename": "audio.mp3",
  "file_size_mb": 5.2,
  "queue_position": 2,
  "estimated_wait_seconds": 60,
  "retention_minutes": 10
}

Get Task Status

GET /api/status/{task_id}

Response:
{
  "task_id": "abc123...",
  "status": "processing",  // queued, processing, completed, failed
  "progress": 75,
  "queue_position": 0,
  "download_ready": false
}

Download Enhanced Audio

GET /api/download/{task_id}

Response: Audio file (audio/wav)

Delete Files

DELETE /api/delete/{task_id}

Response:
{
  "success": true,
  "task_id": "abc123...",
  "deleted_files": ["file1.mp3", "file2.wav"]
}

Get Queue Status

GET /api/queue

Response:
{
  "active_tasks": 2,
  "pending_tasks": 5,
  "total_queue": 7
}

Health Check

GET /api/health

Response:
{
  "status": "healthy",
  "redis": "connected",
  "celery": "running",
  "max_file_size_mb": 50,
  "file_retention_minutes": 10
}

Architecture

┌─────────────┐
│   Browser   │
└──────┬──────┘
       │
       │ HTTP
       ▼
┌─────────────────┐
│  React Frontend │
│   (Vite + UI)   │
└──────┬──────────┘
       │
       │ REST API
       ▼
┌──────────────────┐      ┌─────────────┐
│  FastAPI Backend │◄────►│    Redis    │
│  (Upload/Status) │      │  (Broker)   │
└──────┬───────────┘      └──────┬──────┘
       │                         │
       │ Submit Task             │ Get Task
       ▼                         ▼
┌──────────────────┐      ┌─────────────┐
│  Celery Workers  │◄────►│   Storage   │
│  (DeepFilterNet) │      │  (Files)    │
└──────────────────┘      └─────────────┘
       │
       │ Scheduled
       ▼
┌──────────────────┐
│   Celery Beat    │
│ (File Cleanup)   │
└──────────────────┘

Usage

Upload an Audio File:
- Click "Click to upload" or drag and drop an audio file
- Supported formats: MP3, WAV, OGG, M4A, FLAC, AAC, WMA
- Maximum size: 50MB (configurable)
Wait for Processing:
- See your queue position if other users are being processed
- Monitor real-time progress updates
- Processing typically takes 30-60 seconds depending on file length
Download Enhanced Audio:
- Click "Download Enhanced Audio" when processing completes
- File is available for 10 minutes
Privacy:
- Files are automatically deleted after 10 minutes
- You can manually delete files anytime using "Delete Files" button
- No data is stored permanently

Performance

Processing Speed: ~1-2x real-time (60-second audio takes 30-60 seconds)
Concurrent Users: Supports multiple users with queue management
File Size: Optimized for files up to 50MB
Quality: 48kHz high-quality output

Troubleshooting

Backend won't start

Issue: ImportError for DeepFilterNet

# Solution: Reinstall dependencies
pip install --force-reinstall deepfilternet torch torchaudio

Issue: Redis connection failed

# Solution: Make sure Redis is running
redis-cli ping  # Should return PONG

# Start Redis if needed
# Linux: sudo systemctl start redis
# macOS: brew services start redis
# Windows: redis-server

Worker won't start

Issue: Celery worker connection error

# Solution: Check Redis connection and restart worker
celery -A celery_worker.celery_app worker --loglevel=debug

Frontend can't connect to backend

Issue: CORS errors

# Solution: Check CORS_ORIGINS in backend .env
CORS_ORIGINS=http://localhost:3000,http://localhost:5173

Processing fails

Issue: Audio processing errors

# Check logs
docker-compose logs worker

# Or if running locally
celery -A celery_worker.celery_app worker --loglevel=debug

Large file uploads fail

Issue: Nginx timeout or size limit

# Solution: Update nginx.conf
client_max_body_size 100M;
proxy_read_timeout 300s;

Development

Running Tests

OpenVoice includes comprehensive unit tests for both backend and frontend components.

Backend Tests

The backend uses pytest for testing. Tests cover:

File validation and handling
Audio processing utilities
API endpoints
Noise strength parameter mapping
Error handling

Setup:

cd backend

# Install test dependencies
pip install -r requirements-test.txt

Run all tests:

# Run all tests with verbose output
pytest

# Run with coverage report
pytest --cov=. --cov-report=html --cov-report=term-missing

# Run specific test file
pytest tests/test_file_handler.py

# Run specific test class
pytest tests/test_api.py::TestUploadEndpoint

# Run specific test function
pytest tests/test_api.py::TestUploadEndpoint::test_upload_valid_audio_file

# Run tests in parallel (faster)
pytest -n auto

Run tests with markers:

# Run only unit tests
pytest -m unit

# Run only integration tests
pytest -m integration

# Skip slow tests
pytest -m "not slow"

View coverage report:

# Generate HTML coverage report
pytest --cov=. --cov-report=html

# Open the report in your browser
# The report will be in htmlcov/index.html

Frontend Tests

The frontend uses Vitest and React Testing Library. Tests cover:

Component rendering
User interactions
File upload validation
Noise reduction slider functionality
API integration
State management

Setup:

cd frontend

# Install test dependencies (included in package.json)
npm install

Run all tests:

# Run tests in watch mode (for development)
npm test

# Run tests once (for CI/CD)
npm test -- --run

# Run with coverage
npm run test:coverage

# Run tests with UI (interactive)
npm run test:ui

Run specific tests:

# Run specific test file
npm test -- App.test.jsx

# Run tests matching pattern
npm test -- --grep "slider"

# Run in watch mode for specific file
npm test -- App.test.jsx --watch

View coverage report:

# Generate coverage report
npm run test:coverage

# Coverage report will be in coverage/index.html

Docker Tests

Run tests inside Docker containers:

Backend:

# Run tests in backend container
docker-compose exec backend pytest

# Run with coverage
docker-compose exec backend pytest --cov=. --cov-report=term-missing

Worker (same as backend):

docker-compose exec worker1 pytest

Continuous Integration

For CI/CD pipelines, use these commands:

Backend:

cd backend
pip install -r requirements.txt -r requirements-test.txt
pytest --cov=. --cov-report=xml --cov-report=term-missing

Frontend:

cd frontend
npm ci
npm run test:coverage

Test Structure

Backend Tests (backend/tests/):

tests/
├── __init__.py
├── conftest.py                 # Pytest fixtures and configuration
├── test_file_handler.py        # File validation tests
├── test_audio_processor.py     # Audio processing tests
└── test_api.py                 # API endpoint tests

Frontend Tests (frontend/src/):

src/
├── App.test.jsx                # Main component tests
├── api.test.js                 # API client tests
└── setupTests.js               # Test configuration

Writing New Tests

Backend Test Example:

import pytest
from utils.file_handler import FileHandler

def test_validate_file_extension():
    handler = FileHandler(max_size_bytes=50*1024*1024,
                         allowed_extensions={'mp3', 'wav'})

    is_valid, error = handler.validate_file_extension('audio.mp3')
    assert is_valid is True
    assert error is None

Frontend Test Example:

import { render, screen, fireEvent } from '@testing-library/react';
import App from './App';

it('updates noise strength when slider is moved', () => {
  render(<App />);

  const slider = screen.getByRole('slider');
  fireEvent.change(slider, { target: { value: '10' } });

  expect(slider).toHaveValue('10');
});

Code Structure

OpenVoice/
├── backend/
│   ├── main.py                 # FastAPI application
│   ├── celery_worker.py        # Celery tasks
│   ├── config.py               # Configuration
│   ├── requirements.txt        # Python dependencies
│   └── utils/
│       ├── audio_processor.py  # DeepFilterNet integration
│       ├── file_handler.py     # File validation
│       └── file_cleanup.py     # Automatic cleanup
├── frontend/
│   ├── src/
│   │   ├── App.jsx            # Main UI component
│   │   ├── api.js             # API client
│   │   └── main.jsx           # Entry point
│   ├── package.json
│   └── vite.config.js
├── docker-compose.yml          # Container orchestration
├── Dockerfile.backend          # Backend container
├── Dockerfile.frontend         # Frontend container
├── Dockerfile.worker           # Worker container
└── nginx.conf                  # Nginx configuration

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

DeepFilterNet: https://github.com/Rikorose/DeepFilterNet
FastAPI: https://fastapi.tiangolo.com/
React: https://react.dev/
Celery: https://docs.celeryq.dev/

Privacy Policy

OpenVoice is designed with privacy in mind:

No User Accounts: No registration or authentication required
No Data Storage: Files are automatically deleted after 10 minutes
No Tracking: We don't track or analyze user behavior
Secure Processing: All processing happens on secure servers
Local Deployment: You can self-host for complete control

Support

For issues, questions, or contributions:

GitHub Issues: https://github.com/fzheng/OpenVoice/issues

Built with ❤️ using DeepFilterNet AI

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.claude		.claude
backend		backend
frontend		frontend
tests		tests
.gitignore		.gitignore
BUGFIX_SUMMARY.md		BUGFIX_SUMMARY.md
Dockerfile.backend		Dockerfile.backend
Dockerfile.frontend		Dockerfile.frontend
Dockerfile.worker		Dockerfile.worker
LICENSE		LICENSE
Makefile		Makefile
PROJECT_SUMMARY.md		PROJECT_SUMMARY.md
QUICKSTART.md		QUICKSTART.md
QUICK_TEST_GUIDE.md		QUICK_TEST_GUIDE.md
README.md		README.md
RUN_TESTS.md		RUN_TESTS.md
TESTING.md		TESTING.md
TESTS_FINAL_SUMMARY.md		TESTS_FINAL_SUMMARY.md
TEST_COMMANDS.md		TEST_COMMANDS.md
TEST_RESULTS.md		TEST_RESULTS.md
docker-compose.yml		docker-compose.yml
nginx.conf		nginx.conf
package.json		package.json

License

fzheng/OpenVoice

Folders and files

Latest commit

History

Repository files navigation

OpenVoice - AI-Powered Audio Enhancement

Features

Core Functionality

Security

User Experience

Technology Stack

Backend

Frontend

Infrastructure

Installation

Prerequisites

Method 1: Docker Compose (Recommended)

Method 2: Local Development Setup

Backend Setup

Frontend Setup

Configuration

Environment Variables

Optional: ClamAV Virus Scanning

API Documentation

Endpoints

Upload Audio File

Get Task Status

Download Enhanced Audio

Delete Files

Get Queue Status

Health Check

Architecture

Usage

Performance

Troubleshooting

Backend won't start

Worker won't start

Frontend can't connect to backend

Processing fails

Large file uploads fail

Development

Running Tests

Backend Tests

Frontend Tests

Docker Tests

Continuous Integration

Test Structure

Writing New Tests

Code Structure

Contributing

License

Acknowledgments

Privacy Policy

Support

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages