High-fidelity video pipeline combining Real-ESRGAN upscaling, RIFE interpolation, and Whisper captioning. Features OpenCV analysis, adaptive HLS packaging, and VMAF verification for automated, production-grade enhancement.

Lucera: Advanced Video Processing Pipeline

Lucera is an automated, high-fidelity video processing pipeline engineered for content optimization and delivery. It orchestrates a modular workflow combining state-of-the-art deep learning models for enhancement—including Real-ESRGAN for super-resolution and RIFE for motion interpolation—with OpenAI's Whisper for highly accurate captioning. The system features a robust OpenCV-based analysis engine for scene detection and quality metrics (blur, noise, motion), culminating in FFmpeg-driven adaptive bitrate HLS packaging. Integrity is verified via industry-standard VMAF perceptual quality assessment, all encapsulated within a Dockerized environment for reproducible deployment.

Pipeline Architecture

The following diagram illustrates the data flow through the Lucera pipeline:

```mermaid
graph TD
    A[Input Video] --> B[Analysis Engine]
    A --> C[Caption Generator]
    A --> D[Enhancement Pipeline]

    subgraph "Enhancement Stages"
    D --> D1[Noise Detection]
    D1 --> D2[Denoising]
    D2 --> D3[Frame Interpolation]
    D3 --> D4[Upscaling Super Resolution]
    end

    D4 --> E{Quality Verification}
    E -->|Pass| F[HLS Packager]

    B --> G[Final Reporting]
    C --> G
    E -->|VMAF Scores| G
    F --> G
```

Features

Lucera provides a suite of advanced features categorized by their stage in the pipeline:

1. Video Analysis

  • Metadata Extraction: Detailed technical specifications (codec, bitrate, resolution).
  • Scene Detection: Intelligent identification of scene changes.
  • Quality Metrics: Analysis of blur, noise, motion, and complexity levels.
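The README doesn't spell out how the blur metric is computed; a common choice in OpenCV-based pipelines is the variance of the Laplacian (`cv2.Laplacian(gray, cv2.CV_64F).var()`). A minimal NumPy-only sketch of that metric, with an illustrative function name rather than Lucera's actual API:

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Blur metric: variance of the 3x3 Laplacian response.

    Low values indicate a blurry frame; sharp edges and noise
    produce strong Laplacian responses and a high variance.
    """
    # Same 3x3 kernel OpenCV uses for cv2.Laplacian by default.
    k = np.array([[0,  1, 0],
                  [1, -4, 1],
                  [0,  1, 0]], dtype=np.float64)
    g = gray.astype(np.float64)
    h, w = g.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):          # valid (unpadded) 3x3 convolution
        for j in range(3):
            out += k[i, j] * g[i:i + h - 2, j:j + w - 2]
    return float(out.var())

flat = np.full((64, 64), 128, dtype=np.uint8)        # perfectly flat frame
sharp = np.random.default_rng(0).integers(0, 256, (64, 64), dtype=np.uint8)
print(laplacian_variance(flat), laplacian_variance(sharp))
```

A flat frame scores exactly zero, while a frame full of high-frequency detail scores high; pipelines typically threshold this value per scene.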

2. AI Captioning

  • Whisper Integration: Uses OpenAI's Whisper model for high-accuracy speech-to-text.
  • Multi-Format Output: Generates both SRT (SubRip) and VTT (WebVTT) subtitle formats.
  • Audio Processing: Handles audio extraction and management automatically.
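Whisper returns timestamped segments, and the SRT format the caption generator emits has a fixed timestamp shape (`HH:MM:SS,mmm`). A small sketch of how such segments can be rendered as SRT; the function names are illustrative, not Lucera's actual interface:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render Whisper-style segments (dicts with start/end/text) as SRT."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

print(to_srt([{"start": 0.0, "end": 2.5, "text": " Hello"}]))
```

WebVTT output differs mainly in the header (`WEBVTT`) and its use of `.` instead of `,` as the millisecond separator.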

3. Video Enhancement

  • Upscaling: AI-based super-resolution to increase video clarity and resolution (4x scale).
  • Frame Interpolation: Increases frame rate (e.g., to 60fps) for smoother motion using RIFE.
  • Denoising: Adaptive noise reduction to clean up grainy footage while preserving edges.
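Since the enhancement stages shell out to external `ncnn-vulkan` binaries (see Prerequisites), the wrapper presumably locates the tool on PATH and skips the step if it is missing. A sketch of building such an invocation with the tool's documented flags; the function name and model choice are illustrative assumptions:

```python
import shutil
from pathlib import Path
from typing import List, Optional

def build_upscale_cmd(frame_dir: Path, out_dir: Path,
                      scale: int = 4) -> Optional[List[str]]:
    """Build a realesrgan-ncnn-vulkan invocation.

    Returns None when the binary is not on PATH, mirroring the
    documented behavior of skipping enhancement steps when the
    external tools are missing.
    """
    binary = shutil.which("realesrgan-ncnn-vulkan")
    if binary is None:
        return None
    return [
        binary,
        "-i", str(frame_dir),        # input frame directory
        "-o", str(out_dir),          # output frame directory
        "-n", "realesrgan-x4plus",   # model name
        "-s", str(scale),            # upscale factor
    ]

cmd = build_upscale_cmd(Path("frames"), Path("frames_up"))
```

The resulting list can be passed straight to `subprocess.run(cmd, check=True)`; `rife-ncnn-vulkan` follows the same `-i`/`-o` pattern for interpolation.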

4. Adaptive Packaging

  • HLS Generation: Creates HTTP Live Streaming (HLS) master and variant playlists.
  • Adaptive Bitrate: Generates multiple quality levels to ensure smooth playback across different network conditions.
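An HLS master playlist is just a text manifest that advertises one variant playlist per quality level, each tagged with its bandwidth and resolution. A minimal sketch of generating one (variant names and bitrates here are examples, not Lucera's actual ladder):

```python
def master_playlist(variants) -> str:
    """Write an HLS master playlist.

    `variants` maps a variant name (used as the .m3u8 filename)
    to a (bandwidth_bps, "WIDTHxHEIGHT") tuple.
    """
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for name, (bandwidth, resolution) in variants.items():
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}"
        )
        lines.append(f"{name}.m3u8")
    return "\n".join(lines) + "\n"

pl = master_playlist({
    "720p": (2_800_000, "1280x720"),
    "1080p": (5_000_000, "1920x1080"),
})
print(pl)
```

The player picks the highest-bandwidth variant its current network conditions support, which is what makes playback adaptive.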

5. Quality Assurance

  • VMAF Verification: Calculates VMAF (Video Multimethod Assessment Fusion), PSNR, and SSIM scores to objectively verify that the enhanced video maintains or improves visual quality compared to the original.
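VMAF scoring is commonly run through FFmpeg's `libvmaf` filter, which takes the distorted (enhanced) input first and the reference second. A sketch of assembling such a command; note that both inputs must share a resolution, so a 4x-upscaled output would need the reference scaled to match first (the helper name and log path are illustrative):

```python
from typing import List

def vmaf_cmd(enhanced: str, reference: str,
             log_path: str = "vmaf.json") -> List[str]:
    """Build an ffmpeg libvmaf comparison command.

    By libvmaf convention the distorted stream is the first input
    and the reference is the second; scores land in `log_path`.
    """
    lavfi = f"libvmaf=log_fmt=json:log_path={log_path}"
    return [
        "ffmpeg",
        "-i", enhanced,    # distorted / enhanced video
        "-i", reference,   # pristine reference
        "-lavfi", lavfi,
        "-f", "null", "-", # decode and score, discard frames
    ]

cmd = vmaf_cmd("enhanced.mp4", "original.mp4")
```

PSNR and SSIM have analogous FFmpeg filters (`psnr`, `ssim`) that can be run the same way.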

Getting Started

Prerequisites

  • Python 3.8+
  • FFmpeg: Must be installed and accessible in your system path.
  • External Binaries (Required for Enhancement):
    • rife-ncnn-vulkan: For frame interpolation.
    • realesrgan-ncnn-vulkan: For upscaling.
    • Note: If these are not found in your PATH, enhancement steps will be skipped.
  • CUDA (Optional): Recommended for faster AI processing.

Installation

  1. Clone the repository:

    git clone https://github.com/PranavViswanathan/Lucera.git
    cd Lucera
  2. Create and activate a virtual environment:

    python -m venv env
    source env/bin/activate  # On Windows: env\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt

Docker Usage

Automated (Recommended)

We provide a shell script to automate the Docker workflow.

  1. Make executable (first time only):

    chmod +x lucera.sh
  2. Run a video:

    ./lucera.sh run path/to/video.mp4
  3. Other commands:

    • ./lucera.sh build: Rebuild the image manually.
    • ./lucera.sh stop: Stop a running process.
    • ./lucera.sh clean: Remove images to free space.

Manual

If you prefer running Docker commands yourself:

  1. Build:

    docker build -t lucera .
  2. Run:

    docker run -v $(pwd):/data lucera /data/video.mp4

Usage

To process a video locally (without Docker), run the src/main.py script:

python src/main.py /path/to/your/video.mp4 --combine /path/to/output_dir

Combine Feature

You can use the --combine flag to merge the enhanced video with the generated subtitles into a single file.

python src/main.py video.mp4 --combine ./final_results

This will create a combined_<video_name>.mp4 file in the specified directory, with soft subtitles embedded.
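Soft-subtitle embedding of this kind is typically a stream copy in FFmpeg, with the subtitles transcoded to the MP4-compatible `mov_text` codec so no video re-encoding is needed. A sketch of such a mux command (the helper name is illustrative, not Lucera's internal API):

```python
from typing import List

def combine_cmd(video: str, srt: str, out: str) -> List[str]:
    """Build an ffmpeg command that muxes an SRT file into an MP4
    as a soft (toggleable) subtitle track without re-encoding."""
    return [
        "ffmpeg",
        "-i", video,
        "-i", srt,
        "-c:v", "copy",       # keep the enhanced video stream as-is
        "-c:a", "copy",       # keep the audio stream as-is
        "-c:s", "mov_text",   # MP4-compatible subtitle codec
        out,
    ]

cmd = combine_cmd("video.mp4", "captions.srt", "combined_video.mp4")
```

Because the audio and video streams are copied, the combine step is fast and lossless regardless of input length.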

Outputs

The pipeline writes its artifacts under src/utility_classes/:

  • Captions: src/utility_classes/caption_results/ (SRT/VTT files)
  • Enhanced Video: src/utility_classes/video_enhancers/output/
  • Analysis Data: src/utility_classes/analysis_results/
  • Final Reports: src/utility_classes/complete_pipeline_results/

Directory Structure

Lucera/
├── src/
│   ├── main.py                 # Entry point
│   └── utility_classes/        # Core modules
│       ├── video_analysis.py   # Analysis engine
│       ├── caption_generation.py # Whisper integration
│       ├── video_enchancers.py # Upscaling/Denoising/Interpolation
│       ├── packaging_generator.py # HLS packaging
│       └── VMAF.py             # Quality verification
├── env/                        # Virtual environment
└── README.md                   # Documentation
