🎥 Video to Transcript Converter 🌐

A Flask web application that converts video files into translated transcripts using AI-powered speech recognition and translation.

✨ Features

🎤 Extract audio from MP4 videos
🔉 Convert audio to 16kHz WAV format (optimal for speech recognition)
🗣️ Transcribe audio using Groq's Whisper model
🌍 Translate transcripts to multiple languages using Gemma3 AI
💾 Download transcripts as JSON files
🎨 Modern, responsive UI with progress tracking
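
The audio extraction and 16kHz conversion steps can be sketched as a single FFmpeg call. This is a minimal illustration, not the project's actual code: the repository does this in two steps (`video_to_wav.py` and `wav_to_16kwav.py`), and the function names, the mono downmix, and the 16-bit PCM codec below are assumptions.

```python
import subprocess
from typing import List


def build_ffmpeg_command(video_path: str, wav_path: str) -> List[str]:
    """Build an FFmpeg command that writes 16kHz WAV audio from a video.

    Hypothetical helper: the mono and 16-bit PCM settings are assumptions.
    """
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", str(video_path),
        "-vn",                # drop the video stream
        "-ac", "1",           # downmix to mono (assumption)
        "-ar", "16000",       # resample to 16kHz
        "-c:a", "pcm_s16le",  # 16-bit PCM, the usual WAV encoding
        str(wav_path),
    ]


def extract_audio(video_path: str, wav_path: str) -> None:
    """Run FFmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_command(video_path, wav_path), check=True)
```

Calling `extract_audio("talk.mp4", "talk.wav")` requires FFmpeg on your PATH.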

📋 Prerequisites

Before you begin, ensure you have:

Python 3.8+
Ollama running locally with the Gemma3 model
Groq API key (for Whisper transcription)
FFmpeg installed (for audio processing)
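
A quick way to check some of these prerequisites is a short script like the one below. It is a sketch only: the function name is hypothetical, and it assumes the `ffmpeg` and `ollama` executables are expected on your PATH. It does not verify the Groq API key or that the Gemma3 model has been pulled.

```python
import shutil
import sys
from typing import List, Sequence, Tuple


def missing_prerequisites(
    min_python: Tuple[int, int] = (3, 8),
    tools: Sequence[str] = ("ffmpeg", "ollama"),
) -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d+ required, found %d.%d"
            % (min_python + tuple(sys.version_info[:2]))
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```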

🛠️ Installation

Clone the repository:

git clone https://github.com/yourusername/video-to-transcript.git
cd video-to-transcript

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install dependencies:
```
pip install -r requirements.txt
```

Set up the Groq API key:

Set your Groq API key in the functions/convertedwav_to_transcript.py file.
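
As an alternative to writing the key into the source file, you could read it from an environment variable so it stays out of version control. The sketch below is not what the project does today. The helper name is hypothetical, and using `GROQ_API_KEY` as the variable name is an assumption.

```python
import os
from typing import Mapping, Optional


def load_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper; GROQ_API_KEY is an assumed variable name.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set")
    return key
```

You would then export the variable in your shell before starting the app and pass the returned value to wherever the key is currently hardcoded.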

Install FFmpeg: On Debian or Ubuntu, open your terminal and run:
```
sudo apt-get install ffmpeg
```
Download the Gemma3 model: First install Ollama on your system, then open your terminal and run:
```
ollama pull gemma3:12b
```
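
Once the model is pulled, a translation request to the local Ollama server might look like the sketch below. It uses Ollama's `/api/generate` endpoint on the default port 11434. The prompt wording and function names are assumptions, and the project's own translation code in `functions/transcript_lan_covert.py` may work differently.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default address


def build_translation_request(text: str, target_language: str,
                              model: str = "gemma3:12b") -> dict:
    """Build the JSON payload for a non-streaming generate call.

    The prompt wording is an assumption made for illustration.
    """
    prompt = (
        f"Translate the following text to {target_language}. "
        f"Return only the translation.\n\n{text}"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def translate(text: str, target_language: str) -> str:
    """Send the request to a running Ollama server and return its reply."""
    payload = json.dumps(build_translation_request(text, target_language))
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]
```

`translate("Hello", "Bengali")` only works while Ollama is running locally.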

🚀 Running the Application

Start the Flask server:
```
python app.py
```
Access the application: Open your browser and navigate to:
```
http://localhost:5000
```

🖥️ Usage

Upload an MP4 video file
Select target language for translation
Click "Process Video"
Wait for processing to complete
Download the JSON transcript file
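
After downloading, you can inspect the transcript file from Python. The exact JSON layout is not documented here, so this sketch makes no assumptions about field names and only reports the top-level structure. The function name is hypothetical.

```python
import json
from pathlib import Path


def summarize_transcript(path) -> dict:
    """Load a transcript JSON file and describe its top-level shape."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {"type": "object", "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    return {"type": type(data).__name__}
```

Run it on a downloaded file to see which keys are available before writing code that depends on them.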

📂 Project Structure

video-to-transcript/
├── app.py                # Main Flask application
├── main.py               # Core processing logic
├── functions/
│   ├── video_to_wav.py   # Video to WAV conversion
│   ├── wav_to_16kwav.py  # Audio format conversion
│   ├── convertedwav_to_transcript.py  # Speech recognition
│   └── transcript_lan_covert.py       # Translation
├── templates/
│   └── index.html        # Frontend interface
├── static/
│   ├── script.js         # Client-side JavaScript
│   └── style.css         # Styling
└── outputs/              # Generated transcripts

🌐 Supported Languages

The application supports translation to:

Bengali (default)
English
Hindi
Spanish
French
Portuguese
German
Russian
Italian
Dutch
Chinese (Simplified)
Japanese
Korean
Arabic

🤝 Contributing

Contributions are welcome! Please follow these steps:

Fork the repository
Create a new branch (git checkout -b feature-branch)
Commit your changes (git commit -m 'Add new feature')
Push to the branch (git push origin feature-branch)
Open a Pull Request

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

⚠️ Troubleshooting

Audio processing fails: Ensure FFmpeg is installed and in your PATH
Translation errors: Verify Ollama is running and Gemma3 model is downloaded
API errors: Check your Groq API key in the functions/convertedwav_to_transcript.py file
File permission issues: Ensure the uploads and outputs directories are writable
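
To rule out the file permission case, a small check like the following can help. It is a sketch with a hypothetical function name, and it assumes the `uploads` and `outputs` directories live in the directory you pass as `base` (the project root by default).

```python
import os
from pathlib import Path
from typing import Dict, Sequence


def check_writable(directories: Sequence[str] = ("uploads", "outputs"),
                   base: str = ".") -> Dict[str, bool]:
    """Map each directory name to True if it exists and is writable."""
    results = {}
    for name in directories:
        path = Path(base) / name
        # A missing directory is reported as not writable
        results[name] = path.is_dir() and os.access(path, os.W_OK)
    return results


if __name__ == "__main__":
    for name, ok in check_writable().items():
        print(f"{name}: {'ok' if ok else 'missing or not writable'}")
```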

📧 Contact

For support or questions, please contact sbose3739@gmail.com
