This system uses Retrieval-Augmented Generation (RAG) to answer questions about educational video content. It processes MP4 videos, creates embeddings, and provides intelligent responses based on the video content.
- OpenAI API Key: Get your API key from https://platform.openai.com/api-keys
- Create a `.env` file in the project root
- Add your API key to the file:

```
OPENAI_API_KEY=your_api_key_here
```
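If you'd rather not add a dependency for reading the `.env` file, the key can be loaded with a few lines of standard-library Python. This is only a sketch (the function name `load_dotenv_minimal` is our own, not part of the project); many projects instead use `load_dotenv()` from the python-dotenv package.

```python
import os

def load_dotenv_minimal(path=".env"):
    """Minimal .env loader: reads KEY=VALUE lines and sets any
    variables not already present in the environment."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blanks, comments, and lines without an '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```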
- Python Dependencies: Install the required packages:

```
pip install -r requirements.txt
```
- Prepare Video Files: Place your MP4 educational videos in the `learning_videos/` directory
- Process Videos: Convert MP4 files to JSON transcripts:

```
python mp4_to_json.py
```
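The transcription step likely boils down to a Whisper call plus some JSON shaping. The sketch below assumes the openai-whisper package; the helper names (`segments_to_json`, `transcribe_to_json`) and the exact JSON schema are our own illustration, not necessarily what `mp4_to_json.py` produces.

```python
import json

def segments_to_json(video_name, segments):
    """Shape Whisper segments into transcript records with text
    plus start/end timestamps in seconds."""
    return {
        "video": video_name,
        "segments": [
            {"start": s["start"], "end": s["end"], "text": s["text"].strip()}
            for s in segments
        ],
    }

def transcribe_to_json(mp4_path, out_path):
    # Imported lazily so the helper above works without Whisper installed.
    import whisper  # pip install openai-whisper
    model = whisper.load_model("large-v2")
    result = model.transcribe(mp4_path)
    with open(out_path, "w") as f:
        json.dump(segments_to_json(mp4_path, result["segments"]), f, indent=2)
```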
- Create Embeddings: Generate embeddings for the video content:

```
python preprocess.py
```
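Conceptually, this step maps each transcript chunk to an embedding vector and saves the result. A minimal sketch, with the embedding call injected as a function so it can be swapped for a real OpenAI call (`build_index` and `save_index` are hypothetical helper names, not the actual functions in `preprocess.py`):

```python
def build_index(chunks, embed_fn):
    """Pair each transcript chunk with its embedding vector.
    `embed_fn` maps a string to a list of floats."""
    return [{"text": c, "embedding": embed_fn(c)} for c in chunks]

def save_index(index, path="embeddings.joblib"):
    import joblib  # assumed dependency, matching embeddings.joblib
    joblib.dump(index, path)

# With the official openai package (v1+), embed_fn could look like:
#   client = openai.OpenAI()
#   embed_fn = lambda text: client.embeddings.create(
#       model="text-embedding-3-small", input=text).data[0].embedding
```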
- Ask Questions: Start the interactive Q&A system:

```
python process_incoming.py
```
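Under the hood, answering a question comes down to embedding the query and ranking the stored chunks by similarity. A dependency-free sketch of that retrieval step (the index format mirrors the hypothetical one above; real implementations often vectorize this with NumPy):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=3):
    """Return the k chunks whose embeddings are most similar
    to the query embedding."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["embedding"]),
                    reverse=True)
    return ranked[:k]
```

The retrieved chunks (with their timestamps) are then passed to the chat model as context for the final answer.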
- `mp4_to_json.py`: Converts MP4 videos to JSON transcripts using Whisper
- `preprocess.py`: Creates embeddings from video transcripts
- `process_incoming.py`: Main Q&A interface
- `learning_videos/`: Directory containing MP4 video files
- `jsons/`: Directory containing JSON transcript files
- `embeddings.joblib`: Precomputed embeddings for fast retrieval
- `requirements.txt`: Python dependencies
- Run the system in the correct order:

```
# Step 1: Convert videos to transcripts
python mp4_to_json.py

# Step 2: Create embeddings
python preprocess.py

# Step 3: Ask questions
python process_incoming.py
```
- The system will prompt you to ask questions about the video content
- It will provide relevant video segments and timestamps for your questions
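Since segment timestamps come back as seconds, displaying them usually involves a small formatting helper like the one below (`fmt_timestamp` is an illustrative name, not a function from the project):

```python
def fmt_timestamp(seconds):
    """Render a segment time in seconds as MM:SS, or H:MM:SS
    for videos longer than an hour."""
    total = int(seconds)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}" if h else f"{m:02d}:{s:02d}"
```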
You can customize the system by editing the `.env` file:

```
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_CHAT_MODEL=gpt-3.5-turbo
OPENAI_MAX_TOKENS=1000
OPENAI_TEMPERATURE=0.7
```

- OpenAI API errors: Check your API key and billing status
- NumPy compatibility issues: Make sure you have NumPy < 2.0 installed
- File not found errors: Check that you've run the preprocessing steps in order
- Memory issues: For large video files, consider using smaller Whisper models
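The `.env` settings above would typically be read with fallbacks to the documented defaults. A minimal sketch (`load_settings` is a hypothetical helper, not necessarily how the scripts do it):

```python
import os

def load_settings():
    """Read the .env-driven settings, falling back to the
    defaults shown in the configuration section."""
    return {
        "api_key": os.getenv("OPENAI_API_KEY"),
        "embedding_model": os.getenv("OPENAI_EMBEDDING_MODEL",
                                     "text-embedding-3-small"),
        "chat_model": os.getenv("OPENAI_CHAT_MODEL", "gpt-3.5-turbo"),
        "max_tokens": int(os.getenv("OPENAI_MAX_TOKENS", "1000")),
        "temperature": float(os.getenv("OPENAI_TEMPERATURE", "0.7")),
    }
```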
- The system uses the `large-v2` Whisper model by default for transcription
- Embeddings are created using OpenAI's `text-embedding-3-small` model
- Responses are generated using OpenAI's `gpt-3.5-turbo` model
- All API calls include proper error handling and retry logic
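Retry logic around API calls usually means exponential backoff with jitter. A generic stand-in sketch (the `with_retries` wrapper is our own illustration, not the project's actual implementation; libraries like tenacity offer the same idea off the shelf):

```python
import random
import time

def with_retries(call, attempts=3, base_delay=1.0, retry_on=(Exception,)):
    """Run `call`, retrying with jittered exponential backoff and
    re-raising the error after the final attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise
            # Backoff: base, 2*base, 4*base, ... plus a little jitter.
            time.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
```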