Transform YouTube playlists into structured study material.
An automated tool that downloads videos, extracts subtitles (native or AI-powered), and generates consolidated educational material — all in a single command.
Watching hours of educational videos is time-consuming. This project solves this problem by:
- Downloading videos or audio from YouTube playlists
- Obtaining subtitles automatically (prioritizes native subtitles; uses Whisper AI as fallback)
- Generating consolidated study material via GPT: summaries, key concepts, practical examples, and a glossary
Result: A complete Markdown document that replaces the need to watch the videos.
"Transform 10 hours of video into 30 minutes of focused reading."
The generated material is not a simple summary — it's a complete educational document structured by AI to maximize your learning:
📚 Study Material - [Playlist Name]
├── 📌 Executive Summary
│ └── Overview of all content in a few paragraphs
├── 🔑 Key Concepts
│ └── Definitions, context, relationships, and examples for each concept
├── 🎬 Content by Video
│ ├── Individual summary of each video
│ ├── Tips and best practices
│ └── Detailed analysis preserving the original sequence
├── 💡 Examples and Practical Cases
│ └── Code, diagrams, data models, APIs
├── ✏️ Exercises and Action Points
│ ├── Suggested projects for applying concepts
│ └── Practical activities for reinforcement
├── 📖 Technical Glossary
│ └── Important terms with clear definitions
├── 📚 References and Resources
│ └── Links for deeper learning
└── 📎 Appendices
  └── Templates, snippets, comparison tables, described flowcharts
| Problem | Solution |
|---|---|
| ⏰ Lack of time | Absorb hours of video content in minutes |
| 🔄 Difficult review | Searchable document — find any concept instantly |
| 📝 Scattered notes | Everything consolidated in a single Markdown file |
| 🌍 Language | Generate material in your language, even from foreign videos |
| 💾 Offline | Study without internet, print, export to PDF |
| 🎓 Active learning | Exercises and practical examples included |
- Students: Exam preparation from recorded classes
- Professionals: Quick training on new technologies
- Companies: Internal training documentation
- Content creators: Foundation for articles, posts, and derivative courses
- Researchers: Systematic analysis of video content
From a playlist of 2 videos of about 2 minutes each (https://www.youtube.com/watch?v=HA414QD3qFw and https://www.youtube.com/watch?v=rNu1gUDnkuY), the system generated:
- 738 lines of structured content
- 13 key concepts with complete definitions
- 1 detailed case study (ClickTravel) with architecture and APIs
- Practical exercises and action checklist
- Glossary with 20+ technical terms
Cost: ~$0.03 (GPT) | Time: ~2 minutes | Value: Priceless ✨
| Feature | Description |
|---|---|
| 📥 Intelligent download | Downloads videos/audio with rate-limiting control |
| 📝 Automatic subtitles | Prioritizes YouTube subtitles; uses Whisper AI if unavailable |
| 🔄 Checkpoint/Resume | Interrupt and resume at any time (safe Ctrl+C) |
| 📚 Study material | Generates complete educational document via GPT |
| 🌍 Intelligent multi-language | Detects OS language, selects subtitles by priority, avoids duplicates |
| 🎵 Audio mode | Option to download audio only (space savings) |
- Python 3.10 or higher
- FFmpeg and ffprobe installed and in PATH
- OpenAI API Key (for Whisper transcription and material generation)
- Get it at: https://platform.openai.com/account/api-keys (step-by-step guide below)
- Access OpenAI Platform.
- Log in or create an account.
- In the dashboard, go to "API Keys" in the side menu.
- Click "Create new secret key".
- Copy the generated key (starts with "sk-...") and save it in a secure location.
- Use this key to configure the OPENAI_API_KEY environment variable or pass it via the --api-key parameter.
- Linux/macOS:
export OPENAI_API_KEY="sk-..."
- Windows CMD:
set OPENAI_API_KEY=sk-...
- Windows PowerShell:
$env:OPENAI_API_KEY="sk-..."
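The key can come from either the --api-key parameter or the OPENAI_API_KEY environment variable. A minimal sketch of how that lookup could work is shown here; `resolve_api_key` is a hypothetical helper written for illustration, not the project's actual code, and letting the command-line value win over the environment is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    cli_value: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the API key, preferring the CLI value over the environment.

    Hypothetical helper: the precedence order is an assumption.
    """
    key = cli_value or environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "API key not found: set OPENAI_API_KEY or pass --api-key"
        )
    return key
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.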
Windows (via winget):
winget install FFmpeg.FFmpeg
Windows (via Chocolatey):
choco install ffmpeg
macOS:
brew install ffmpeg
Linux (Debian/Ubuntu):
sudo apt install ffmpeg
- Clone the repository:
git clone https://github.com/your-username/yt-playlist-summary.git
cd yt-playlist-summary
- Create a virtual environment (recommended):
python -m venv venv
venv\Scripts\activate # Windows
# or
source venv/bin/activate # Linux/macOS
- Install dependencies:
pip install -r requirements.txt
- Configure the API key:
# Option 1: Environment variable
export OPENAI_API_KEY="sk-..." # Linux/macOS
set OPENAI_API_KEY=sk-... # Windows CMD
$env:OPENAI_API_KEY="sk-..." # Windows PowerShell
# Option 2: .env file at project root
echo OPENAI_API_KEY=sk-... > .env
python yt_playlist_summary.py --url "PLAYLIST_URL"
What happens by default:
- ✅ Downloads all videos in the playlist
- ✅ Searches for native subtitles (pt-BR, en)
- ✅ If no subtitles found → transcribes via Whisper AI
- ✅ Generates consolidated study material
- ✅ Checkpoint enabled (can interrupt and resume)
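The subtitle step in the list above (native subtitles first, Whisper only as a fallback) can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: `find_native_subtitle` and `obtain_subtitle` are hypothetical names, the `<video>.<lang>.srt` naming is inferred from the examples in this README, and `transcribe` stands in for whatever function performs the Whisper call.

```python
from pathlib import Path
from typing import Callable, List, Optional, Tuple


def find_native_subtitle(
    subtitle_dir: Path, video_stem: str, languages: List[str]
) -> Optional[Path]:
    """Return the first existing '<stem>.<lang>.srt', in priority order."""
    for lang in languages:
        candidate = subtitle_dir / f"{video_stem}.{lang}.srt"
        if candidate.exists():
            return candidate
    return None


def obtain_subtitle(
    subtitle_dir: Path,
    video_stem: str,
    languages: List[str],
    transcribe: Callable[[str], Path],
    prefer_existing: bool = True,
) -> Tuple[Path, str]:
    """Return (subtitle path, source), where source is 'native' or 'whisper'."""
    if prefer_existing:
        native = find_native_subtitle(subtitle_dir, video_stem, languages)
        if native is not None:
            return native, "native"
    # No usable native subtitle (or the caller forced Whisper): transcribe.
    return transcribe(video_stem), "whisper"
```

Setting `prefer_existing=False` mirrors what --no-prefer-existing-subtitles is described as doing: the native lookup is skipped entirely.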
# Process complete playlist (default behavior)
python yt_playlist_summary.py --url "https://youtube.com/playlist?list=..."
# Interactive mode (confirms before each step)
python yt_playlist_summary.py --url "URL" --interactive
# Audio only (space savings)
python yt_playlist_summary.py --url "URL" --audio-only
# Force Whisper usage (ignore native subtitles)
python yt_playlist_summary.py --url "URL" --no-prefer-existing-subtitles
# No study material (download + subtitles only)
python yt_playlist_summary.py --url "URL" --no-study-material
# Clear checkpoint and reprocess everything
python yt_playlist_summary.py --url "URL" --clear-checkpoint
# Specify source language for subtitles (priority)
python yt_playlist_summary.py --url "URL" --source-language pt-BR,en
# Study material in English from Portuguese subtitles
python yt_playlist_summary.py --url "URL" --source-language pt-BR --study-language en
# Material in English using English subtitles
python yt_playlist_summary.py --url "URL" --source-language en --study-language en
# Material in Portuguese using English subtitles (automatic translation)
python yt_playlist_summary.py --url "URL" --source-language en --study-language pt
# Force specific language (ignore OS detection)
python yt_playlist_summary.py --url "URL" --source-language ja,en --study-language ja
output/
├── downloads/ # Original videos/audio
├── audio/ # Extracted audio (when needed)
├── converted/ # 64kbps mono audio (for Whisper)
├── subtitles/ # .srt files
├── study_material_*.md # Generated study material
└── .checkpoint_*.json # Progress (for resume)
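The .checkpoint_*.json file in the tree above is what allows a rerun to skip finished videos. The sketch below shows one way such a file could be read and updated. The JSON layout (a single "completed" list) and the function names are assumptions made for illustration; the real checkpoint format may differ.

```python
import json
from pathlib import Path
from typing import List, Set


def load_completed(checkpoint_path: Path) -> Set[str]:
    """Read the set of finished video IDs; empty if missing or unreadable."""
    if not checkpoint_path.exists():
        return set()
    try:
        data = json.loads(checkpoint_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # Assumption for this sketch: a corrupted file means "start over".
        return set()
    if not isinstance(data, dict):
        return set()
    return set(data.get("completed", []))


def mark_completed(checkpoint_path: Path, completed: Set[str], video_id: str) -> None:
    """Record one finished video, writing via a temp file and rename."""
    completed.add(video_id)
    tmp_path = checkpoint_path.with_suffix(".tmp")
    tmp_path.write_text(
        json.dumps({"completed": sorted(completed)}), encoding="utf-8"
    )
    # Renaming over the old file avoids leaving a half-written checkpoint
    # if the process is interrupted during the write.
    tmp_path.replace(checkpoint_path)


def pending(video_ids: List[str], completed: Set[str]) -> List[str]:
    """Videos still to process, in their original playlist order."""
    return [v for v in video_ids if v not in completed]
```

Saving after every video rather than once at the end is what makes an interruption cheap: at most one video's work is repeated.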
| Parameter | Default | Description |
|---|---|---|
| -u, --url | required | Playlist or video URL |
| -k, --api-key | env OPENAI_API_KEY | OpenAI API key |
| -o, --output | ./output | Output directory |
| -l, --language | auto-detect | Language for Whisper transcription |
| -a, --audio-only | False | Download audio only |
| -i, --interactive | False | Interactive mode with confirmations |
| -v, --verbose | False | Detailed logs |
| --subtitle-languages | pt-BR,en | Languages to search for subtitles |
| --download-delay | 5 | Seconds between downloads |
| --keep-original | False | Keep audio without conversion |
| --skip-transcription | False | Skip subtitle step |
| --no-prefer-existing-subtitles | False | Force Whisper (ignore native subtitles) |
| --no-study-material | False | Do not generate study material |
| --source-language | OS language | Source subtitle language(s) (e.g., pt-BR,en) |
| --study-language | OS language | Output material language |
| --no-checkpoint | False | Disable checkpoint |
| --clear-checkpoint | False | Clear checkpoint and restart |
The project saves progress automatically. If interrupted (Ctrl+C), just run the same command again:
# First run - interrupted at video 5/20
python yt_playlist_summary.py --url "URL"
# ^C
# Second run - resumes from video 6
python yt_playlist_summary.py --url "URL"
# 🔄 RESUMING DOWNLOAD
# ✅ Already completed: 5/20
python translate_sub.py \
--input ./output/subtitles/video.pt-BR.srt \
--source pt-BR \
--target en
# Use system defaults (detects OS language)
python generate_study_material.py -s ./output/subtitles
# Specify source and output language
python generate_study_material.py \
--subtitle-dir ./output/subtitles \
--source-language pt-BR,en \
--output-language pt
# Interactive mode (asks for languages)
python generate_study_material.py -s ./output/subtitles -i
# Consolidate only (without GPT)
python generate_study_material.py -s ./output/subtitles --skip-gpt
python mywhisper.py --input audio.mp3
python rename_from_checkpoint.py \
--checkpoint output/.checkpoint_abc123.json

| Operation | Approximate Cost |
|---|---|
| Whisper (transcription) | ~$0.006 per minute of audio |
| GPT (study material) | ~$0.02-0.05 per typical playlist (5-10 videos) |
Tip: Keep the default behavior (prefer existing subtitles, i.e. do not pass --no-prefer-existing-subtitles) to save money: native subtitles are free!
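The Whisper rate in the table above makes a rough estimate easy to compute before running a long playlist. The snippet below is only a back-of-the-envelope sketch: it uses the approximate per-minute figure quoted here, ignores any billing granularity, and `estimate_whisper_cost` is a hypothetical helper name.

```python
from typing import List

# Approximate rate quoted in the cost table; actual pricing may change.
WHISPER_USD_PER_MINUTE = 0.006


def estimate_whisper_cost(durations_seconds: List[float]) -> float:
    """Estimate transcription cost in USD for a list of audio durations."""
    total_minutes = sum(durations_seconds) / 60
    return round(total_minutes * WHISPER_USD_PER_MINUTE, 4)


# Two 10-minute videos without native subtitles: 20 min * $0.006 = $0.12
print(estimate_whisper_cost([600, 600]))
```

Videos that already have native subtitles should be left out of the list, since they never reach Whisper under the default settings.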
yt-playlist-summary/
├── yt_playlist_summary.py # Main pipeline orchestrator
├── mywhisper.py # Whisper transcription + cache
├── generate_study_material.py # Educational material generation
├── language_utils.py # OS language detection and intelligent selection
├── checkpoint_manager.py # Checkpoint/resume system
├── translate_sub.py # SRT translation via GPT
├── rename_from_checkpoint.py # Renaming utility
├── requirements.txt # Python dependencies
└── README.md
The system automatically detects your operating system language and configures defaults:
| Portuguese OS | English OS |
|---|---|
| Source: pt-BR, pt, und | Source: en-US, en, und |
| Output: pt | Output: en |
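The defaults in the table above could be derived from the OS locale along these lines. This is a sketch under assumptions rather than the contents of language_utils.py: the function names are hypothetical, only the two rows documented above are included, and falling back to the English defaults for any other locale is an assumption.

```python
import locale
from typing import Dict, List, Optional, Tuple

# The two rows documented in the table above: (source priority, output language).
DEFAULTS: Dict[str, Tuple[List[str], str]] = {
    "pt": (["pt-BR", "pt", "und"], "pt"),
    "en": (["en-US", "en", "und"], "en"),
}


def defaults_for_locale(locale_name: Optional[str]) -> Tuple[List[str], str]:
    """Map a locale name such as 'pt_BR' to (source languages, output language)."""
    primary = (locale_name or "").replace("-", "_").split("_")[0].lower()
    # Assumption: unknown or missing locales fall back to the English defaults.
    return DEFAULTS.get(primary, DEFAULTS["en"])


def detect_defaults() -> Tuple[List[str], str]:
    """Use the current process locale; the name may be None on some systems."""
    return defaults_for_locale(locale.getlocale()[0])
```

On some platforms the locale name may not follow the `pt_BR` form, so a real implementation would likely need extra normalization; explicit --source-language and --study-language values sidestep detection entirely.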
- Groups subtitles by video — identifies index by filename
- Selects one subtitle per video — uses configured language priority
- Avoids duplicates — saves GPT tokens!
Practical example:
Subtitles/
├── 1. Intro.en.srt
├── 1. Intro.pt-BR.srt ← selected (pt-BR has priority)
├── 2. Review.en.srt
└── 2. Review.pt-BR.srt ← selected
Result: 2 subtitles processed instead of 4!
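The grouping and selection shown in this example can be sketched as follows. The regular expression assumes filenames of the form `<index>. <title>.<lang>.srt`, as in the listing above; `select_subtitles` is a hypothetical name, and skipping a video when none of its subtitles match the priority list is an assumption of this sketch.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches e.g. "1. Intro.pt-BR.srt" -> index "1", lang "pt-BR".
SUBTITLE_RE = re.compile(
    r"^(?P<index>\d+)\.\s.*\.(?P<lang>[A-Za-z]{2,3}(?:-[A-Za-z0-9]+)?)\.srt$"
)


def select_subtitles(filenames: List[str], priority: List[str]) -> List[str]:
    """Pick one subtitle file per video index, following language priority."""
    by_video: Dict[int, Dict[str, str]] = defaultdict(dict)
    for name in filenames:
        match = SUBTITLE_RE.match(name)
        if match:
            by_video[int(match["index"])][match["lang"]] = name

    selected = []
    for index in sorted(by_video):
        langs = by_video[index]
        chosen = next((langs[lang] for lang in priority if lang in langs), None)
        if chosen is not None:  # assumption: skip videos with no matching language
            selected.append(chosen)
    return selected


files = [
    "1. Intro.en.srt",
    "1. Intro.pt-BR.srt",
    "2. Review.en.srt",
    "2. Review.pt-BR.srt",
]
print(select_subtitles(files, ["pt-BR", "en"]))
```

With the priority reversed to `["en", "pt-BR"]`, the same call would return the two English files instead.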
pt, pt-BR, en, en-US, es, fr, de, it, ja, zh, ko, ru, ar, hi
| Problem | Solution |
|---|---|
| FFmpeg not found | Install FFmpeg and add to PATH |
| API key not found | Configure OPENAI_API_KEY via env or --api-key |
| Rate-limiting error | Increase --download-delay (e.g., 10 or 15) |
| Private/unavailable video | Script automatically skips and continues |
| Corrupted checkpoint | Use --clear-checkpoint to restart |
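For the "FFmpeg not found" row, a quick way to confirm what is missing from PATH is a check like the one below. It is a standalone sketch, not part of the project; `missing_tools` is a hypothetical helper built on the standard library's `shutil.which`.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    tools: Iterable[str] = ("ffmpeg", "ffprobe"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names of required executables that are not on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found in PATH:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available.")
```

Checking both executables matters because the requirements list ffprobe separately, and some installs provide only ffmpeg.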
If this project has already saved you hours of YouTube videos, imagine what it does with a coffee. Support a developer who trades sleep for lines of code — and help this project continue preventing you from watching 3-hour lectures at 12 different speeds.
If you enjoyed it, consider buying me a coffee. I promise to spend it on caffeine… and maybe more features.
Contributions are welcome! Please maintain separation of responsibilities:
- yt_playlist_summary.py → download and preprocessing
- mywhisper.py → transcription and subtitle manipulation
- New modules → independent features
Made with ❤️ to make learning more efficient.