AutoShow automates the processing of audio and video content from various sources, including YouTube videos, playlists, podcast RSS feeds, and local media files. It performs transcription, summarization, and chapter generation using a choice of large language models (LLMs) and transcription services.
The AutoShow workflow includes the following steps:
- The user provides a content input (video URL, playlist, RSS feed, or local file), and front matter is created from the content's metadata.
- The audio is downloaded (if necessary).
- Transcription is performed using the selected service.
- A customizable prompt is inserted with instructions for what the show notes should contain.
- The transcript is processed by the chosen LLM to generate show notes based on the selected prompts.
Key features:
- Support for multiple input types (YouTube links, RSS feeds, local video and audio files)
- Integration with various:
  - LLMs (ChatGPT, Claude, Gemini, DeepSeek, Fireworks, Together)
  - Transcription services (Whisper.cpp, Deepgram, AssemblyAI)
- Local LLM support with Ollama
- Customizable prompts for generating titles, summaries, chapter titles/descriptions, key takeaways, and questions to test comprehension
- Markdown output with metadata and formatted content
- Command-line interface for easy usage
- WIP: Node.js server and React frontend
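The five workflow steps listed earlier can be pictured as a small pipeline. The following is a minimal sketch only: the type names, function names, and front matter fields are hypothetical and not taken from the AutoShow source, and transcription and the LLM call are passed in as plain functions so the example runs without any external service.

```typescript
// Hypothetical types and names for illustration; not the AutoShow API.
interface Metadata {
  title: string
  showLink: string
  publishDate: string
}

type Transcribe = (audioPath: string) => string
type RunLLM = (prompt: string) => string

// Step 1: build YAML-style front matter from the content's metadata.
// JSON.stringify yields double-quoted strings, which are also valid YAML.
function buildFrontMatter(meta: Metadata): string {
  return [
    '---',
    `title: ${JSON.stringify(meta.title)}`,
    `showLink: ${JSON.stringify(meta.showLink)}`,
    `publishDate: ${JSON.stringify(meta.publishDate)}`,
    '---',
  ].join('\n')
}

// Step 4: combine the prompt instructions and the transcript into one prompt.
function buildPrompt(instructions: string, transcript: string): string {
  return `${instructions}\n\n## Transcript\n\n${transcript}`
}

// Steps 1-5 chained together. Downloading (step 2) is represented only by
// the audioPath argument, since it depends on external tools.
function generateShowNotes(
  meta: Metadata,
  audioPath: string,
  instructions: string,
  transcribe: Transcribe,
  runLLM: RunLLM,
): string {
  const frontMatter = buildFrontMatter(meta)
  const transcript = transcribe(audioPath) // step 3
  const prompt = buildPrompt(instructions, transcript)
  const showNotes = runLLM(prompt) // step 5
  return `${frontMatter}\n\n${showNotes}\n`
}
```

Passing the transcription and LLM functions as arguments mirrors the idea that either service can be swapped without changing the surrounding steps.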
scripts/setup.sh checks that a .env file exists, Node dependencies are installed, and the whisper.cpp repository is cloned and built. Run it with the setup script in package.json:
npm run setup
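To make the setup checks concrete, here is a rough TypeScript sketch of the same idea. It is not the contents of the setup script: the paths in the check list are assumptions, and the sketch only tests whether each path exists (it does not verify that whisper.cpp has been built). The `exists` predicate is injected so the function can run against a stub.

```typescript
// Hypothetical check list; the real setup script may test different paths.
interface SetupCheck {
  label: string
  path: string
}

const SETUP_CHECKS: SetupCheck[] = [
  { label: '.env file', path: '.env' },
  { label: 'Node dependencies', path: 'node_modules' },
  { label: 'whisper.cpp repository', path: 'whisper.cpp' },
]

// Returns the label of every check whose path is reported missing.
function findMissing(
  checks: SetupCheck[],
  exists: (path: string) => boolean,
): string[] {
  return checks
    .filter((check) => !exists(check.path))
    .map((check) => check.label)
}
```

In Node, `existsSync` from `node:fs` could be passed as the `exists` argument; an empty result would mean every check passed.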
Run on a single YouTube video.
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk"
Run on a YouTube playlist.
npm run as -- --playlist "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr"
Run on a list of arbitrary URLs.
npm run as -- --urls "content/example-urls.md"
Run on a local audio or video file.
npm run as -- --file "content/audio.mp3"
Run on a podcast RSS feed.
npm run as -- --rss "https://ajcwebdev.substack.com/feed"
Use a local LLM with Ollama.
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --ollama
Use third-party LLM providers.
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --chatgpt gpt-4o-mini
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude CLAUDE_3_5_SONNET
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --gemini GEMINI_1_5_PRO
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --fireworks
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --together
Example commands for all available CLI options can be found in docs/examples.md.
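The commands above follow one pattern: a single input flag with a value, optionally followed by an LLM flag that may or may not carry a model name. The sketch below shows one way that pattern could be parsed. It is a hand-rolled illustration, not the project's Commander.js setup; the flag names come from the example commands, while `parseAutoshowArgs` and `ParsedArgs` are hypothetical names.

```typescript
// Flag names come from the example commands; the parser itself is illustrative.
const INPUT_FLAGS = ['video', 'playlist', 'urls', 'file', 'rss'] as const
const LLM_FLAGS = ['ollama', 'chatgpt', 'claude', 'gemini', 'fireworks', 'together'] as const

type InputFlag = (typeof INPUT_FLAGS)[number]
type LLMFlag = (typeof LLM_FLAGS)[number]

interface ParsedArgs {
  input?: { type: InputFlag; value: string }
  llm?: { provider: LLMFlag; model?: string }
}

function parseAutoshowArgs(argv: string[]): ParsedArgs {
  const parsed: ParsedArgs = {}
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i]
    if (!arg.startsWith('--')) continue
    const name = arg.slice(2)

    // A following token that is not itself a flag is this flag's value.
    const next = argv[i + 1]
    const value = next !== undefined && !next.startsWith('--') ? next : undefined

    if ((INPUT_FLAGS as readonly string[]).includes(name)) {
      if (value === undefined) throw new Error(`--${name} requires a value`)
      parsed.input = { type: name as InputFlag, value }
    } else if ((LLM_FLAGS as readonly string[]).includes(name)) {
      const provider = name as LLMFlag
      // The model name is optional, as in `--ollama` or `--fireworks`.
      parsed.llm = value === undefined ? { provider } : { provider, model: value }
    }

    if (value !== undefined) i++ // skip the token consumed as a value
  }
  return parsed
}
```

Everything after the `--` in `npm run as -- ...` is forwarded to the script, so an array such as `['--video', '<url>', '--chatgpt', 'gpt-4o-mini']` is the kind of input this function expects.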
- Root-Level Files
  - tsconfig.json: TypeScript configuration file specifying compiler options.
  - railway.json: Configuration for Railway deployment.
  - package.json: Contains project dependencies, scripts, and metadata.
  - .env.example: Example environment variables file for configuration.
  - .dockerignore: Specifies files/folders ignored by Docker during builds.
- Database Schema (prisma)
  - prisma/schema.prisma: Defines the Prisma ORM schema for database structure and models.
- Shared Resources (shared)
  - shared/constants.ts: Globally shared constants across multiple modules.
- GitHub Setup and Docker Configuration (.github)
  - postgres-pgvector.Dockerfile: Dockerfile for PostgreSQL with PGVector extension.
  - Dockerfile: Main Dockerfile for containerizing the application.
  - docker-compose.yml: Docker Compose configuration for local development.
- Setup Scripts (.github/setup):
  - 00-cleanup.sh: Cleans previous build or setup environments.
  - 01-npm-and-env-vars.sh: Installs npm packages and sets environment variables.
  - 02-homebrew.sh: Installs dependencies using Homebrew.
  - 03-ollama.sh: Installs and configures Ollama.
  - 04-whisper.sh: Installs and configures Whisper transcription service.
  - setup.sh: Master script executing all setup scripts sequentially.
- Main Entry Points (src)
  - commander.ts: CLI setup using Commander.js.
  - db.ts: Initializes the database connection via Prisma.
  - fastify.ts: Sets up and configures the Fastify web server.
- Process Commands (src/process-commands)
  - file.ts: Processes local audio/video files.
  - video.ts: Processes single YouTube videos.
  - urls.ts: Processes videos listed in a URL file.
  - playlist.ts: Processes YouTube playlists.
  - channel.ts: Processes all videos from YouTube channels.
  - rss.ts: Processes podcast RSS feeds.
- Process Steps (src/process-steps)
  - 01-generate-markdown.ts: Creates initial markdown file with metadata.
  - 02-download-audio.ts: Downloads audio from YouTube videos.
  - 03-run-transcription.ts: Manages transcription processes.
  - 04-select-prompt.ts: Defines prompts for summarization and chapter creation.
  - 05-run-llm.ts: Runs language model processes based on prompts.
- Transcription Services (src/transcription)
  - whisper.ts: Implements transcription with Whisper.cpp.
  - deepgram.ts: Integration with Deepgram API for transcription.
  - assembly.ts: Integration with AssemblyAI API for transcription.
- Language Models (src/llms)
  - ollama.ts: Integration with local Ollama models.
  - chatgpt.ts: Integration with OpenAI's GPT models.
  - claude.ts: Integration with Anthropic's Claude models.
  - gemini.ts: Integration with Google's Gemini models.
  - fireworks.ts: Integration with Fireworks open-source models.
  - together.ts: Integration with Together open-source models.
  - deepseek.ts: Integration with DeepSeek AI models.
- Utility Files (src/utils)
  - create-clips.ts: Utility to create video/audio clips.
  - logging.ts: Reusable logging utilities using Chalk for colorized output.
  - types.ts: Commonly used TypeScript types.
- Command-specific Utilities (src/utils/command-utils)
  - channel-utils.ts: Helpers specific to YouTube channel processing.
  - rss-utils.ts: Helpers for RSS feed processing.
- Embeddings Utilities (src/utils/embeddings)
  - create-embed.ts: Functions for creating embeddings.
  - query-embed.ts: Functions for querying embeddings.
- Image Generation Utilities (src/utils/images)
  - black-forest-labs-generator.ts: Integration for image generation with Black Forest Labs.
  - dalle-generator.ts: Integration for OpenAI's DALL·E image generation.
  - stability-ai-generator.ts: Integration for Stability AI image generation.
  - combined-generator.ts: Combines multiple image generators.
  - utils.ts: Common image-related helper functions.
  - index.ts: Centralized exports for image utilities.
- Step-specific Utilities (src/utils/step-utils)
  - 01-markdown-utils.ts: Helpers for markdown generation step.
  - 02-save-audio.ts: Helpers for saving downloaded audio.
  - 03-transcription-utils.ts: Helpers for managing transcription outputs.
  - 04-prompts.ts: Helpers for managing and selecting prompts.
  - 05-llm-utils.ts: Helpers for interacting with language models.
- Validation Utilities (src/utils/validation)
  - cli.ts: CLI options validation and error handling.
  - requests.ts: Validation for incoming API requests.
  - retry.ts: Utility functions for retry logic and error handling.
- Web Frontend Configuration Files (web Module):
  - astro.config.ts: Configuration for Astro web application.
  - package.json: Dependencies and scripts for web frontend.
  - tsconfig.json: TypeScript configuration for web module.
- Web Source Files (web/src):
  - env.d.ts: Type declarations for environment variables.
  - site.config.ts: Site-specific configuration settings.
  - types.ts: Shared TypeScript types.
  - styles/global.css: Global CSS styles.
- Pages (web/src/pages):
  - index.astro: Homepage.
  - 404.astro: 404 error page.
  - show-notes/[id].astro: Dynamic pages for individual show notes.
- Layouts (web/src/layouts):
  - Base.astro: Base layout used across pages.
- Components (web/src/components):
  - BaseHead.astro: Common HTML head elements.
- App Components (web/src/components/app):
  - App.tsx
  - Form.tsx
  - ShowNote.tsx
  - ShowNotes.tsx
- Grouped Components (web/src/components/app/groups):
  - LLMService.tsx
  - ProcessType.tsx
  - Prompts.tsx
  - TranscriptionService.tsx
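The layout above keeps one module per transcription service and one per LLM provider, which suggests that each sits behind a shared interface and is looked up by name. The sketch below illustrates that pattern for transcription only. It is an assumption about the design, not code from the repository: the `TranscriptionService` interface, the `selectService` function, and the stub implementations are all hypothetical.

```typescript
// Hypothetical interface; the real modules may expose different signatures.
interface TranscriptionService {
  name: string
  transcribe(audioPath: string): string
}

// Stubs standing in for whisper.ts, deepgram.ts, and assembly.ts.
// They return a placeholder string instead of calling any real service.
function stubService(name: string): TranscriptionService {
  return {
    name,
    transcribe: (audioPath) => `[${name}] transcript of ${audioPath}`,
  }
}

// Registry keyed by service name. A Map avoids accidental hits on
// inherited object properties such as "toString".
const SERVICES = new Map<string, TranscriptionService>([
  ['whisper', stubService('whisper')],
  ['deepgram', stubService('deepgram')],
  ['assembly', stubService('assembly')],
])

// Look up a service by name, failing loudly on unknown names.
function selectService(name: string): TranscriptionService {
  const service = SERVICES.get(name)
  if (service === undefined) {
    const known = Array.from(SERVICES.keys()).join(', ')
    throw new Error(`Unknown transcription service "${name}". Expected one of: ${known}`)
  }
  return service
}
```

With this shape, adding a new service would mean adding one module and one registry entry, leaving the callers unchanged. The same approach could apply to the modules under src/llms.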
- ✨Hello beautiful human! ✨ Jenn Junod, host of Teach Jenn Tech & Shit2TalkAbout