Convert markdown articles to podcast episodes with automated publishing to RSS feed.
- Install dependencies:

  ```shell
  bundle install
  ```

- Configure environment variables (copy `.env.example` to `.env`):
  - Google Cloud credentials path
  - GCS bucket name and base URL
  - Podcast metadata (title, description, author, etc.)
  - `PODCAST_ID` (required) - see below
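A minimal `.env` might look like the following. The exact variable names other than `PODCAST_ID` are illustrative assumptions; check `.env.example` for the authoritative list:

```shell
# Service-account key path and GCS settings (variable names are illustrative)
GOOGLE_APPLICATION_CREDENTIALS=./credentials/service-account.json
GCS_BUCKET=my-podcast-bucket
GCS_BASE_URL=https://storage.googleapis.com/my-podcast-bucket

# Podcast metadata (variable names are illustrative)
PODCAST_TITLE="My Podcast"
PODCAST_DESCRIPTION="Articles read aloud"
PODCAST_AUTHOR="Author Name"

# Required - see below
PODCAST_ID=podcast_a1b2c3d4e5f6a7b8
```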
All podcast data is isolated by podcast_id. Each podcast has its own storage directory and feed.
Add to your .env:
```shell
# Generate a unique podcast ID (16 hex characters)
PODCAST_ID=podcast_$(openssl rand -hex 8)
```

Example: `PODCAST_ID=podcast_a1b2c3d4e5f6a7b8`
Note: This ID is permanent for your podcast. Keep it in .env and never change it after publishing episodes.
```
podcasts/{podcast_id}/
├── episodes/{episode_id}.mp3
├── feed.xml
├── manifest.json
└── staging/(unknown).md
```
Each podcast has its own feed URL:
```
https://storage.googleapis.com/{bucket}/podcasts/{podcast_id}/feed.xml
```
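Because everything is keyed by podcast ID, object paths and the feed URL can be derived from it directly. A minimal sketch (the module and method names here are hypothetical, not the project's actual API):

```ruby
# Builds GCS object paths and URLs for a given podcast.
# Sketch only - names are hypothetical, not the project's actual helpers.
module PodcastPaths
  module_function

  def episode_object(podcast_id, episode_id)
    "podcasts/#{podcast_id}/episodes/#{episode_id}.mp3"
  end

  def feed_object(podcast_id)
    "podcasts/#{podcast_id}/feed.xml"
  end

  def feed_url(bucket, podcast_id)
    "https://storage.googleapis.com/#{bucket}/#{feed_object(podcast_id)}"
  end
end

puts PodcastPaths.feed_url("my-bucket", "podcast_a1b2c3d4e5f6a7b8")
```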
Generate and publish a podcast episode locally:
```shell
ruby generate.rb input/article.md
```

This creates audio content, streams it to GCS, updates the episode manifest, and regenerates the RSS feed.
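The manifest-update step could look roughly like this. The JSON schema shown is an assumption for illustration, not the project's actual `manifest.json` format:

```ruby
require "json"
require "time"

# Appends an episode entry to a manifest file, creating it if absent.
# Sketch only - the {"episodes" => [...]} schema is assumed, not confirmed.
def update_manifest(path, episode)
  manifest = File.exist?(path) ? JSON.parse(File.read(path)) : { "episodes" => [] }
  manifest["episodes"] << episode.merge("published_at" => Time.now.utc.iso8601)
  File.write(path, JSON.pretty_generate(manifest))
  manifest
end

# Hypothetical usage:
# update_manifest("manifest.json", "id" => "ep-001", "title" => "Episode Title")
```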
Submit an episode for processing via the deployed API:
```shell
# Get identity token (from authorized service)
TOKEN=$(gcloud auth print-identity-token)

curl -X POST https://podcast-api-ns2hvyzzra-wm.a.run.app/publish \
  -H "Authorization: Bearer $TOKEN" \
  -F "podcast_id=podcast_abc123xyz" \
  -F "title=Episode Title" \
  -F "author=Author Name" \
  -F "description=Episode description" \
  -F "content=@input/article.md"
```

Current service URL (us-west3): https://podcast-api-ns2hvyzzra-wm.a.run.app
Response:
{"status":"success","message":"Episode submitted for processing"}The API processes episodes asynchronously via Cloud Tasks. Check Cloud Run logs to monitor processing status.
```shell
# Generate locally without publishing
ruby generate.rb --local-only input/article.md

# Use a different voice
ruby generate.rb -v en-US-Chirp3-HD-Galahad input/article.md

# See all options
ruby generate.rb --help
```

Input files should be markdown documents with YAML frontmatter at the top. Use the `/generate-input-md` slash command in Claude Code to create properly formatted input files.
All input markdown files must include YAML frontmatter with the following fields:
```yaml
---
title: "Your Episode Title"
description: "A brief description of the episode content"
author: "Author Name"
---
```

Required fields:

- `title`: The episode title (enclosed in quotes if it contains special characters)
- `description`: A short description of the episode (enclosed in quotes)
- `author`: The author's name (enclosed in quotes)
Example:
```yaml
---
title: "The New Calculus of AI-based Coding"
description: "An exploration of how AI-assisted development can achieve 10x productivity gains, and why succeeding at this scale requires fundamental changes to testing, deployment, and team coordination practices."
author: "Joe Magerramov"
---
```

After the frontmatter, include your markdown content. The system will strip markdown formatting (headers, bold, links, etc.) and convert it to plain text suitable for text-to-speech processing.
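Frontmatter extraction and markdown stripping along these lines could be sketched as follows. This is a simplified approximation for illustration, not the project's actual implementation:

```ruby
require "yaml"

FRONTMATTER = /\A---\s*\n(.*?)\n---\s*\n/m

# Splits a document into its YAML frontmatter and body, validating the
# required fields (sketch only).
def parse_input(source)
  match = source.match(FRONTMATTER) or raise ArgumentError, "missing YAML frontmatter"
  meta = YAML.safe_load(match[1])
  %w[title description author].each do |field|
    raise ArgumentError, "missing required field: #{field}" unless meta[field]
  end
  [meta, match.post_match]
end

# Strips common markdown syntax so the text reads naturally as speech
# (simplified - real markdown stripping handles many more cases).
def strip_markdown(text)
  text.gsub(/^#+\s*/, "")                   # headers
      .gsub(/\[([^\]]+)\]\([^)]*\)/, '\1')  # links -> link text
      .gsub(/(\*\*|__|\*|_|`)/, "")         # bold/italic/code markers
end
```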
- Text-to-Speech: Converts markdown to natural-sounding audio using Google Cloud TTS
- Podcast Publishing: Automatically publishes to RSS feed with iTunes tags
- Episode Management: Tracks all episodes in a manifest with metadata
- Cloud Storage: Uploads audio files to Google Cloud Storage
- Frontmatter Support: Extracts title, description, and author from YAML frontmatter
- Chunking & Concurrency: Handles long documents with parallel processing
- Error Handling: Automatic retry for rate limits, timeouts, and content filters
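The chunking step can be sketched as below. Google Cloud TTS enforces a per-request input size limit (5,000 bytes for standard `synthesize` requests), so the limit used here is representative; the sentence-boundary splitting is a simplification of whatever the project actually does:

```ruby
# Splits text into chunks no larger than max_bytes, preferring sentence
# boundaries. Sketch only - a single sentence longer than max_bytes is
# emitted oversized rather than split further.
def chunk_text(text, max_bytes: 4500)
  chunks = []
  current = +""
  text.split(/(?<=[.!?])\s+/).each do |sentence|
    if !current.empty? && current.bytesize + sentence.bytesize + 1 > max_bytes
      chunks << current
      current = +""
    end
    current << " " unless current.empty?
    current << sentence
  end
  chunks << current unless current.empty?
  chunks
end
```

Each chunk can then be synthesized concurrently (for example, one thread per chunk) and the resulting MP3 segments concatenated in their original order.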
Run all tests:

```shell
rake test
```

Run the RuboCop linter:

```shell
rake rubocop
```

The API is deployed to Google Cloud Run for asynchronous episode processing.

```shell
./bin/deploy
```

See docs/deployment.md for detailed deployment instructions, an architecture overview, and a troubleshooting guide.
Symptom: A failed episode keeps retrying for up to an hour instead of stopping after 3 attempts.
Root Cause: Cloud Tasks maxAttempts and maxRetryDuration interact unexpectedly. When both are set, retries continue until BOTH conditions are satisfied:
- 3 attempts have been made AND
- The retry duration (e.g., 1 hour) has elapsed
Since failed tasks complete quickly, 3 attempts happen in seconds, but the duration hasn't elapsed - so retries continue with exponential backoff.
Fix: Remove maxRetryDuration so only maxAttempts limits retries:
```shell
gcloud tasks queues update episode-processing \
  --location=us-west3 \
  --max-attempts=3 \
  --max-retry-duration=0s
```

Verify the fix:

```shell
gcloud tasks queues describe episode-processing --location=us-west3
```

The output should show `maxAttempts: 3` without `maxRetryDuration`.
View recent processing logs:
```shell
gcloud logging read 'resource.type="cloud_run_revision"' --limit=50 \
  --format="table(timestamp,textPayload)"
```

Filter by episode:

```shell
gcloud logging read 'resource.type="cloud_run_revision" AND textPayload=~"episode_id=123"' \
  --limit=100 --format="table(timestamp,textPayload)"
```

Count processing attempts for an episode:

```shell
gcloud logging read 'resource.type="cloud_run_revision" AND textPayload=~"processing_started.*episode_id=123"' \
  --limit=50 --format="value(timestamp)" | wc -l
```

- Episode duration not tracked: The `duration_seconds` field in the Episode model is not populated. Would require MP3 parsing (e.g., the `mp3info` gem) to extract duration from generated audio files.
- No retry on callback failure: If Generator fails to notify Hub of completion/failure, the episode stays in "processing" state. No automatic retry mechanism.
- No processing timeout: Episodes stuck in "processing" state are not automatically marked as failed.
- Single podcast per user: Hub currently assumes one podcast per user. Multi-podcast support would require UI changes.
MIT