A Python script to bulk import recipes from a list of URLs into your Tandoor Recipes instance.
- Bulk import recipes from a text file containing URLs
- Duplicate detection to avoid importing existing recipes
- Rate limiting handling with automatic retry
- URL validation to skip non-recipe links
- Detailed progress reporting and statistics
- Configurable delay between imports
- Two-step import process (scrape then create) for reliability
- Python 3.6+
- requests library
- A running Tandoor Recipes instance
- Valid Tandoor API token
- Clone or download this repository
- Install required dependencies:
Option 1: Using your system package manager (recommended)
# Ubuntu/Debian
sudo apt install python3-requests
# Fedora/RHEL
sudo dnf install python3-requests
# Arch Linux
sudo pacman -S python-requests
Option 2: Using pip in a virtual environment
python3 -m venv venv
source venv/bin/activate
pip install requests
Option 3: Using pip globally (not recommended on modern systems)
pip install requests
- Copy the example configuration file and update it with your settings:
cp config.conf.example config.conf
- Edit config.conf with your settings:
[tandoor]
# Your Tandoor server URL (without trailing slash)
url = https://your-tandoor-instance.com
# Your Tandoor API token (found in Tandoor settings)
api_token = your_api_token_here
[import]
# Delay between recipe imports (in seconds)
delay_between_requests = 30
- To get your API token:
- Log into your Tandoor instance
- Go to Settings → API
- Generate or copy your existing API token
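As a rough illustration of how a configuration file in this format can be read and validated, here is a minimal sketch using Python's standard configparser module. The function name load_settings and the returned dictionary keys are hypothetical and are not taken from tandoor_importer.py; the section and option names come from the example configuration above.

```python
import configparser

# Placeholder value from config.conf.example; treated as "not configured".
PLACEHOLDER_TOKEN = "your_api_token_here"


def load_settings(text):
    """Parse config text and return the settings as a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    # The URL is documented as having no trailing slash, so strip one if present.
    url = parser.get("tandoor", "url").rstrip("/")
    token = parser.get("tandoor", "api_token")
    # Fall back to the documented default of 30 seconds if the option is missing.
    delay = parser.getint("import", "delay_between_requests", fallback=30)

    if not token or token == PLACEHOLDER_TOKEN:
        raise ValueError("Please configure your API token")
    return {"url": url, "api_token": token, "delay": delay}
```

To read a real file, pass its contents, for example load_settings(open("config.conf").read()).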
Create a text file (e.g., url-list.txt) with one recipe URL per line:
https://www.allrecipes.com/recipe/123/example-recipe/
https://www.foodnetwork.com/recipes/another-recipe
https://www.kingarthurbaking.com/recipes/bread-recipe
Basic usage:
python3 tandoor_importer.py url-list.txt
Advanced usage with options:
# Start from a specific line (0-indexed)
python3 tandoor_importer.py url-list.txt 100
# Limit number of imports
python3 tandoor_importer.py url-list.txt 0 50
# Start from line 100 and import max 25 recipes
python3 tandoor_importer.py url-list.txt 100 25
- url_file - Path to text file containing recipe URLs (required)
- start_index - Line number to start from (optional, default: 0)
- max_imports - Maximum number of recipes to import (optional, default: all)
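The effect of start_index and max_imports can be pictured as a slice over the lines of the URL file. The sketch below is an assumption about the behavior, not the script's actual code: the helper name select_urls is hypothetical, and skipping blank lines after applying the start index is a choice made here for illustration.

```python
def select_urls(lines, start_index=0, max_imports=None):
    """Return the URLs to import from a list of file lines (hypothetical helper).

    start_index is a 0-based line number; max_imports=None means "no limit".
    """
    # Apply the 0-indexed starting line first, then drop blank lines.
    candidates = [line.strip() for line in lines[start_index:]]
    urls = [url for url in candidates if url]
    # Cap the number of URLs only when a limit was given.
    if max_imports is not None:
        urls = urls[:max_imports]
    return urls
```

With this sketch, select_urls(lines, 100, 25) corresponds to the last command above: start at line 100 and keep at most 25 URLs.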
The script automatically filters out:
- Image files (.jpg, .png, etc.)
- Video files (.mp4, .mov, etc.)
- Document files (.pdf, .doc, etc.)
- Social media direct links
- Other non-recipe content
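One way such a filter could work is to inspect the URL's scheme, host, and file extension. The following is a minimal sketch under stated assumptions: the extension set and the social media host list are illustrative guesses, and looks_like_recipe_url is a hypothetical name, not a function from the script.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative lists only; the script's real lists may differ.
SKIPPED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".mp4", ".mov", ".pdf", ".doc", ".docx"}
SOCIAL_HOSTS = {"facebook.com", "instagram.com", "twitter.com"}


def looks_like_recipe_url(url):
    """Return False for URLs that are clearly not recipe pages (hypothetical helper)."""
    parsed = urlparse(url.strip())
    # Only absolute http(s) URLs are candidates.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # hostname is already lowercased by urlparse; drop a leading "www.".
    host = parsed.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    if host in SOCIAL_HOSTS:
        return False
    # Reject image, video, and document files by extension.
    extension = posixpath.splitext(parsed.path)[1].lower()
    return extension not in SKIPPED_EXTENSIONS
```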
- Fetches existing recipes from your Tandoor instance
- Compares source URLs to avoid importing duplicates
- Shows count of skipped duplicates in progress report
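The comparison of source URLs can be sketched as a set lookup. This example assumes the source URLs of existing recipes have already been fetched into a list; how the script retrieves them is not shown here. The helper names and the normalization rule (trimming whitespace and a trailing slash) are assumptions made for illustration.

```python
def normalize_url(url):
    """Trim whitespace and a trailing slash so near-identical URLs compare equal."""
    return url.strip().rstrip("/")


def split_duplicates(candidate_urls, existing_source_urls):
    """Split candidates into (new, duplicates) against already imported URLs."""
    # Existing recipes may have no source URL, so skip empty values.
    seen = {normalize_url(url) for url in existing_source_urls if url}
    new, duplicates = [], []
    for url in candidate_urls:
        key = normalize_url(url)
        if key in seen:
            duplicates.append(url)
        else:
            new.append(url)
            # Remember it so a repeat later in the same file is also skipped.
            seen.add(key)
    return new, duplicates
```

The length of the duplicates list is the figure a progress report would show as skipped duplicates.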
- Respects Tandoor's rate limits
- Automatically waits and retries when rate limited
- Configurable delay between requests (default: 30 seconds)
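A wait-and-retry loop of this kind might look like the sketch below. It assumes the server signals rate limiting with HTTP status 429 and may include a Retry-After header in seconds; whether Tandoor sends that header is an assumption. The names call_with_retry and parse_retry_after are hypothetical, and the request itself is passed in as a callable so the logic can be tried without a network.

```python
import time


def parse_retry_after(value, default_wait):
    """Return the wait in seconds from a Retry-After value, or the default."""
    try:
        return max(0, int(value))
    except (TypeError, ValueError):
        # Missing header or an HTTP-date form: fall back to the default delay.
        return default_wait


def call_with_retry(send, max_retries=3, default_wait=30, sleep=time.sleep):
    """Call send() until it is not rate limited, waiting between attempts.

    send() must return a (status_code, headers, body) tuple.
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            sleep(parse_retry_after(headers.get("Retry-After"), default_wait))
    raise RuntimeError("still rate limited after %d retries" % max_retries)
```

With the requests library, send could be a small function that performs the call and returns (response.status_code, response.headers, response.text).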
The script handles various error scenarios:
- Connection timeouts
- Invalid URLs
- Non-recipe pages
- Server errors
- Rate limiting
Real-time statistics showing:
- Import progress (current/total)
- Success rate percentage
- Breakdown of results (successful, duplicates, errors, etc.)
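A progress line with these three pieces of information could be assembled from a counter of outcomes. This is only a sketch: the outcome labels, the output format, and the definition of success rate (successful imports divided by URLs processed so far) are assumptions, and progress_line is a hypothetical name.

```python
from collections import Counter


def progress_line(counts, total):
    """Format current/total, success rate, and a per-outcome breakdown."""
    processed = sum(counts.values())
    # Avoid dividing by zero before the first URL has been processed.
    rate = 100.0 * counts["success"] / processed if processed else 0.0
    breakdown = ", ".join(
        "{}: {}".format(name, n) for name, n in sorted(counts.items())
    )
    return "[{}/{}] success rate {:.1f}% ({})".format(processed, total, rate, breakdown)


counts = Counter(["success", "success", "duplicate", "error"])
print(progress_line(counts, 10))
# prints: [4/10] success rate 50.0% (duplicate: 1, error: 1, success: 2)
```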
The script provides detailed console output including:
- Configuration validation
- URL filtering results
- Real-time import progress
- Detailed error messages
- Final statistics summary
- "Configuration file not found"
  - Ensure config.conf exists in the same directory as the script
  - Check file permissions
- "Please configure your API token"
  - Update config.conf with your actual Tandoor API token
  - Verify the token is valid in your Tandoor instance
- Rate limiting errors
  - Increase delay_between_requests in config.conf
  - The script handles rate limiting automatically, but longer delays make rate limiting less likely
- Connection errors
  - Verify your Tandoor URL is correct and accessible
  - Check network connectivity
  - Ensure Tandoor instance is running
- Check Tandoor's API documentation
- Verify your Tandoor instance is up to date
- Test API token with a simple curl request:
curl -H "Authorization: Bearer YOUR_TOKEN" "https://your-tandoor-instance.com/api/recipe/?page_size=1"
The project includes a comprehensive test suite with high code coverage:
# Run the comprehensive test suite
python test_tandoor_importer.py
# Run with coverage (if coverage is installed)
coverage run test_tandoor_importer.py
coverage report
coverage html # Generate HTML coverage report
The test suite covers:
- Configuration loading and validation - File parsing, URL validation, error handling
- URL validation logic - Recipe URL detection, invalid URL filtering
- File operations - Reading, encoding, size limits, permission handling
- Network operations - Retry logic, authentication, rate limiting
- Error handling - Custom exceptions, graceful failure modes
- Logging functionality - Console and file output
Tests are automatically run in GitHub Actions across Python 3.9-3.12 with:
- Syntax validation
- Code linting (ruff)
- Security scanning (bandit)
- Comprehensive test execution with coverage reporting
This project is licensed under the MIT License - see the LICENSE file for details.