βš™οΈ Configuration Guide

Complete configuration reference for all Discogsography services

🏠 Back to Main | 📚 Documentation Index | 🚀 Quick Start

Overview

Discogsography uses environment variables for all configuration. This approach provides flexibility for different deployment environments (development, staging, production) without code changes.

Configuration Methods

1. Environment File (.env) — Development

The recommended approach for local development is a .env file:

# Copy the example file
cp .env.example .env

# Edit with your settings
nano .env

Production: Do not use .env files with real credentials in production. Use Docker Compose runtime secrets instead — see Production Secrets below.

2. Direct Environment Variables — Development

Export variables in your shell:

export RABBITMQ_HOST="localhost"
export RABBITMQ_USERNAME="discogsography"
export RABBITMQ_PASSWORD="discogsography"
export NEO4J_HOST="localhost"
# ... other variables

3. Docker Compose Runtime Secrets — Production

In production, credentials are mounted as in-memory tmpfs files via docker-compose.prod.yml. Secret values are never visible in docker inspect, never written to disk on the host, and discarded when the container stops.

# Generate secrets once (idempotent)
bash scripts/create-secrets.sh

# Start with production overlay
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d

See Production Secrets below and Docker Security for full details.

4. Docker Compose Override

Override non-secret settings in docker-compose.yml or docker-compose.override.yml:

services:
  dashboard:
    environment:
      - LOG_LEVEL=DEBUG
      - REDIS_HOST=redis

Core Settings

RabbitMQ Configuration

RabbitMQ connections are configured using individual component variables.

Variable Description Default Required
RABBITMQ_HOST RabbitMQ hostname rabbitmq No
RABBITMQ_PORT RabbitMQ AMQP port 5672 No
RABBITMQ_USERNAME RabbitMQ username discogsography No
RABBITMQ_PASSWORD RabbitMQ password discogsography No

Used By: Extractor, Graphinator, Tableinator, Brainzgraphinator, Brainztableinator, Dashboard

Secret convention: RABBITMQ_USERNAME_FILE / RABBITMQ_PASSWORD_FILE paths are supported for Docker Compose runtime secrets.

Examples:

# Local development
RABBITMQ_HOST=localhost
RABBITMQ_USERNAME=discogsography
RABBITMQ_PASSWORD=discogsography

# Docker Compose (internal network — these are the defaults)
RABBITMQ_HOST=rabbitmq

# Remote broker with custom credentials
RABBITMQ_HOST=rabbitmq.example.com
RABBITMQ_USERNAME=myuser
RABBITMQ_PASSWORD=mypassword

Connection Properties:

  • Automatic reconnection on failure
  • Heartbeat: 60 seconds
  • Connection timeout: 30 seconds
  • Prefetch count: 100 (configurable per service)
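
The component variables are typically assembled into a single AMQP URL. A hedged sketch (the query-parameter names `heartbeat` and `connection_timeout` are illustrative; the services may pass these options programmatically instead) that also percent-encodes credentials so special characters survive URL parsing:

```python
from urllib.parse import quote


def amqp_url(host: str, port: int = 5672, username: str = "discogsography",
             password: str = "discogsography", heartbeat: int = 60,
             timeout: int = 30) -> str:
    """Assemble an AMQP URL from the component variables."""
    # Percent-encode credentials so characters like @ / : do not break parsing.
    user, pwd = quote(username, safe=""), quote(password, safe="")
    return (f"amqp://{user}:{pwd}@{host}:{port}/"
            f"?heartbeat={heartbeat}&connection_timeout={timeout}")
```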

Data Storage

Variable Description Default Required
DISCOGS_ROOT Data storage path /discogs-data Yes
PERIODIC_CHECK_DAYS Update check interval 15 No

Used By: Extractor

DISCOGS_ROOT Details:

  • Must be writable by service user (UID 1000 in Docker)
  • Requires ~76GB free space for full dataset
  • Contains downloaded XML files and metadata cache

Directory Structure:

/discogs-data/
├── artists/
│   ├── discogs_20250115_artists.xml.gz
│   └── .metadata_artists.json
├── labels/
│   ├── discogs_20250115_labels.xml.gz
│   └── .metadata_labels.json
├── releases/
│   ├── discogs_20250115_releases.xml.gz
│   └── .metadata_releases.json
└── masters/
    ├── discogs_20250115_masters.xml.gz
    └── .metadata_masters.json

PERIODIC_CHECK_DAYS:

  • How often to check for new data dumps
  • Set to 0 to disable automatic checks
  • Recommended: 15 (checks twice per month)

Database Connections

Neo4j Configuration

Variable Description Default Required
NEO4J_HOST Neo4j hostname localhost Yes
NEO4J_USERNAME Neo4j username neo4j Yes
NEO4J_PASSWORD Neo4j password (none) Yes

Used By: Graphinator, API, Schema-Init, Dashboard, Brainzgraphinator

Connection Details:

  • Protocol: Bolt (binary protocol)
  • Default port: 7687
  • Connection pool: 50 connections (max)
  • Retry logic: Exponential backoff (max 5 attempts)
  • Transaction timeout: 60 seconds
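
The retry behaviour described above (exponential backoff, max 5 attempts) can be sketched generically; this is an illustration, not the services' actual retry helper:

```python
import time


def with_retries(op, max_attempts: int = 5, base_delay: float = 1.0):
    """Call op(), sleeping base_delay * 2**attempt between failures (1s, 2s, 4s, ...)."""
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)
```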

Examples:

# Local development
NEO4J_HOST="localhost"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="password"

# Docker Compose
NEO4J_HOST="neo4j"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="discogsography"

# Neo4j Aura (cloud) — full URI supported
NEO4J_HOST="xxxxx.databases.neo4j.io"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="your-secure-password"

Security Notes:

  • Use strong passwords in production
  • Enable encryption with bolt+s:// or bolt+ssc://
  • Consider certificate validation for production
  • Rotate credentials regularly

PostgreSQL Configuration

Variable Description Default Required
POSTGRES_HOST PostgreSQL hostname localhost Yes
POSTGRES_USERNAME PostgreSQL username postgres Yes
POSTGRES_PASSWORD PostgreSQL password (none) Yes
POSTGRES_DATABASE Database name discogsography Yes

Used By: Tableinator, Dashboard, API, Insights, Brainztableinator, Schema-Init

Connection Details:

  • Protocol: PostgreSQL wire protocol
  • Default port: 5432 (mapped to 5433 in Docker)
  • Connection pool: 20 connections (max)
  • Retry logic: Exponential backoff (max 5 attempts)
  • Query timeout: 30 seconds

Examples:

# Local development
POSTGRES_HOST="localhost"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"

# Docker Compose
POSTGRES_HOST="postgres"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"

# Remote server
POSTGRES_HOST="db.example.com"
POSTGRES_USERNAME="app_user"
POSTGRES_PASSWORD="secure-password"
POSTGRES_DATABASE="discogsography_prod"

Performance Tuning:

# For services using asyncpg
POSTGRES_POOL_MIN=10
POSTGRES_POOL_MAX=20
POSTGRES_COMMAND_TIMEOUT=30

Redis Configuration

Variable Description Default Required
REDIS_HOST Redis hostname localhost Yes (Dashboard, API, Insights)

Used By: Dashboard, API, Insights

Connection Details:

  • Protocol: Redis protocol
  • Default port: 6379
  • Database: 0 (default)
  • Connection pool: Automatic
  • Retry logic: Exponential backoff

Examples:

# Local development
REDIS_HOST="localhost"

# Docker Compose
REDIS_HOST="redis"

Cache Configuration:

  • Default TTL: 3600 seconds (1 hour)
  • Max memory: 512MB (configurable in docker-compose.yml)
  • Eviction policy: allkeys-lru
  • Persistence: Disabled (cache only)

JWT Configuration

Variable Description Default Required
JWT_SECRET_KEY HMAC-SHA256 signing secret (none) Yes (API)
JWT_EXPIRE_MINUTES Token lifetime in minutes 30 No
DISCOGS_USER_AGENT User-Agent for Discogs API requests (none) Yes (API)

Used By: API

JWT Details:

  • Algorithm: HS256 (HMAC-SHA256)
  • Token format: Standard JWT (header.body.signature, base64url-encoded)
  • Payload claims: sub (user UUID), email, iat, exp
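
For reference, an HS256 token with exactly these claims can be minted with the standard library alone. This is a sketch for illustration; the API presumably uses a JWT library rather than hand-rolled code like this:

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def mint_token(secret: str, user_id: str, email: str, expire_minutes: int = 30) -> str:
    """Build header.payload.signature with the claims listed above."""
    now = int(time.time())
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(
        {"sub": user_id, "email": email, "iat": now, "exp": now + expire_minutes * 60}
    ).encode())
    sig = hmac.new(secret.encode(), f"{header}.{payload}".encode(), hashlib.sha256)
    return f"{header}.{payload}.{_b64url(sig.digest())}"
```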

Security Notes:

  • Use a cryptographically random secret of at least 32 bytes in production
  • Rotate the secret to invalidate all existing tokens
  • Never log or expose JWT_SECRET_KEY
  • In production, supply via JWT_SECRET_KEY_FILE pointing to a Docker secret file — see Production Secrets

Examples:

# Generate a secure random secret (Linux/macOS)
JWT_SECRET_KEY=$(openssl rand -hex 32)

# Set token lifetime
JWT_EXPIRE_MINUTES=30     # 30 minutes (default)
JWT_EXPIRE_MINUTES=60     # 1 hour
JWT_EXPIRE_MINUTES=1440   # 24 hours

# Discogs User-Agent (required for Discogs API)
DISCOGS_USER_AGENT="Discogsography/1.0 +https://github.com/SimplicityGuy/discogsography"

Logging Configuration

Variable Description Default Valid Values
LOG_LEVEL Logging verbosity INFO DEBUG, INFO, WARNING, ERROR, CRITICAL

Used By: All services

Log Level Behavior:

Level Output Use Case
DEBUG All logs + query details + internal state Development, debugging
INFO Normal operation logs Production (default)
WARNING Warnings and errors Production (minimal)
ERROR Errors only Production (alerts)
CRITICAL Critical errors only Production (severe)

Examples:

# Development - see everything
LOG_LEVEL=DEBUG

# Production - normal operation
LOG_LEVEL=INFO

# Production - minimal logging
LOG_LEVEL=WARNING

Debug Level Features:

  • Neo4j query logging with parameters
  • PostgreSQL query logging
  • RabbitMQ message details
  • Cache hit/miss statistics
  • Internal state transitions

See Logging Guide for detailed logging information.

Consumer Management

Variable Description Default Range
CONSUMER_CANCEL_DELAY Seconds before canceling idle consumers 300 (5 min) 60-3600
QUEUE_CHECK_INTERVAL Seconds between queue checks when idle 3600 (1 hr) 300-86400
STUCK_CHECK_INTERVAL Seconds between stuck-state checks 30 5-300

Used By: Graphinator, Tableinator, Brainzgraphinator, Brainztableinator

Purpose: Smart resource management for RabbitMQ connections

How It Works:

  1. Service processes messages from all queues
  2. When all queues are empty for CONSUMER_CANCEL_DELAY seconds:
    • Close RabbitMQ connections
    • Stop progress logging
    • Enter "waiting" mode
  3. Every QUEUE_CHECK_INTERVAL seconds:
    • Briefly connect to RabbitMQ
    • Check all queues for new messages
    • If messages found, restart consumers
  4. When new messages detected:
    • Reconnect to RabbitMQ
    • Resume all consumers
    • Resume progress logging
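
The decision at the heart of the steps above can be expressed as a small pure function. A sketch only; the real services track this state asynchronously:

```python
def consumer_state(idle_seconds: float, queues_empty: bool,
                   cancel_delay: float = 300.0) -> str:
    """Should the service keep consuming, or disconnect and wait?"""
    if not queues_empty:
        return "consuming"  # messages pending: stay connected
    if idle_seconds < cancel_delay:
        return "consuming"  # empty, but not idle for CONSUMER_CANCEL_DELAY yet
    return "waiting"        # disconnect; poll every QUEUE_CHECK_INTERVAL seconds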

Configuration Examples:

# Aggressive resource saving (testing)
CONSUMER_CANCEL_DELAY=60     # 1 minute
QUEUE_CHECK_INTERVAL=300     # 5 minutes

# Balanced (default)
CONSUMER_CANCEL_DELAY=300    # 5 minutes
QUEUE_CHECK_INTERVAL=3600    # 1 hour

# Conservative (always connected)
CONSUMER_CANCEL_DELAY=3600   # 1 hour
QUEUE_CHECK_INTERVAL=300     # 5 minutes (doesn't matter if rarely triggered)

See Consumer Cancellation for details.

Batch Processing Configuration

Variable Description Code Default Docker Compose Range
NEO4J_BATCH_MODE Enable batch processing for Neo4j writes true true true/false
NEO4J_BATCH_SIZE Records per batch for Neo4j 100 500 10-1000
NEO4J_BATCH_FLUSH_INTERVAL Seconds between automatic flushes 5.0 2.0 1.0-60.0
POSTGRES_BATCH_MODE Enable batch processing for PostgreSQL true true true/false
POSTGRES_BATCH_SIZE Records per batch for PostgreSQL 100 500 10-1000
POSTGRES_BATCH_FLUSH_INTERVAL Seconds between automatic flushes 5.0 2.0 1.0-60.0

Used By: Graphinator (Neo4j), Tableinator (PostgreSQL), Brainzgraphinator (Neo4j), Brainztableinator (PostgreSQL)

Purpose: Improve write performance by batching multiple database operations

How Batch Processing Works:

  1. Messages are collected into batches instead of being processed individually
  2. When batch reaches BATCH_SIZE or BATCH_FLUSH_INTERVAL expires:
    • All records in batch are written in a single database operation
    • Message acknowledgments are sent after successful write
  3. On service shutdown:
    • All pending batches are flushed automatically
    • No data loss occurs during graceful shutdown

Performance Impact:

  • Throughput: 3-5x improvement in write operations per second
  • Database Load: Fewer transactions, reduced connection overhead
  • Latency: Slight increase (up to BATCH_FLUSH_INTERVAL seconds)
  • Memory: Increased by approximately BATCH_SIZE * record_size

Configuration Examples:

# High throughput (recommended for initial data load)
NEO4J_BATCH_MODE=true
NEO4J_BATCH_SIZE=500
NEO4J_BATCH_FLUSH_INTERVAL=10.0

# Low latency (real-time updates)
NEO4J_BATCH_MODE=true
NEO4J_BATCH_SIZE=10
NEO4J_BATCH_FLUSH_INTERVAL=1.0

# Disabled (per-message processing)
NEO4J_BATCH_MODE=false
# BATCH_SIZE and FLUSH_INTERVAL ignored when disabled

Best Practices:

  • Initial Load: Use larger batch sizes (500-1000) for faster initial data loading
  • Real-time Updates: Use smaller batches (10-50) for lower latency
  • Memory Constrained: Reduce batch size if memory usage is a concern
  • High Throughput: Increase flush interval to accumulate more records
  • Testing: Disable batch mode to debug individual record processing

Monitoring:

Batch processing logs provide visibility into performance:

πŸš€ Batch processing enabled (batch_size=100, flush_interval=5.0)
πŸ“¦ Flushing batch for artists (size=100)
βœ… Batch flushed successfully (artists: 100 records in 0.45s)

See Performance Guide for detailed optimization strategies.

Dashboard Configuration

Variable Description Default Required
RABBITMQ_MANAGEMENT_USER RabbitMQ management API username discogsography No
RABBITMQ_MANAGEMENT_PASSWORD RabbitMQ management API password discogsography No
CORS_ORIGINS Comma-separated list of allowed CORS origins (none β€” disabled) No
CACHE_WARMING_ENABLED Pre-warm cache on startup true No
CACHE_WEBHOOK_SECRET Secret for cache invalidation webhooks (none β€” disabled) No
API_HOST API service hostname for admin proxy api No
API_PORT API service port for admin proxy 8004 No

Used By: Dashboard only (for RABBITMQ_MANAGEMENT_USER, CACHE_WARMING_ENABLED, CACHE_WEBHOOK_SECRET); CORS_ORIGINS is also supported by the API service β€” see the API section above. API_HOST / API_PORT are used by the admin proxy router to forward authenticated admin requests to the API service.

Notes:

  • RABBITMQ_MANAGEMENT_USER / RABBITMQ_MANAGEMENT_PASSWORD must match the credentials set in RabbitMQ for the management plugin. In production these are supplied via RABBITMQ_MANAGEMENT_USER_FILE / RABBITMQ_MANAGEMENT_PASSWORD_FILE (Docker secrets)
  • CORS_ORIGINS is optional; omit it to restrict cross-origin access
  • CACHE_WEBHOOK_SECRET enables an authenticated endpoint to invalidate cached queries
  • API_HOST / API_PORT configure the admin proxy β€” the dashboard forwards admin panel requests (login, extraction trigger, DLQ purge) to the API service at http://{API_HOST}:{API_PORT}

Python Version

Variable Description Default Used By
PYTHON_VERSION Python version for builds 3.13 Docker, CI/CD

Used By: Build systems, CI/CD pipelines

Purpose: Ensure consistent Python version across environments

Notes:

  • Minimum supported version: 3.13
  • All services tested on Python 3.13+
  • Older versions not supported

Service-Specific Settings

API

# Required
POSTGRES_HOST="localhost"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"
REDIS_HOST="localhost"
JWT_SECRET_KEY="your-secret-key-here"
DISCOGS_USER_AGENT="Discogsography/1.0 +https://github.com/SimplicityGuy/discogsography"

# Optional β€” HKDF master encryption key (derives OAuth + TOTP encryption keys)
# Required for TOTP 2FA. Without it, OAuth tokens are stored unencrypted and 2FA is disabled.
# Generate with: uv run python -c 'import base64, os; print(base64.urlsafe_b64encode(os.urandom(32)).decode())'
# In production, supply via ENCRYPTION_MASTER_KEY_FILE (Docker secret)
ENCRYPTION_MASTER_KEY="your-master-key-here"

# Optional β€” Brevo transactional email for password reset notifications
# When not set, password reset links are logged to stdout instead of emailed.
# Get your API key from https://app.brevo.com/settings/keys/api
# BREVO_API_KEY="your-brevo-api-key"
# BREVO_SENDER_EMAIL="noreply@yourdomain.com"   # Must be verified in Brevo
# BREVO_SENDER_NAME="Discogsography"

# Optional β€” CORS origins (comma-separated; omit to disable CORS)
CORS_ORIGINS="http://localhost:8003,http://localhost:8006"

# Optional β€” snapshot settings
SNAPSHOT_TTL_DAYS=28     # Snapshot expiry in days (default: 28)
SNAPSHOT_MAX_NODES=100   # Max nodes per snapshot (default: 100)

# Optional
JWT_EXPIRE_MINUTES=1440
LOG_LEVEL=INFO

Health check: http://localhost:8004/health (service), http://localhost:8005/health (health check port)

Notes: After startup, set Discogs app credentials using the discogs-setup CLI bundled in the API container:

docker exec <api-container> discogs-setup \
  --consumer-key YOUR_CONSUMER_KEY \
  --consumer-secret YOUR_CONSUMER_SECRET

# Verify (values are masked)
docker exec <api-container> discogs-setup --show

See the API README for full setup instructions.

Schema-Init

# Required
NEO4J_HOST="localhost"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="discogsography"
POSTGRES_HOST="localhost"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"

# Optional
LOG_LEVEL=INFO

Notes: Schema-init is a one-shot initializer β€” it exits 0 on success and 1 on failure. It has no health check port. In Docker Compose, all dependent services use condition: service_completed_successfully. Re-running on an already-initialized database is a no-op (all DDL uses IF NOT EXISTS).

Extractor

The extractor runs as two Docker services: extractor-discogs and extractor-musicbrainz.

# Required (Discogs mode)
DISCOGS_ROOT="/discogs-data"

# Required (MusicBrainz mode)
MUSICBRAINZ_ROOT="/musicbrainz-data"

# RabbitMQ
RABBITMQ_HOST=rabbitmq           # default: rabbitmq
RABBITMQ_USERNAME=discogsography # default: discogsography
RABBITMQ_PASSWORD=discogsography # default: discogsography

# Optional
PERIODIC_CHECK_DAYS=5            # Days between update checks (code default: 15, docker-compose: 5 Discogs / 3 MusicBrainz)
BATCH_SIZE=100                   # Records per AMQP publish batch (default: 100)
MAX_WORKERS=4                    # Concurrent processing workers (default: 4)
FORCE_REPROCESS=false            # Force reprocess even if already extracted (default: false)
LOG_LEVEL=INFO

Health check: http://localhost:8000/health (each extractor container exposes port 8000 internally)

Graphinator

# Required
NEO4J_HOST="localhost"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="discogsography"

# RabbitMQ
RABBITMQ_HOST=rabbitmq           # default: rabbitmq
RABBITMQ_USERNAME=discogsography # default: discogsography
RABBITMQ_PASSWORD=discogsography # default: discogsography

# Optional - Consumer Management
CONSUMER_CANCEL_DELAY=300
QUEUE_CHECK_INTERVAL=3600
STUCK_CHECK_INTERVAL=30    # Seconds between stuck-state checks

# Optional - Batch Processing (enabled by default)
NEO4J_BATCH_MODE=true
NEO4J_BATCH_SIZE=500
NEO4J_BATCH_FLUSH_INTERVAL=2.0

# Optional - Startup
STARTUP_DELAY=15                 # Seconds to wait before starting (default: 15)

# Optional - Logging
LOG_LEVEL=INFO

Health check: http://localhost:8001/health

Tableinator

# Required
POSTGRES_HOST="localhost"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"

# RabbitMQ
RABBITMQ_HOST=rabbitmq           # default: rabbitmq
RABBITMQ_USERNAME=discogsography # default: discogsography
RABBITMQ_PASSWORD=discogsography # default: discogsography

# Optional - Consumer Management
CONSUMER_CANCEL_DELAY=300
QUEUE_CHECK_INTERVAL=3600
STUCK_CHECK_INTERVAL=30    # Seconds between stuck-state checks

# Optional - Batch Processing (enabled by default)
POSTGRES_BATCH_MODE=true
POSTGRES_BATCH_SIZE=500
POSTGRES_BATCH_FLUSH_INTERVAL=2.0

# Optional - Startup
STARTUP_DELAY=20                 # Seconds to wait before starting (default: 20)

# Optional - Logging
LOG_LEVEL=INFO

Health check: http://localhost:8002/health

Explore

# Required
API_BASE_URL="http://api:8004"   # URL of the API service to proxy requests to

# Optional
CORS_ORIGINS="http://localhost:3000,http://localhost:8003"  # comma-separated origins
LOG_LEVEL=INFO

Health check: http://localhost:8007/health

Dashboard

# Required
NEO4J_HOST="localhost"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="discogsography"
POSTGRES_HOST="localhost"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"
REDIS_HOST="localhost"

# RabbitMQ
RABBITMQ_HOST=rabbitmq           # default: rabbitmq
RABBITMQ_USERNAME=discogsography # default: discogsography
RABBITMQ_PASSWORD=discogsography # default: discogsography

# Optional - RabbitMQ Management API access
RABBITMQ_MANAGEMENT_USER=discogsography
RABBITMQ_MANAGEMENT_PASSWORD=discogsography

# Optional - CORS
CORS_ORIGINS="http://localhost:8003,http://localhost:8006"  # comma-separated origins

# Optional - Cache
CACHE_WARMING_ENABLED=true         # Pre-warm cache on startup
CACHE_WEBHOOK_SECRET=              # Secret for cache invalidation webhooks

# Optional - Logging
LOG_LEVEL=INFO

Health check: http://localhost:8003/health

Insights

# Required
API_BASE_URL="http://api:8004"       # URL of the API service (fetches raw query data over HTTP)
POSTGRES_HOST="localhost"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"

# Optional - Redis Caching
REDIS_HOST="localhost"                        # Redis hostname for result caching (default: localhost)

# Optional - Scheduler
INSIGHTS_SCHEDULE_HOURS=24                    # Computation interval in hours (default: 24)

# Optional - Milestones
INSIGHTS_MILESTONE_YEARS="25,30,40,50,75,100" # Anniversary years to highlight (default: 25,30,40,50,75,100)

# Optional - Startup
STARTUP_DELAY=10                              # Seconds to wait before starting (default: 10)

# Optional - Logging
LOG_LEVEL=INFO

Health check: http://localhost:8009/health

Notes: The Insights service uses Redis for caching computed results (cache-aside pattern). The cache TTL matches the INSIGHTS_SCHEDULE_HOURS interval and is invalidated after each computation run. If Redis is unavailable, the service operates without caching.

Brainzgraphinator

# Required
NEO4J_HOST="localhost"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="discogsography"

# RabbitMQ
RABBITMQ_HOST=rabbitmq           # default: rabbitmq
RABBITMQ_USERNAME=discogsography # default: discogsography
RABBITMQ_PASSWORD=discogsography # default: discogsography

# Optional - Consumer Management
CONSUMER_CANCEL_DELAY=300
QUEUE_CHECK_INTERVAL=3600

# Optional - Batch Processing (enabled by default)
NEO4J_BATCH_MODE=true
NEO4J_BATCH_SIZE=500
NEO4J_BATCH_FLUSH_INTERVAL=2.0

# Optional - Startup
STARTUP_DELAY=20                 # Seconds to wait before starting (default: 20)

# Optional - Logging
LOG_LEVEL=INFO

Health check: http://localhost:8011/health

Notes: Brainzgraphinator enriches existing Neo4j nodes with MusicBrainz metadata (properties, relationships, cross-references). It skips entities without Discogs matches. Consumes from the musicbrainz-{artists,labels,release-groups,releases} exchanges.

Brainztableinator

# Required
POSTGRES_HOST="localhost"
POSTGRES_USERNAME="discogsography"
POSTGRES_PASSWORD="discogsography"
POSTGRES_DATABASE="discogsography"

# RabbitMQ
RABBITMQ_HOST=rabbitmq           # default: rabbitmq
RABBITMQ_USERNAME=discogsography # default: discogsography
RABBITMQ_PASSWORD=discogsography # default: discogsography

# Optional - Consumer Management
CONSUMER_CANCEL_DELAY=300
QUEUE_CHECK_INTERVAL=3600

# Optional - Batch Processing (enabled by default)
POSTGRES_BATCH_MODE=true
POSTGRES_BATCH_SIZE=500
POSTGRES_BATCH_FLUSH_INTERVAL=2.0

# Optional - Startup
STARTUP_DELAY=25                 # Seconds to wait before starting (default: 25)

# Optional - Logging
LOG_LEVEL=INFO

Health check: http://localhost:8010/health

Notes: Brainztableinator stores all MusicBrainz data in the musicbrainz PostgreSQL schema β€” including entities without Discogs matches β€” with relationships and external links. Consumes from the musicbrainz-{artists,labels,release-groups,releases} exchanges.

MCP Server

# Required
API_BASE_URL="http://api:8004"   # Base URL for the Discogsography API

Notes: The MCP server has no direct database dependencies β€” all data is fetched via the API service over HTTP. It supports stdio (default, for local use with Claude Desktop/Cursor/Zed) and streamable-http (for hosted deployments) transports.

Environment Templates

Development (.env.development)

# RabbitMQ (built from components)
RABBITMQ_HOST=localhost
RABBITMQ_USERNAME=discogsography
RABBITMQ_PASSWORD=discogsography

# Neo4j
NEO4J_HOST=localhost
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=development

# PostgreSQL
POSTGRES_HOST=localhost
POSTGRES_USERNAME=postgres
POSTGRES_PASSWORD=development
POSTGRES_DATABASE=discogsography_dev

# Redis
REDIS_HOST=localhost

# JWT (API)
JWT_SECRET_KEY=dev-secret-key-not-for-production
JWT_EXPIRE_MINUTES=1440
DISCOGS_USER_AGENT="Discogsography/1.0-dev +https://github.com/SimplicityGuy/discogsography"

# Data
DISCOGS_ROOT=/tmp/discogs-data-dev
PERIODIC_CHECK_DAYS=0

# Logging
LOG_LEVEL=DEBUG

# Consumer Management (aggressive for testing)
CONSUMER_CANCEL_DELAY=60
QUEUE_CHECK_INTERVAL=300

Production Secrets

In production, all sensitive credentials are delivered as Docker Compose runtime secrets β€” never as plain environment variables. The docker-compose.prod.yml overlay wires everything up automatically.

Step 1 β€” Bootstrap secrets (run once; safe to re-run, skips existing files):

bash scripts/create-secrets.sh

This creates secrets/ with these files (all chmod 600, directory chmod 700):

secrets/
β”œβ”€β”€ brevo_api_key.txt         # Optional β€” Brevo API key
β”œβ”€β”€ encryption_master_key.txt # base64(urandom(32))
β”œβ”€β”€ jwt_secret_key.txt        # openssl rand -hex 32
β”œβ”€β”€ neo4j_password.txt        # openssl rand -base64 24
β”œβ”€β”€ postgres_password.txt     # openssl rand -base64 24
β”œβ”€β”€ postgres_username.txt         # discogsography
β”œβ”€β”€ rabbitmq_password.txt     # openssl rand -base64 24
└── rabbitmq_username.txt         # discogsography

See secrets.example/ for reference placeholders and generation commands.

Step 2 β€” Set non-secret production environment (safe to commit, no credentials):

# RabbitMQ (hostname only β€” credentials come from Docker secrets)
RABBITMQ_HOST=rabbitmq.prod.internal

# Neo4j
NEO4J_HOST=neo4j.prod.internal

# PostgreSQL
POSTGRES_HOST=postgres.prod.internal
POSTGRES_DATABASE=discogsography

# Redis
REDIS_HOST=redis

# JWT (optional non-secret settings)
JWT_EXPIRE_MINUTES=1440
DISCOGS_USER_AGENT="Discogsography/1.0 +https://github.com/SimplicityGuy/discogsography"

# Data
DISCOGS_ROOT=/mnt/data/discogs
PERIODIC_CHECK_DAYS=15

# Logging
LOG_LEVEL=INFO

# Consumer Management
CONSUMER_CANCEL_DELAY=300
QUEUE_CHECK_INTERVAL=3600

Step 3 β€” Start with the production overlay:

docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d

Credentials are mounted at /run/secrets/<name> inside each container and read automatically. See Docker Security for the full secrets table and Neo4j entrypoint details.

Security Best Practices

Password Management

Never commit credentials:

# ❌ BAD β€” hardcoded password in any committed file
NEO4J_PASSWORD=supersecret

# βœ… GOOD β€” Docker Compose runtime secret (production)
# secrets/neo4j_password.txt contains the value, mounted at /run/secrets/neo4j_password
# docker-compose.prod.yml wires it up automatically via get_secret()

For production deployments, use docker-compose.prod.yml with scripts/create-secrets.sh. For other platforms:

  • Kubernetes: Kubernetes Secrets or an external secrets operator
  • HashiCorp Vault: Vault Agent Injector or the Vault CSI provider
  • AWS: Secrets Manager with the AWS Secrets and Configuration Provider
  • Azure: Azure Key Vault with the CSI Secrets Store driver

Connection Encryption

Enable encryption for production:

# Neo4j with TLS (bolt+s:// scheme constructed in code from hostname)
NEO4J_HOST=neo4j.example.com

# PostgreSQL hostname
POSTGRES_HOST=postgres.example.com

# Redis with TLS
REDIS_HOST=rediss://:password@redis.example.com:6380/0

Access Control

Use least-privilege principles:

# ❌ BAD - using admin credentials
NEO4J_USERNAME=admin
POSTGRES_USERNAME=postgres

# βœ… GOOD - dedicated service accounts
NEO4J_USERNAME=discogsography_app
POSTGRES_USERNAME=discogsography_app

Validation and Testing

Health Checks

Verify all services are configured correctly:

# Check all health endpoints
curl http://localhost:8000/health  # Extractor
curl http://localhost:8001/health  # Graphinator
curl http://localhost:8002/health  # Tableinator
curl http://localhost:8003/health  # Dashboard
curl http://localhost:8005/health  # API (health check port)
curl http://localhost:8007/health  # Explore
curl http://localhost:8009/health  # Insights
curl http://localhost:8010/health  # Brainztableinator
curl http://localhost:8011/health  # Brainzgraphinator

Expected response for all:

{"status": "healthy"}

Troubleshooting

Common Configuration Issues

Connection Refused Errors:

  • Check host and port are correct
  • Verify service is running
  • Check firewall rules
  • Wait for service startup (databases can take 30-60s)

Authentication Failures:

  • Verify username and password
  • Check password special characters are properly escaped
  • Ensure credentials match database configuration

Cache Directory Errors:

  • Verify directory exists
  • Check write permissions (UID 1000 for Docker)
  • Ensure sufficient disk space

Environment Variables Not Loading:

  • Check .env file is in correct location
  • Verify no syntax errors in .env
  • Restart services after changes
  • For Docker: rebuild images if needed

See Troubleshooting Guide for more solutions.

Related Documentation


Last Updated: 2026-04-03