🔧 Troubleshooting Guide

Common issues and solutions for Discogsography

Overview

This guide covers common issues you might encounter while using Discogsography and provides step-by-step solutions. For real-time monitoring and debugging tools, see the Monitoring Guide.

🚨 Common Issues & Solutions

❌ Extractor Download Failures

Symptoms:

  • Extractor fails to download data files
  • Connection timeout errors
  • Disk space errors
  • Permission denied errors

Diagnostic Steps:

# Check connectivity to Discogs S3
curl -I https://discogs-data-dumps.s3.us-west-2.amazonaws.com

# Verify disk space (need 76GB+)
df -h /discogs-data

# Check permissions
ls -la /discogs-data

# View extractor logs
docker-compose logs -f extractor-discogs extractor-musicbrainz

Solutions:

  1. ✅ Ensure internet connectivity

    # Test connection
    ping discogs-data-dumps.s3.us-west-2.amazonaws.com
  2. ✅ Verify 76GB+ free space

    # Check available space
    df -h /discogs-data
    
    # Clean up if needed (warning: --volumes can also delete Docker volumes
    # that no running container uses, including database data)
    docker system prune -a --volumes
  3. ✅ Check directory permissions

    # Fix permissions (Docker needs write access)
    sudo chown -R 1000:1000 /discogs-data
    sudo chmod -R 755 /discogs-data
  4. ✅ Verify environment variables

    # Check DISCOGS_ROOT is set correctly
    echo "$DISCOGS_ROOT"
    
    # Should point to writable directory
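
The disk-space and permission checks above can be combined into one preflight function. This is a minimal sketch, not part of the project: `check_data_dir` is a hypothetical helper name, the 76GB default comes from the requirement stated above, and the writability test reflects the host user running the script, which may differ from the user inside the container.

```bash
#!/usr/bin/env bash
# Hypothetical preflight check for the extractor's data directory.
# Usage: check_data_dir <directory> [min_free_gb]   (default: 76)
check_data_dir() {
  local dir="$1" min_gb="${2:-76}"
  local free_kb free_gb

  if [ ! -d "$dir" ]; then
    echo "FAIL: $dir does not exist"
    return 1
  fi
  # Tests write access for the current host user only.
  if [ ! -w "$dir" ]; then
    echo "FAIL: $dir is not writable by $(id -un)"
    return 1
  fi

  # df -P prints one POSIX-format line per filesystem;
  # with -k, column 4 is the available space in KB.
  free_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  free_gb=$((free_kb / 1024 / 1024))
  if [ "$free_gb" -lt "$min_gb" ]; then
    echo "FAIL: only ${free_gb}GB free in $dir, need ${min_gb}GB+"
    return 1
  fi
  echo "OK: $dir is writable with ${free_gb}GB free"
}

# Example: check_data_dir "${DISCOGS_ROOT:-/discogs-data}"
```

Running it before starting the extractor reports the first failing condition instead of leaving it to surface later in the download logs.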

❌ RabbitMQ Connection Issues

Symptoms:

  • Services can't connect to RabbitMQ
  • "Connection refused" errors
  • "Authentication failed" errors

Diagnostic Steps:

# Check RabbitMQ status
docker-compose ps rabbitmq
docker-compose logs rabbitmq

# Test connection
curl -u discogsography:discogsography http://localhost:15672/api/overview

# Check if port is accessible
netstat -an | grep 5672

Solutions:

  1. ✅ Wait for RabbitMQ startup (30-60s)

    # RabbitMQ takes time to start
    docker-compose logs -f rabbitmq | grep "started"
  2. ✅ Check firewall settings

    # Ensure ports 5672 and 15672 are not blocked
    # Linux (ufw)
    sudo ufw status
  3. ✅ Verify credentials in .env

    # Check RabbitMQ connection variables
    echo $RABBITMQ_HOST
    echo $RABBITMQ_USERNAME
    echo $RABBITMQ_PASSWORD
    
    # Should match RabbitMQ configuration
  4. ✅ Restart RabbitMQ

    docker-compose restart rabbitmq
    docker-compose logs -f rabbitmq
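
Because RabbitMQ can take 30-60 seconds to accept connections, a retry loop is often more reliable than watching the logs by hand. The sketch below is an illustration only: `wait_for` is a hypothetical helper, and the attempt count and delay are assumptions to tune for your machine.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds.
# Usage: wait_for <max_attempts> <delay_seconds> <command> [args...]
wait_for() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@" >/dev/null 2>&1; do
    if [ "$i" -ge "$attempts" ]; then
      echo "gave up after ${attempts} attempts: $*" >&2
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example (up to ~90s): poll the management API used in the diagnostics above.
# wait_for 18 5 curl -fsS -u discogsography:discogsography \
#   http://localhost:15672/api/overview
```

The same helper can wrap the Neo4j and PostgreSQL connection tests shown in the following sections.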

❌ Database Connection Errors

Neo4j Connection Issues

Symptoms:

  • "Failed to connect to Neo4j" errors
  • "Authentication failed" errors
  • Timeout errors

Diagnostic Steps:

# Check Neo4j status
docker-compose logs neo4j

# Test HTTP access
curl http://localhost:7474

# Test bolt connection
echo "MATCH (n) RETURN count(n);" | \
  cypher-shell -u neo4j -p discogsography

Solutions:

  1. ✅ Wait for Neo4j startup (30-60s)

    docker-compose logs -f neo4j | grep "Started"
  2. ✅ Verify credentials

    # Check environment variables
    echo $NEO4J_HOST
    echo $NEO4J_USERNAME
    echo $NEO4J_PASSWORD
  3. ✅ Check connection string

    # Should be bolt://host:7687
    # For Docker: bolt://neo4j:7687
    # For local: bolt://localhost:7687
  4. ✅ Restart Neo4j

    docker-compose restart neo4j

PostgreSQL Connection Issues

Symptoms:

  • "Could not connect to PostgreSQL" errors
  • "Authentication failed" errors
  • Connection timeout errors

Diagnostic Steps:

# Check PostgreSQL status
docker-compose logs postgres

# Test connection
PGPASSWORD=discogsography psql \
  -h localhost -p 5433 -U discogsography \
  -d discogsography -c "SELECT 1;"

Solutions:

  1. ✅ Wait for PostgreSQL startup

    docker-compose logs -f postgres | grep "ready"
  2. ✅ Verify credentials

    echo $POSTGRES_HOST
    echo $POSTGRES_USERNAME
    echo $POSTGRES_PASSWORD
    echo $POSTGRES_DATABASE
  3. ✅ Check port mapping

    # Default: 5433 (host) maps to 5432 (container)
    docker-compose ps postgres
  4. ✅ Restart PostgreSQL

    docker-compose restart postgres

❌ Port Conflicts

Symptoms:

  • "Port already in use" errors
  • Services fail to start
  • "Address already in use" errors

Diagnostic Steps:

# Check what's using the ports
netstat -an | grep -E "(5672|7474|7687|5433|6379|8003|8004|8005)"

# Or with lsof (macOS and Linux)
lsof -i :8004
lsof -i :7474

# List all Docker containers
docker ps -a

Solutions:

  1. ✅ Stop conflicting services

    # Find process using port
    lsof -i :8004
    
    # Stop the process (try a plain kill first; use kill -9 only if it does not exit)
    kill <PID>
  2. ✅ Change port mapping

    # Edit docker-compose.yml
    ports:
      - "8006:8005"  # Use 8006 on host instead
  3. ✅ Restart all Docker containers

    docker-compose down
    docker-compose up -d
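
When `netstat` or `lsof` is not installed, a quick probe can still show which ports are already taken before starting the stack. This is a rough sketch under stated assumptions: `port_in_use` and `list_busy_ports` are hypothetical helpers, the probe relies on bash's `/dev/tcp` redirection (not available in plain `sh`), and it only detects listeners reachable on 127.0.0.1.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; require bash for the /dev/tcp pseudo-device.

# Succeeds if something accepts TCP connections on 127.0.0.1:<port>.
port_in_use() {
  # The subshell opens the connection on fd 3 and closes it on exit.
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Prints each of the given ports that is currently in use.
list_busy_ports() {
  local port
  for port in "$@"; do
    if port_in_use "$port"; then
      echo "$port"
    fi
  done
}

# Example, using the ports from the diagnostic command above:
# list_busy_ports 5672 7474 7687 5433 6379 8003 8004 8005
```

An empty result with the stack stopped suggests none of the listed ports are held by another process.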

❌ Out of Memory / Disk Space

Symptoms:

  • Containers crash or are killed
  • "No space left on device" errors
  • Docker build failures

Diagnostic Steps:

# Check available disk space
df -h

# Check Docker disk usage
docker system df

# Check container resource usage
docker stats

Solutions:

  1. ✅ Increase Docker memory limits

    • Open Docker Desktop → Settings → Resources
    • Increase memory allocation (recommend 16GB+ for full dataset)
    • Restart Docker
  2. ✅ Clean up Docker resources

    # Remove stopped containers
    docker container prune
    
    # Remove unused images
    docker image prune -a
    
    # Remove unused volumes
    docker volume prune
    
    # Remove everything unused (warning: --volumes can also delete
    # database volumes that no running container uses)
    docker system prune -a --volumes
  3. ✅ Free up disk space

    # Find large files
    du -sh /path/to/data/* | sort -hr | head -10
    
    # Remove logs older than 7 days (may require sudo)
    find /var/log -type f -name "*.log" -mtime +7 -delete

❌ Permission Denied Errors

Symptoms:

  • Cannot write to volumes or log files
  • "Permission denied" errors in logs
  • Services fail to start with permission errors

Diagnostic Steps:

# Check file permissions
ls -la /discogs-data
ls -la logs/

# Check Docker user
docker run --rm alpine id

Solutions:

# Fix permissions on host directories
sudo chown -R 1000:1000 /discogs-data
sudo chown -R 1000:1000 logs/
sudo chmod -R 755 /discogs-data
sudo chmod -R 755 logs/
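
Before changing ownership recursively, it can help to confirm that ownership is actually the problem. The following sketch uses hypothetical helper names (`owner_uid`, `check_owner`) and assumes the containers run as UID 1000, as the `chown` commands above imply.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for inspecting directory ownership.

# Prints the numeric owner UID of a path (ls -n shows numeric IDs).
owner_uid() {
  ls -ldn "$1" | awk '{print $3}'
}

# Usage: check_owner <path> [expected_uid]   (default: 1000)
check_owner() {
  local path="$1" want="${2:-1000}" have
  if [ ! -e "$path" ]; then
    echo "MISSING: $path"
    return 1
  fi
  have=$(owner_uid "$path")
  if [ "$have" = "$want" ]; then
    echo "OK: $path is owned by uid $have"
  else
    echo "MISMATCH: $path is owned by uid $have, expected $want"
    return 1
  fi
}

# Example: check_owner /discogs-data && check_owner logs/
```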

🐛 Debugging Guide

Step 1: Check Service Health

All services expose health endpoints:

# Check each service
curl http://localhost:8000/health  # Extractor
curl http://localhost:8001/health  # Graphinator
curl http://localhost:8002/health  # Tableinator
curl http://localhost:8003/health  # Dashboard
curl http://localhost:8005/health  # API (health check port)
curl http://localhost:8009/health  # Insights
curl http://localhost:8010/health  # Brainztableinator
curl http://localhost:8011/health  # Brainzgraphinator
curl http://localhost:8007/health  # Explore (health check port)

Expected response:

{"status": "healthy"}

If unhealthy:

# View service logs
docker-compose logs [service_name]

# Restart service
docker-compose restart [service_name]
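
To check every endpoint in one pass, the individual `curl` calls can be wrapped in a loop. This is a sketch only: the function names are hypothetical, the labels are taken from the comments in the list above rather than from the compose file, and the 5-second timeout is an assumption.

```bash
#!/usr/bin/env bash
# Label:port pairs copied from the health endpoint list above.
SERVICES="extractor:8000 graphinator:8001 tableinator:8002 dashboard:8003 api:8005 explore:8007 insights:8009 brainztableinator:8010 brainzgraphinator:8011"

# Succeeds if a response body contains "status": "healthy".
is_healthy_response() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"healthy"'
}

# Probes each service and prints one line per result.
# Returns non-zero if any service is not healthy.
check_all_health() {
  local entry name port body failed=0
  for entry in $SERVICES; do
    name="${entry%%:*}"
    port="${entry##*:}"
    body=$(curl -fsS --max-time 5 "http://localhost:${port}/health" 2>/dev/null)
    if is_healthy_response "$body"; then
      echo "OK    $name ($port)"
    else
      echo "FAIL  $name ($port)"
      failed=1
    fi
  done
  return "$failed"
}

# Example: check_all_health || echo "at least one service needs attention"
```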

Step 2: Enable Debug Logging

Set the LOG_LEVEL environment variable for detailed output:

# Set environment variable
export LOG_LEVEL=DEBUG

# Restart services
docker-compose down
docker-compose up -d

# Or for a specific service run outside Docker
LOG_LEVEL=DEBUG uv run python -m explore.explore

DEBUG level includes:

  • 🔍 Database query logging with parameters
  • 📊 Detailed operation traces
  • 🔄 Cache hits/misses
  • 📑 Internal state changes

See Logging Guide for complete details.

Step 3: Monitor Real-time Logs

# All services
docker-compose logs -f

# Specific service with timestamp
docker-compose logs -f --timestamps graphinator

# Filter for errors
docker-compose logs -f | grep -E "(ERROR|❌)"

# Filter for Neo4j queries (DEBUG mode; queries are handled by the API service)
docker-compose logs -f api | grep "🔍 Executing Neo4j query"
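
When several services log errors at once, a per-service count shows where to look first. The sketch below is an assumption-laden illustration: `count_errors_by_service` is a hypothetical helper, and it expects the usual `name | message` prefix that Compose adds to each log line, with colour codes disabled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads Compose log lines on stdin and prints
# "<error_count> <service_prefix>", highest count first.
count_errors_by_service() {
  grep -E 'ERROR|❌' \
    | awk -F'|' '{
        gsub(/[ \t]+$/, "", $1)   # trim padding after the service prefix
        n[$1]++
      }
      END { for (s in n) print n[s], s }' \
    | sort -rn
}

# Example: summarise the last 1000 lines per service.
# docker-compose logs --no-color --tail=1000 | count_errors_by_service
```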

Step 4: Check Queue Status

# RabbitMQ management UI
open http://localhost:15672  # macOS; use xdg-open on Linux

# Or use CLI monitoring
just monitor

# Or API
curl -u discogsography:discogsography \
  http://localhost:15672/api/queues

Look for:

  • Messages accumulating (consumers not keeping up)
  • Zero consumers (service not connected)
  • High unacked count (processing errors)
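
The first two conditions in the list above can be filtered out of the `/api/queues` response automatically. This sketch assumes `jq` is installed; `flag_queues` is a hypothetical helper and the backlog threshold of 1000 messages is an arbitrary starting point, not a project recommendation. The fields `name`, `messages`, `messages_unacknowledged`, and `consumers` are part of the RabbitMQ management API queue objects.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads the /api/queues JSON array on stdin and prints
# queues that have no consumers or a backlog above the threshold.
# Output columns (tab-separated): name, messages, unacked, consumers
# Usage: flag_queues [backlog_threshold]   (default: 1000)
flag_queues() {
  local threshold="${1:-1000}"
  jq -r --argjson max "$threshold" '
    .[]
    | select(.consumers == 0 or .messages > $max)
    | [.name, .messages, .messages_unacknowledged, .consumers]
    | @tsv'
}

# Example:
# curl -fsS -u discogsography:discogsography \
#   http://localhost:15672/api/queues | flag_queues 1000
```

The unacked column is printed for context, so a queue flagged for backlog can be checked for stalled processing at the same time.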

Step 5: Verify Database Connectivity

Neo4j:

# Browser access
curl http://localhost:7474

# Query test
echo "MATCH (n) RETURN count(n) as total;" | \
  cypher-shell -u neo4j -p discogsography

PostgreSQL:

# Connection test
PGPASSWORD=discogsography psql \
  -h localhost -p 5433 -U discogsography \
  -d discogsography -c "SELECT 1;"

# Check record counts
PGPASSWORD=discogsography psql \
  -h localhost -p 5433 -U discogsography \
  -d discogsography \
  -c "SELECT 'artists' as table_name, COUNT(*) FROM artists \
      UNION ALL SELECT 'releases', COUNT(*) FROM releases;"

Step 6: Verify Data Storage

Neo4j - Check node counts:

MATCH (n)
RETURN labels(n)[0] as type, count(n) as count
ORDER BY count DESC;

PostgreSQL - Check table counts:

SELECT 'artists' as table_name, COUNT(*) FROM artists
UNION ALL
SELECT 'releases', COUNT(*) FROM releases
UNION ALL
SELECT 'labels', COUNT(*) FROM labels
UNION ALL
SELECT 'masters', COUNT(*) FROM masters;

Expected counts (full dataset):

  • Artists: ~10 million
  • Releases: ~19 million
  • Labels: ~2.3 million
  • Masters: ~2.5 million

🔍 Service-Specific Issues

Neo4j Schema Warnings

Symptoms: You see warning messages in the Graphinator service logs like:

{"event":"Received notification from DBMS server: {severity: WARNING} {code: Neo.ClientNotification.Statement.UnknownRelationshipTypeWarning} ...","level":"warning",...}

Warnings about:

  • Unknown relationship types: BY, IS
  • Unknown labels: Genre, Style
  • Unknown properties: profile

Cause: These warnings appear when:

  1. The Neo4j database is empty (no data has been loaded yet)
  2. The database is being populated by the Graphinator service
  3. A service queries data that does not exist yet

This is normal and not an error! The Cypher queries use OPTIONAL MATCH patterns that gracefully handle missing data.

Solution 1: Suppress the warnings (Recommended)

The warnings are already suppressed in the codebase by configuring the logging level:

# In common/config.py setup_logging()
logging.getLogger("neo4j.notifications").setLevel(logging.ERROR)
logging.getLogger("neo4j").setLevel(logging.ERROR)

Solution 2: Populate the database

Run the extractor and graphinator to load data:

docker-compose up -d extractor-discogs
docker-compose logs -f extractor-discogs

docker-compose up -d graphinator
docker-compose logs -f graphinator

# Verify data in Neo4j (node count should be greater than zero)
echo "MATCH (n) RETURN count(n);" | \
  cypher-shell -u neo4j -p discogsography

Dashboard Issues

WebSocket Connection Failures

Symptoms:

  • Dashboard shows "Disconnected" status
  • Real-time updates not working
  • Browser console shows WebSocket errors

Solution:

# Check dashboard is running
curl http://localhost:8003/health

# Restart dashboard
docker-compose restart dashboard

# Check browser console for errors
# F12 → Console tab

Stale Data Display

Symptoms:

  • Dashboard shows old data
  • Metrics don't update

Solution:

# Clear Redis cache
docker-compose exec redis redis-cli FLUSHDB

# Restart dashboard
docker-compose restart dashboard

# Hard-refresh browser (Cmd+Shift+R / Ctrl+Shift+R or Ctrl+F5)

Extractor Issues

Stuck on "Checking for updates"

Symptoms:

  • Extractor logs show "🔍 Checking for updates..." repeatedly
  • No download progress
  • Runs indefinitely

Solution:

# Check network connectivity
curl -I https://discogs-data-dumps.s3.us-west-2.amazonaws.com

# Restart extractor
docker-compose restart extractor-discogs  # or extractor-musicbrainz

# Check logs
docker-compose logs -f extractor-discogs extractor-musicbrainz

Slow Download Speed

Symptoms:

  • Download takes very long
  • Slow progress messages
  • Low MB/s rate

Solutions:

  1. Check network speed

    # Test download speed
    speedtest-cli
  2. Resume interrupted download

    • Extractor automatically resumes from last position
    • Check for partial .xml.gz files in /discogs-data

📊 Performance Issues

Slow Query Performance

Symptoms:

  • Queries take too long
  • Dashboard slow to load
  • Explore service timeouts

Diagnostic Steps:

Neo4j:

// Profile slow query
PROFILE MATCH (a:Artist {name: "Pink Floyd"})-[:BY]-(r:Release)
RETURN r.title, r.year;

// Check index usage
SHOW INDEXES;

PostgreSQL:

-- Analyze query performance
EXPLAIN ANALYZE
SELECT data FROM artists WHERE data->>'name' = 'Pink Floyd';

-- Check index usage
SELECT * FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;

Solutions:

  1. Add missing indexes (see Database Schema)
  2. Run VACUUM ANALYZE on PostgreSQL
  3. Increase database memory (see Configuration)
  4. Enable query caching in Redis

High Memory Usage

Symptoms:

  • Services using excessive RAM
  • OOM (Out of Memory) kills
  • System slowdown

Solutions:

# Check resource usage
docker stats

# Limit service memory in docker-compose.yml
services: