This project builds a pipeline that collects Google Street View panoramas and trains a Vision Transformer (ViT) to predict the hemisphere (Northern or Southern) in which each image was taken. This phase lays the groundwork for future geographic recognition models.
- Automated Data Collection: Fetches and labels panoramas using the Google Street View API.
- Hemisphere Prediction: ViT-B/16 model trained to classify panoramas into Northern/Southern hemispheres.
- Scalable Storage: Images stored in Google Cloud Storage, metadata in PostgreSQL.
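
The labeling step can be pictured with a minimal sketch. It assumes each panorama's metadata includes a latitude in decimal degrees and that the equator itself is assigned to the Northern class. The function name `hemisphere_label` is hypothetical and not taken from the repository.

```python
def hemisphere_label(latitude: float) -> str:
    """Return "north" or "south" for a latitude in decimal degrees.

    Assumption: latitude 0.0 (the equator) is labeled "north".
    """
    # Reject values outside the valid latitude range (this also rejects NaN).
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    return "north" if latitude >= 0.0 else "south"


# Example: one latitude north of the equator, one south of it.
print(hemisphere_label(52.23))   # north
print(hemisphere_label(-33.87))  # south
```
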
- Python 3.10+
- PostgreSQL
- Google Cloud account with:
  - Street View Static API enabled
  - Geocoding API enabled
  - Google Cloud Storage
- Clone the repository:

    git clone https://github.com/Apsurt/apsurt-omni-geo-ai.git
    cd apsurt-omni-geo-ai

- Install dependencies:

    pip install -r requirements.txt

- Configure environment variables:

    cp .env.example .env
    # Add your Google API key and PostgreSQL credentials to .env

- Initialize the database:

    # Create PostgreSQL database
    createdb omni_geo_ai
    # Run the data collection script to initialize tables
    python scripts/fetch_panoramas.py --num_images 5 --dry_run
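
Before running the scripts, it can help to confirm that the credentials are actually visible to Python. The following is a minimal sketch, not part of the repository: the variable names in `REQUIRED_VARS` are hypothetical placeholders (use whatever keys `.env.example` defines), and it assumes the values from `.env` have already been loaded into the process environment.

```python
import os

# Hypothetical names for illustration; replace with the keys in .env.example.
REQUIRED_VARS = ("GOOGLE_API_KEY", "POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary as `environ` makes the check easy to try without touching the real environment.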
Run the panorama scraper to collect a specific number of images:
python scripts/fetch_panoramas.py --num_images 1000

Advanced options:
# Control the ratio of Northern/Southern hemisphere images
python scripts/fetch_panoramas.py --num_images 100 --north_ratio 0.7
# Save collection statistics to a JSON file
python scripts/fetch_panoramas.py --num_images 50 --output_stats stats.json
# Enable verbose logging
python scripts/fetch_panoramas.py --num_images 20 --verbose
# Clear existing data before collecting new images
python scripts/fetch_panoramas.py --num_images 50 --clear
# Only clear data without collecting new images
python scripts/fetch_panoramas.py --clear_only
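
One way to picture what `--north_ratio` does is as a split of the requested total into per-hemisphere targets. The sketch below is an assumption about that behavior, not the script's actual logic: `split_by_ratio` is a hypothetical helper, and the rounding rule (nearest integer for the northern count, remainder for the southern) is chosen only so that the two counts always sum to the total.

```python
def split_by_ratio(num_images: int, north_ratio: float) -> tuple[int, int]:
    """Split a total image count into (north, south) targets.

    Assumption: the northern count is rounded to the nearest integer and
    the southern count takes the remainder.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    if not 0.0 <= north_ratio <= 1.0:
        raise ValueError("north_ratio must be between 0 and 1")
    north = round(num_images * north_ratio)
    return north, num_images - north


# Mirrors the example command: --num_images 100 --north_ratio 0.7
print(split_by_ratio(100, 0.7))  # (70, 30)
```
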
# Update the StreetViewClient with ~1000 cities
python scripts/update_streetview_cities.py

Back up the PostgreSQL database to Google Cloud Storage:
python scripts/backup_db.py

Backup options:
# Specify a different GCS bucket
python scripts/backup_db.py --bucket my-backup-bucket
# Custom backup directory path in GCS
python scripts/backup_db.py --backup_dir database/daily

Train the hemisphere model:

python train.py --model vit_hemisphere --epochs 50

MIT License. See LICENSE.
See ROADMAP.md for phase details.