CIAT-DAPA/aclimate_v3_historical_spatial_etl

AClimate V3 Historical Spatial ETL 🌍📊

🏷️ Version & Tags

Tags: climate-data, etl, geoprocessing, python, geoserver, chirps, copernicus, AGera5


📌 Introduction

A Python package that processes historical spatial climate data through a complete ETL pipeline, which includes:

  • Data download from CHIRPS and Copernicus sources
  • Spatial clipping by country boundaries
  • Monthly aggregation and climatology calculations
  • Climate indicators calculation (TR20, TXx, etc.)
  • GeoServer integration for data publishing

Key Features:

  • Automated processing of temperature, precipitation, and solar radiation data
  • Parallel processing for downloads and indicator calculations
  • Flexible indicator calculation with custom year ranges
  • Flexible configuration for multiple countries and variables
  • End-to-end pipeline from raw data to published layers
  • Database-backed configuration management
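
To make the monthly aggregation and climatology steps more concrete, here is a minimal pure-Python sketch for a single location. It assumes daily values keyed by date, sums them per month (as one might for precipitation), and then averages each calendar month across years. The function names and the sum-based aggregation are illustrative assumptions, not the package's implementation, which works on rasters.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily):
    """Sum daily values into {(year, month): total}."""
    totals = defaultdict(float)
    for day, value in daily.items():
        totals[(day.year, day.month)] += value
    return dict(totals)


def climatology(monthly):
    """Average each calendar month across all years: {month: mean}."""
    by_month = defaultdict(list)
    for (_, month), value in monthly.items():
        by_month[month].append(value)
    return {m: sum(v) / len(v) for m, v in sorted(by_month.items())}


# Hypothetical daily precipitation (mm) for one pixel
daily = {
    date(2019, 1, 1): 2.0,
    date(2019, 1, 2): 4.0,
    date(2020, 1, 1): 10.0,
    date(2020, 2, 1): 3.0,
}
monthly = monthly_totals(daily)
clim = climatology(monthly)
print(clim)
```

For temperature or solar radiation, a monthly mean would probably replace the monthly sum; the second step stays the same.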

✅ Prerequisites

  • Python > 3.10
  • Copernicus Climate Data Store (CDS) API key - Register here
  • GeoServer
  • PostgreSQL database for configuration storage

⚙️ Installation

pip install git+https://github.com/CIAT-DAPA/aclimate_v3_cut_spatial_data.git
pip install git+https://github.com/CIAT-DAPA/aclimate_v3_spatial_importer.git
pip install git+https://github.com/CIAT-DAPA/aclimate_v3_orm
pip install git+https://github.com/CIAT-DAPA/aclimate_v3_historical_spatial_etl

To install a specific version:

pip install git+https://github.com/CIAT-DAPA/[email protected]

🚀 Basic Usage

1. Command Line Interface

Standard Pipeline

python -m aclimate_v3_historical_spatial_etl.aclimate_run_etl \
  --country HONDURAS \
  --start_date 2020-01 \
  --end_date 2020-12 \
  --data_path /path/to/data \
  --climatology

Climate Indicators Calculation

python -m aclimate_v3_historical_spatial_etl.aclimate_run_etl \
  --country HONDURAS \
  --start_date 2025-01 \
  --end_date 2025-01 \
  --data_path /path/to/data \
  --indicators \
  --indicator_years 2015-2020

Skip Processing (Indicators Only)

python -m aclimate_v3_historical_spatial_etl.aclimate_run_etl \
  --country COLOMBIA \
  --start_date 2025-01 \
  --end_date 2025-01 \
  --data_path /path/to/data \
  --skip_download \
  --skip_processing \
  --indicators \
  --indicator_years 2010-2020

Note

New Options:

  • --skip_download: Skip the data download step
  • --skip_processing: Skip data processing (clipping, monthly aggregation)
  • --climatology: Calculate the monthly climatology (long-term monthly averages)
  • --indicators: Calculate climate indicators
  • --indicator_years YYYY-YYYY: Specify year range for indicator calculation
  • --no_cleanup: Keep intermediate files after processing
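
When the CLI is driven from a scheduler or another script, it can help to assemble the argument list in code. The sketch below builds the command from the flags listed above only; the helper name `build_etl_command` is hypothetical and not part of the package.

```python
import sys


def build_etl_command(country, start_date, end_date, data_path, *,
                      skip_download=False, skip_processing=False,
                      climatology=False, indicators=False,
                      indicator_years=None, no_cleanup=False):
    """Return the argv list for the ETL module (hypothetical helper)."""
    cmd = [
        sys.executable, "-m",
        "aclimate_v3_historical_spatial_etl.aclimate_run_etl",
        "--country", country,
        "--start_date", start_date,
        "--end_date", end_date,
        "--data_path", data_path,
    ]
    # Boolean switches are appended only when enabled
    switches = {
        "--skip_download": skip_download,
        "--skip_processing": skip_processing,
        "--climatology": climatology,
        "--indicators": indicators,
        "--no_cleanup": no_cleanup,
    }
    cmd += [flag for flag, enabled in switches.items() if enabled]
    if indicator_years:
        cmd += ["--indicator_years", indicator_years]
    return cmd


cmd = build_etl_command(
    "COLOMBIA", "2025-01", "2025-01", "/path/to/data",
    skip_download=True, skip_processing=True,
    indicators=True, indicator_years="2010-2020",
)
print(" ".join(cmd[1:]))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting issues with paths that contain spaces.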

2. Programmatic Usage

from aclimate_v3_historical_spatial_etl.aclimate_run_etl import run_etl_pipeline

run_etl_pipeline(
    country="HONDURAS",
    start_date="2020-01",
    end_date="2020-12",
    data_path="/path/to/data",
    climatology=True,
    indicators=True,
    indicator_years="2015-2020"
)

🗂️ Directory Structure (Auto-generated)

data/
├── config/               # Must contain required JSON config files
├── raw_data/             # Downloaded raw datasets
├── process_data/         # Intermediate raster data
├── calc_data/
│   ├── climatology_data/ # Climatology outputs
│   ├── monthly_data/     # Monthly processed rasters
│   └── indicators_data/  # Climate indicators (TR20, TXx, etc.)
└── upload_geoserver/     # Output prepared for GeoServer publishing

🌑️ Climate Indicators

Supported Indicators

| Indicator | Name | Description | Unit |
|---|---|---|---|
| TR20 | Tropical Days | Annual count of days with Tmax > 20°C | days |
| TXx | Maximum Temperature Maximum | Annual maximum of daily maximum temperature | °C |
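
The two definitions in the table can be illustrated for a single pixel with plain Python. This is a sketch that follows the table's wording (TR20 counts days with Tmax strictly above 20°C); the function names and sample values are hypothetical, and the package itself computes these on raster data.

```python
def tr20(daily_tmax, threshold=20.0):
    """Count days whose maximum temperature exceeds the threshold (°C)."""
    return sum(1 for t in daily_tmax if t > threshold)


def txx(daily_tmax):
    """Return the highest daily maximum temperature (°C)."""
    return max(daily_tmax)


# Hypothetical daily Tmax values (°C) for one pixel; a real year has 365 or 366
sample = [18.5, 20.0, 21.3, 25.1, 19.9, 30.2]
print(tr20(sample), txx(sample))
```

Note that a day at exactly 20.0°C is not counted, because the comparison is strict.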

Indicator Features

  • Parallel Processing: Multiple years calculated simultaneously
  • Flexible Year Ranges: Calculate indicators for any historical period
  • Raster Output: Results saved as GeoTIFF files with proper georeferencing
  • Multi-year Statistics: Automatic calculation of multi-year averages
  • Memory Efficient: Processes data in chunks to handle large datasets
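
As an illustration of how a year range, per-year parallelism, and a multi-year average fit together, here is a small standard-library sketch. It assumes a thread pool and in-memory lists; the README does not state whether the package uses threads or processes, and all names below are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_year_range(text):
    """Turn 'YYYY-YYYY' into an inclusive list of years."""
    start, end = (int(part) for part in text.split("-"))
    if start > end:
        raise ValueError(f"start year {start} is after end year {end}")
    return list(range(start, end + 1))


def annual_txx(year, daily_tmax_by_year):
    """Annual maximum of daily maximum temperature for one year."""
    return year, max(daily_tmax_by_year[year])


# Hypothetical daily Tmax (°C) per year for one pixel
data = {2015: [24.0, 31.0, 28.5], 2016: [25.5, 29.0], 2017: [33.0, 30.0]}
years = parse_year_range("2015-2017")

# Each year is computed independently, so the work can run concurrently
with ThreadPoolExecutor(max_workers=3) as pool:
    results = dict(pool.map(lambda y: annual_txx(y, data), years))

# Multi-year statistic: mean of the annual values
multi_year_mean = sum(results.values()) / len(results)
print(results, multi_year_mean)
```

For CPU-bound raster work, a process pool would likely be the better fit; the structure of the loop stays the same.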

📋 Detailed Usage Guide: See INDICATORS_USAGE.md for comprehensive examples and configuration options.

🔧 Configuration

Database Configuration

All configuration is stored in the database. Make sure the database contains an entry for each of the following:

  1. chirps_config - CHIRPS download settings
  2. copernicus_config - Copernicus/ERA5 settings
  3. clipping_config - Country boundaries and ISO codes
  4. geoserver_config - GeoServer workspace and store names
  5. naming_config - Output file naming conventions
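
A quick pre-flight check can catch a missing entry before a long run starts. The sketch below only compares names; how the names are fetched from the database (presumably through aclimate_v3_orm) is not shown in this README, so the `found` list is a hypothetical stand-in.

```python
# Configuration entries named in the list above
REQUIRED_CONFIGS = (
    "chirps_config",
    "copernicus_config",
    "clipping_config",
    "geoserver_config",
    "naming_config",
)


def missing_configs(available):
    """Return required configuration names absent from `available`, in order."""
    present = set(available)
    return [name for name in REQUIRED_CONFIGS if name not in present]


# Hypothetical names already fetched from the database
found = ["chirps_config", "clipping_config", "naming_config"]
print(missing_configs(found))
```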

Environment Variables

  • Windows:
# GeoServer credentials
set GEOSERVER_URL=http://localhost:8086/geoserver/rest/
set GEOSERVER_USER=admin
set GEOSERVER_PASSWORD=password
set OTLP_ENDPOINT=localhost:4317
set ENABLE_SIGNOZ=false
set LOG_FILE_PATH=path/application.log
set DATABASE_URL=postgresql://postgres:admin@localhost:5432/acimate_v3
  • Linux/Ubuntu:
# GeoServer credentials
export GEOSERVER_URL=http://localhost:8086/geoserver/rest/
export GEOSERVER_USER=admin
export GEOSERVER_PASSWORD=password
export OTLP_ENDPOINT=localhost:4317
export ENABLE_SIGNOZ=false
export LOG_FILE_PATH=path/application.log
export DATABASE_URL=postgresql://postgres:admin@localhost:5432/acimate_v3

Note

Options:

  • GEOSERVER_URL: Base URL of the GeoServer REST API
  • GEOSERVER_USER: GeoServer user name
  • GEOSERVER_PASSWORD: GeoServer password
  • OTLP_ENDPOINT: SigNoz OTLP endpoint that receives the logs
  • ENABLE_SIGNOZ: Whether to send logs to SigNoz (true or false)
  • LOG_FILE_PATH: Path of the log file
  • DATABASE_URL: Database connection string
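
The variables above can be validated once at start-up instead of failing midway through a run. This is a minimal sketch: which variables are treated as required, and the fallback values for the optional ones, are assumptions made for illustration and are not taken from the package.

```python
import os

# Assumption: these four must be set; the remaining ones get fallbacks
REQUIRED_VARS = (
    "GEOSERVER_URL",
    "GEOSERVER_USER",
    "GEOSERVER_PASSWORD",
    "DATABASE_URL",
)


def load_settings(env=None):
    """Collect the settings listed above, failing early on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Assumed fallbacks, mirroring the example values shown above
    settings["OTLP_ENDPOINT"] = env.get("OTLP_ENDPOINT", "localhost:4317")
    settings["ENABLE_SIGNOZ"] = env.get("ENABLE_SIGNOZ", "false").strip().lower() == "true"
    settings["LOG_FILE_PATH"] = env.get("LOG_FILE_PATH", "application.log")
    return settings


# Example with an explicit mapping instead of the real environment
example = {
    "GEOSERVER_URL": "http://localhost:8086/geoserver/rest/",
    "GEOSERVER_USER": "admin",
    "GEOSERVER_PASSWORD": "password",
    "DATABASE_URL": "postgresql://user:secret@localhost:5432/dbname",
    "ENABLE_SIGNOZ": "false",
}
settings = load_settings(example)
print(settings["ENABLE_SIGNOZ"], settings["OTLP_ENDPOINT"])
```

Calling `load_settings()` without arguments reads from the real process environment.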

🧪 Running Tests

# Install test requirements
pip install pytest pytest-mock

# Run tests
pytest tests/

🔄 CI/CD Pipeline Overview

Workflow Architecture

Our GitHub Actions pipeline implements a three-stage deployment process:

Code Push → Test Stage → Merge Stage → Release Stage

📊 Project Structure

aclimate_v3_historical_spatial_etl/
│
├── .github/
│   └── workflows/                # CI/CD pipeline configurations
├── src/
│   └── aclimate_v3_historical_spatial_etl/
│       ├── connectors/           # Downloaders: CHIRPS, Copernicus
│       ├── tools/                # Clipping and GeoServer tools
│       ├── climate_processing/   # Monthly and climatology processors
│       ├── config/               # Example config files
│       └── aclimate_run_etl.py   # Main ETL entry script
├── tests/                        # Unit and integration tests
├── requirements.txt              # Dependencies
└── pyproject.toml                # Packaging
