diff --git a/.env.example b/.env.example index 0cc3c45..1b2d34c 100644 --- a/.env.example +++ b/.env.example @@ -14,10 +14,8 @@ SNOWFLAKE_ROLE= SNOWFLAKE_WAREHOUSE= SNOWFLAKE_DATABASE= SNOWFLAKE_SCHEMA= -# Path to the private key file (.p8 or .pem) - used by dev/prod targets -SNOWFLAKE_PRIVATE_KEY_PATH=/path/to/your/snowflake_key.p8 -# Private key content (PEM format with newlines) - used by test target and CI -# SNOWFLAKE_PRIVATE_KEY= +# Path to the private key file (.p8 or .pem) - used by all targets and Airflow +SNOWFLAKE_PRIVATE_KEY_FILE=/path/to/your/snowflake_key.p8 # Kafka Configuration KAFKA_BOOTSTRAP_SERVERS=localhost:29092 diff --git a/.gitignore b/.gitignore index 64eb382..76085e3 100644 --- a/.gitignore +++ b/.gitignore @@ -47,3 +47,7 @@ airflow/config/airflow.cfg airflow/__pycache__/ airflow/**/__pycache__/ .airflowignore + + +# SSH +.ssh/ \ No newline at end of file diff --git a/CLAUDE.md b/CLAUDE.md index d08bfcb..952c6ee 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -33,58 +33,55 @@ The project follows an event-driven ELT (Extract, Load, Transform) pattern with Apache Airflow orchestrates the entire pipeline with monitoring and error handling. **Key DAGs:** -- `stables_connection_test` - Infrastructure health checks -- `stables_kafka_consumer_dag` - Scheduled Kafka consumption (every 10 minutes) -- `stables_kafka_monitoring_dag` - Monitors Kafka health and consumer lag (planned) -- `stables_transform_dag` - Orchestrates dbt transformations (planned) -- `stables_master_pipeline_dag` - End-to-end pipeline orchestration (planned) +- `orchestrate_load_and_dbt` - Master orchestration DAG (daily at midnight UTC) - triggers log collection → dbt transformation +- `collect_and_load_logs_dag` - Incremental log collection from HyperSync → Snowflake (triggered by master DAG) +- `dbt_sf` - dbt transformation on Snowflake (triggered by master DAG) **Configuration:** - Airflow UI: http://localhost:8080 (airflow/airflow) -- Connections managed via `scripts/setup_airflow_connections.sh` +- Connections managed via `scripts/setup_airflow_postgres_connection.sh` and `scripts/setup_airflow_snowflake_connection.sh` - Environment variables in `.env` -### 1. Streaming Layer (Kafka) -**Location**: `scripts/kafka/` +### 1. Data Extraction Layer +**Location**: `scripts/raw_data/` -Real-time data streaming from GraphQL indexer to PostgreSQL via Kafka. +HyperSync GraphQL API integration for blockchain log collection. -**Components:** -- `producer.py` - Fetches from GraphQL API and publishes to Kafka topic (runs continuously in Docker) -- `consumer.py` - Legacy standalone consumer (deprecated - use Airflow DAG instead) -- `consume_batch.py` - Airflow-compatible batch consumer function -- `data_generator.py` - GraphQL data fetching logic -- `database.py` - PostgreSQL batch insert operations -- `models.py` - Data models for transfer events +**Scripts:** +- `collect_logs_data.py` - One-time/historical log collection +- `collect_logs_incremental.py` - Incremental collection (auto-detects latest block from existing files) +- `get_events.py` - Extract event signatures and topic0 hashes from contract ABIs **Key Features:** -- Continuous streaming from GraphQL indexer via producer -- **Airflow-orchestrated consumption**: `stables_kafka_consumer_dag` runs every 10 minutes -- Batch insertion for efficiency (configurable batch size) -- Automatic topic creation -- Consumer offset management -- Target table: `raw.kafka_sink_raw_transfer` - -**Recommended Setup:** -```bash -# 1. 
Start producer (continuous streaming to Kafka) -docker-compose -f docker-compose.kafka-app.yml up -d producer - -# 2. Enable Airflow DAG for scheduled consumption (via Airflow UI) -# Enable: stables_kafka_consumer_dag (runs every 10 minutes) - -# 3. Monitor with Kafdrop UI -http://localhost:9000 -``` +- Auto-detection of latest fetched block to prevent duplicates +- Parquet output format for efficient storage +- Integrated into Airflow DAG (`collect_and_load_logs_dag`) for automated daily runs +- Environment variable: `HYPERSYNC_BEARER_TOKEN` required -**Legacy Standalone Consumer** (not recommended): +**Usage Examples:** ```bash -# Old approach - runs consumer as standalone Docker container -docker-compose -f docker-compose.kafka-app.yml up -d consumer +# One-time historical collection +uv run python scripts/raw_data/collect_logs_data.py \ + --contract 0x1234... \ + --from-block 22269355 \ + --to-block 24107494 \ + --contract-name open_logs + +# Incremental collection (auto-detects from_block) +uv run python scripts/raw_data/collect_logs_incremental.py \ + --contract 0x1234... \ + --prefix open_logs + +# Extract event signatures from ABI +uv run python scripts/raw_data/get_events.py \ + --chainid 1 \ + --contract 0x1234... \ + --name open_logs ``` -### 2. Load Layer (Legacy - `scripts/load_file.py`) -**Note**: This is the legacy batch loading approach. The project now uses Kafka streaming (see above). +See [scripts/raw_data/README.md](scripts/raw_data/README.md) for detailed documentation. + +### 2. Load Layer (`scripts/load_file.py`) Unified loader script supporting both PostgreSQL and Snowflake with pluggable database clients. @@ -93,8 +90,7 @@ Unified loader script supporting both PostgreSQL and Snowflake with pluggable da - Three write modes: `append`, `replace`, `merge` (upsert) - Uses DLT (data load tool) for efficient loading - Database clients in `scripts/utils/database_client.py`: `PostgresClient`, `SnowflakeClient` - -**Usage:** Primarily used for loading static reference data (e.g., stablecoin metadata CSV) +- Used by `collect_and_load_logs_dag` to load collected logs to Snowflake **Arguments:** - `-f`: File path (Parquet or CSV) @@ -150,21 +146,15 @@ cp .env.example .env # Edit .env with your credentials # Start infrastructure (modular approach) -docker-compose up -d # PostgreSQL +docker-compose -f docker-compose.postgres.yml up -d # PostgreSQL docker-compose -f docker-compose.kafka.yml up -d # Kafka + Kafdrop UI docker-compose -f docker-compose.airflow.yml up -d # Airflow -docker-compose -f docker-compose.kafka-app.yml up -d # Producer/Consumer - -# OR use unified compose (legacy) -docker-compose -f docker-compose-unified.yml up -d ``` **Docker Compose Files**: -- `docker-compose.yml` - Base PostgreSQL database -- `docker-compose.kafka.yml` - Kafka streaming infrastructure +- `docker-compose.postgres.yml` - Base PostgreSQL database +- `docker-compose.kafka.yml` - Kafka streaming infrastructure (includes Kafdrop UI) - `docker-compose.airflow.yml` - Airflow orchestration -- `docker-compose.kafka-app.yml` - Application services (producer/consumer) -- `docker-compose-unified.yml` - All-in-one (legacy, deprecated) ### Data Loading ```bash @@ -247,18 +237,22 @@ dbt docs serve Required variables (see [.env.example](.env.example)): - `POSTGRES_HOST`, `POSTGRES_PORT`, `POSTGRES_DB`, `POSTGRES_USER`, `POSTGRES_PASSWORD` - `DB_SCHEMA`: Default schema for operations -- `KAFKA_NETWORK_NAME`: Docker network name +- `HYPERSYNC_BEARER_TOKEN`: HyperSync API authentication token -Optional (for 
Snowflake): +For Snowflake (production): - `SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_USER`, `SNOWFLAKE_ROLE`, `SNOWFLAKE_WAREHOUSE` - `SNOWFLAKE_DATABASE`, `SNOWFLAKE_SCHEMA` - `SNOWFLAKE_PRIVATE_KEY_PATH`: Path to private key file (.p8 or .pem) for local development +Optional: +- `ETHERSCAN_API_KEY`: For ABI extraction via `get_events.py` + ## Key Data Flows -1. **HyperSync GraphQL → Parquet**: Extract blockchain data using external indexer to `.data/raw/*.parquet` -2. **Parquet → PostgreSQL/Snowflake**: `load_file.py` loads into `raw` schema tables -3. **Raw → dbt**: dbt models transform raw data through staging → marts +1. **HyperSync GraphQL → Parquet**: Extract blockchain logs using `collect_logs_incremental.py` to `.data/*.parquet` +2. **Parquet → Snowflake**: `load_file.py` loads into `raw` schema tables (automated via `collect_and_load_logs_dag`) +3. **Raw → dbt → Marts**: dbt models transform raw data through staging → intermediate → mart layers (via `dbt_sf` DAG) +4. **Orchestration**: `orchestrate_load_and_dbt` DAG runs daily to chain collection → loading → transformation ## SCD Type 2: Stablecoin Metadata Tracking @@ -354,24 +348,11 @@ dbt_project/ ### Setup **1. Start Infrastructure:** - -Choose either modular (recommended) or unified approach: - -**Modular Approach** (recommended): ```bash # Start services separately for better control -docker-compose up -d # PostgreSQL -docker-compose -f docker-compose.kafka.yml up -d # Kafka -docker-compose -f docker-compose.airflow.yml up -d # Airflow - -# Wait for Airflow to initialize (~2 minutes) -docker logs -f stables-airflow-init -``` - -**Unified Approach** (legacy): -```bash -# Start all services at once -docker-compose -f docker-compose-unified.yml up -d +docker-compose -f docker-compose.postgres.yml up -d # PostgreSQL +docker-compose -f docker-compose.kafka.yml up -d # Kafka (optional) +docker-compose -f docker-compose.airflow.yml up -d # Airflow # Wait for Airflow to initialize (~2 minutes) docker logs -f stables-airflow-init @@ -379,61 +360,68 @@ docker logs -f stables-airflow-init **2. Configure Connections:** ```bash -# Setup Airflow connections (Postgres, Snowflake) -bash scripts/setup_airflow_connections.sh +# Setup Airflow connections for PostgreSQL and Snowflake +bash scripts/setup_airflow_postgres_connection.sh +bash scripts/setup_airflow_snowflake_connection.sh ``` -**3. Verify Setup:** +**3. Access Airflow UI:** ```bash -# Access Airflow UI +# Open Airflow UI open http://localhost:8080 # Login: airflow / airflow - -# Enable and trigger connection test DAG -# DAGs > stables_connection_test > Enable > Trigger ``` ### Running the Pipeline -**Recommended Workflow:** +**Production Workflow (Snowflake):** ```bash -# 1. Start Kafka producer (continuous streaming) -docker-compose -f docker-compose.kafka-app.yml up -d producer +# 1. Enable the master orchestration DAG +# In Airflow UI: Enable orchestrate_load_and_dbt -# 2. Enable Airflow DAGs in UI -# - stables_kafka_consumer_dag (runs every 10 minutes automatically) -# - stables_connection_test (run manually to verify setup) +# 2. The DAG runs daily at midnight UTC and automatically: +# a. Collects incremental logs from HyperSync +# b. Loads them to Snowflake (raw.open_logs) +# c. Runs dbt transformations -# 3. Monitor via Airflow UI and Kafdrop -# Airflow: http://localhost:8080 -# Kafdrop: http://localhost:9000 +# 3. 
Monitor via Airflow UI +# http://localhost:8080 ``` -**Scheduled Orchestration:** -- `stables_kafka_consumer_dag` runs every 10 minutes (consumes from Kafka → PostgreSQL) -- `stables_transform_dag` runs hourly/triggered (planned) -- `stables_master_pipeline_dag` runs daily at 2 AM (planned) +**Manual Trigger:** +```bash +# Trigger the master DAG manually via UI or CLI +docker exec stables-airflow-scheduler airflow dags trigger orchestrate_load_and_dbt +``` ### Key Airflow DAGs | DAG | Schedule | Purpose | Status | |-----|----------|---------|--------| -| `stables_connection_test` | Manual | Verify infrastructure connectivity | ✅ Active | -| `stables_kafka_consumer_dag` | Every 10 min | Consume from Kafka → PostgreSQL | ✅ Active | -| `stables_kafka_monitoring_dag` | Every 10 min | Monitor Kafka health and consumer lag | 📋 Planned | -| `stables_transform_dag` | Hourly/Triggered | Run dbt transformations | 📋 Planned | -| `stables_master_pipeline_dag` | Daily 2 AM | End-to-end pipeline orchestration | 📋 Planned | - -**Consumer DAG Details** (`stables_kafka_consumer_dag`): -- **Schedule**: Every 10 minutes -- **Max Duration**: 8 minutes per run (leaves 2-min buffer) -- **Batch Size**: 100 messages per database insert -- **Consumer Group**: `stables-airflow-consumer` (separate from standalone consumer) +| `orchestrate_load_and_dbt` | Daily (midnight UTC) | Master orchestration: log collection → dbt transformation | ✅ Active | +| `collect_and_load_logs_dag` | Triggered | Incremental log collection from HyperSync → Snowflake | ✅ Active | +| `dbt_sf` | Triggered | Run dbt transformations on Snowflake | ✅ Active | + +**Master DAG Details** (`orchestrate_load_and_dbt`): +- **Schedule**: Daily at midnight UTC +- **Tasks**: + 1. Trigger `collect_and_load_logs_dag` and wait for completion + 2. Trigger `dbt_sf` for transformations +- **Behavior**: Pipeline stops if log collection fails + +**Collection DAG Details** (`collect_and_load_logs_dag`): +- **Contract**: `0x323c03c48660fE31186fa82c289b0766d331Ce21` +- **Auto-detects** latest block from existing parquet files +- **Output**: `.data/open_logs_{date}_{from_block}_{to_block}.parquet` +- **Target**: Snowflake `raw.open_logs` table - **Tasks**: - 1. Check Kafka connection and topic - 2. Check PostgreSQL connection - 3. Consume batch from Kafka and load to `raw.kafka_sink_raw_transfer` - 4. Log statistics and verify data loaded + 1. Collect incremental logs from HyperSync + 2. 
Load to Snowflake using `load_file.py` + +**dbt DAG Details** (`dbt_sf`): +- **Command**: `dbt build` (runs models, tests, snapshots) +- **Target**: Snowflake warehouse +- **Environment**: Configured from Airflow `snowflake_default` connection ### Monitoring @@ -442,7 +430,7 @@ docker-compose -f docker-compose.kafka-app.yml up -d producer - Logs for debugging - Connection management -**Kafdrop (Kafka UI):** http://localhost:9000 +**Kafdrop (Kafka UI):** http://localhost:9000 (if Kafka is running) - Topic messages and partitions - Consumer group lag - Broker health @@ -452,6 +440,11 @@ docker-compose -f docker-compose.kafka-app.yml up -d producer psql postgresql://postgres:postgres@localhost:5432/postgres ``` +**Snowflake:** Monitor via Snowflake web UI +- Query history +- Warehouse usage +- Table statistics + ### Troubleshooting **Airflow won't start:** @@ -460,13 +453,9 @@ psql postgresql://postgres:postgres@localhost:5432/postgres docker logs stables-airflow-init docker logs stables-airflow-scheduler -# Reset Airflow database (modular) +# Reset Airflow database docker-compose -f docker-compose.airflow.yml down -v docker-compose -f docker-compose.airflow.yml up -d - -# OR reset with unified approach (legacy) -docker-compose -f docker-compose-unified.yml down -v -docker-compose -f docker-compose-unified.yml up -d ``` **DAG not appearing:** @@ -478,37 +467,49 @@ docker exec stables-airflow-scheduler airflow dags list-import-errors python airflow/dags/your_dag.py ``` -**Connection test fails:** +**DAG fails with import errors:** ```bash -# Verify services are running -docker ps | grep stables +# Check if required Python packages are installed in Airflow container +docker exec stables-airflow-scheduler pip list | grep hypersync +docker exec stables-airflow-scheduler pip list | grep polars -# Check network connectivity -docker exec stables-airflow-scheduler ping kafka-postgres -docker exec stables-airflow-scheduler ping kafka +# Rebuild Airflow image if needed (packages defined in airflow/requirements.txt) +docker-compose -f docker-compose.airflow.yml build --no-cache +docker-compose -f docker-compose.airflow.yml up -d +``` + +**Snowflake connection fails:** +```bash +# Test Snowflake connection +docker exec stables-airflow-scheduler airflow connections get snowflake_default + +# Verify private key is properly encoded (should be base64) +# Re-run setup script if needed +bash scripts/setup_airflow_snowflake_connection.sh ``` ## Project Structure - **airflow/**: Airflow DAGs and orchestration - - `dags/`: DAG definitions - - `plugins/`: Custom operators + - `dags/`: DAG definitions (`orchestrate_load_and_dbt.py`, `collect_load_logs.py`, `dbt_sf.py`) - `logs/`: Execution logs - - `config/`: Airflow configuration + - `requirements.txt`: Python dependencies for Airflow container - **scripts/**: Runnable Python scripts - - `kafka/`: Producer, consumer, and Kafka utilities - - `load_file.py`: Legacy batch loader - - `utils/database_client.py`: Database client abstractions - - `setup_airflow_connections.sh`: Airflow connection setup + - `raw_data/`: HyperSync log collection scripts + - `load_file.py`: Unified data loader (Parquet/CSV → PostgreSQL/Snowflake) + - `utils/database_client.py`: Database client abstractions (`PostgresClient`, `SnowflakeClient`) + - `setup_airflow_postgres_connection.sh`: PostgreSQL connection setup for Airflow + - `setup_airflow_snowflake_connection.sh`: Snowflake connection setup for Airflow - **dbt_project/**: dbt transformation layer -- **.data/raw/**: Data files 
(Parquet, CSV)
+  - `models/01_staging/`: Staging views
+  - `models/02_intermediate/`: Ephemeral intermediate models
+  - `models/03_mart/`: Final analytics tables
+  - `snapshots/`: SCD Type 2 snapshots
+  - `macros/`: Custom SQL macros
+- **.data/**: Data files (Parquet, CSV) - created at runtime
 - **docs/**: MkDocs documentation
 - **Docker Compose files**:
-  - `docker-compose.yml`: Base PostgreSQL database
-  - `docker-compose.kafka.yml`: Kafka infrastructure
+  - `docker-compose.postgres.yml`: PostgreSQL database
+  - `docker-compose.kafka.yml`: Kafka infrastructure (optional)
   - `docker-compose.airflow.yml`: Airflow orchestration
-  - `docker-compose.kafka-app.yml`: Application services (producer/consumer)
-  - `docker-compose-unified.yml`: All-in-one (legacy, deprecated)
-- Always run Python with: `uv run python script.py`
-
-See [DOCKER-COMPOSE-GUIDE.md](DOCKER-COMPOSE-GUIDE.md) for detailed Docker Compose usage.
+- Always run Python scripts with: `uv run python script.py`
diff --git a/Dockerfile.airflow b/Dockerfile.airflow
index 4cc1988..551b3f4 100644
--- a/Dockerfile.airflow
+++ b/Dockerfile.airflow
@@ -8,12 +8,16 @@ WORKDIR /opt/airflow
 # Install Python dependencies
 # -----------------------------
-# Copy requirements.txt generated by uv
-COPY requirements-airflow.txt ./
+# Copy requirements.txt from airflow directory
+COPY airflow/requirements.txt ./
 
 # Install dependencies from requirements.txt
 # This approach is simple, reliable, and uses uv-generated dependencies
-RUN pip install --no-cache-dir -r requirements-airflow.txt
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Install dlt separately (with postgres and snowflake extras)
+# Installed separately to avoid hash verification conflicts
+RUN pip install --no-cache-dir "dlt[postgres,snowflake]>=1.17.0"
 
 # Add dbt to PATH globally so all tasks can find it
 ENV PATH="/home/airflow/.local/bin:${PATH}"
diff --git a/airflow/dags/collect_load_logs.py b/airflow/dags/collect_load_logs.py
new file mode 100644
index 0000000..7001ecb
--- /dev/null
+++ b/airflow/dags/collect_load_logs.py
@@ -0,0 +1,317 @@
+"""
+Airflow DAG for incremental log collection and loading to Snowflake.
+
+This module defines a single combined DAG, collect_and_load_logs_dag, with two tasks:
+1. collect_logs_incremental - Collects logs incrementally from HyperSync
+2. load_to_snowflake - Loads the collected parquet file to Snowflake
+"""
+
+import os
+import re
+import asyncio
+from datetime import datetime
+from pathlib import Path
+
+import hypersync
+import polars as pl
+import pendulum
+from airflow.decorators import dag, task
+from dotenv import load_dotenv
+
+load_dotenv()
+
+
+# ============================================================================
+# Utility Functions
+# ============================================================================
+
+
+def get_latest_fetched_block(
+    prefix: str, data_dir: str = "/opt/airflow/.data"
+) -> int | None:
+    """
+    Get the latest (maximum) block number from parquet files matching a prefix.
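+
+    Illustrative example (hypothetical filenames; assumes the
+    <prefix>_YYMMDD_<from_block>_<to_block>.parquet naming used below):
+
+        open_logs_260101_22269355_22500000.parquet
+        open_logs_260102_22500000_22510000.parquet
+        -> get_latest_fetched_block("open_logs") returns 22510000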
+ + Args: + prefix: Prefix to search for (e.g., 'open_logs') + data_dir: Directory containing parquet files (default: '/opt/airflow/.data') + + Returns: + Maximum block number found, or None if no matching files exist + """ + import logging + + logger = logging.getLogger(__name__) + + data_path = Path(data_dir) + logger.info(f"Checking for existing files in: {data_dir}") + logger.info(f"Directory exists: {data_path.exists()}") + + if not data_path.exists(): + logger.warning(f"Data directory does not exist: {data_dir}") + return None + + # Find all parquet files matching the prefix + pattern = f"{prefix}_*.parquet" + matching_files = list(data_path.glob(pattern)) + logger.info(f"Found {len(matching_files)} files matching pattern '{pattern}'") + + if not matching_files: + logger.warning(f"No parquet files found with prefix: {prefix}") + # List all files in directory for debugging + all_files = list(data_path.glob("*.parquet")) + logger.info(f"All parquet files in directory: {[f.name for f in all_files]}") + return None + + # Extract block numbers from filenames + # Expected format: prefix_YYMMDD_fromBlock_toBlock.parquet (e.g., open_logs_260101_12345_67890.parquet) + max_block = None + block_pattern = re.compile(rf"{re.escape(prefix)}_\d{{6}}_(\d+)_(\d+)\.parquet") + + for file in matching_files: + match = block_pattern.match(file.name) + if match: + from_block = int(match.group(1)) + to_block = int(match.group(2)) + + # Track the maximum to_block + if max_block is None or to_block > max_block: + max_block = to_block + + logger.info(f"Found file: {file.name} (blocks {from_block}-{to_block})") + else: + logger.warning(f"File does not match expected pattern: {file.name}") + + if max_block: + logger.info(f"Latest fetched block for '{prefix}': {max_block}") + else: + logger.warning( + f"No valid block numbers found in filenames for prefix: {prefix}" + ) + + return max_block + + +def get_hypersync_client(): + """Initialize HyperSync client with bearer token authentication.""" + bearer_token = os.getenv("HYPERSYNC_BEARER_TOKEN") + if not bearer_token: + raise ValueError("HYPERSYNC_BEARER_TOKEN environment variable is required") + + return hypersync.HypersyncClient( + hypersync.ClientConfig( + url="https://eth.hypersync.xyz", + bearer_token=bearer_token, + ) + ) + + +async def collect_logs( + contract: str, from_block: int, to_block: int | None, data_dir: str +): + """ + Collect historical logs data for a contract and save to Parquet. 
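+
+    Minimal sketch of how this coroutine is driven elsewhere in this module
+    (the contract address is the DAG default; the block number is illustrative):
+
+        asyncio.run(collect_logs(
+            contract="0x323c03c48660fE31186fa82c289b0766d331Ce21",
+            from_block=22269355,
+            to_block=None,  # None = up to the latest block
+            data_dir="/opt/airflow/.data",
+        ))
+        # HyperSync writes the result to <data_dir>/logs.parquet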
+ + Args: + contract: Contract address to query + from_block: Starting block number + to_block: Ending block number (None for latest) + data_dir: Directory to save parquet files + """ + client = get_hypersync_client() + + # Build query + query = hypersync.preset_query_logs( + contract, from_block=from_block, to_block=to_block + ) + + print(f"Collecting logs for contract: {contract}") + print(f"From block: {from_block}") + print(f"To block: {to_block if to_block else 'latest'}") + + await client.collect_parquet( + query=query, path=data_dir, config=hypersync.StreamConfig() + ) + + print("Collection complete!") + + +# ============================================================================ +# Combined DAG: Collect Logs and Load to Snowflake +# ============================================================================ + + +@dag( + dag_id="collect_and_load_logs_dag", + schedule=None, # Disabled - triggered by orchestrate_load_and_dbt DAG + start_date=pendulum.datetime(2024, 1, 1, tz="UTC"), + catchup=False, + tags=["hypersync", "logs", "incremental", "snowflake", "etl"], + max_active_runs=1, + description="Incrementally collect blockchain logs from HyperSync and load to Snowflake", + params={ + "from_block": None, + "to_block": None, + "contract": "0x323c03c48660fE31186fa82c289b0766d331Ce21", + "prefix": "open_logs", + "snowflake_schema": "raw", + "snowflake_table": "open_logs", + "snowflake_write_disposition": "append", + }, +) +def collect_and_load_logs_dag(): + """ + Combined ETL DAG: Collect logs from HyperSync and load to Snowflake. + + This DAG: + 1. Auto-detects the latest fetched block from existing files + 2. Collects new logs from that block to the current block + 3. Saves the result with today's date and block range in the filename + 4. Loads the collected parquet file to Snowflake + + Parameters (configurable via UI or CLI): + - from_block: Starting block number (None = auto-detect from existing files) + - to_block: Ending block number (None = latest chain block) + - contract: Contract address to query + - prefix: Filename prefix for saved parquet files + """ + + @task + def collect_logs_incremental(**context) -> dict: + """ + Collect logs incrementally from the latest fetched block to the current block. + + Uses DAG params that can be overridden via UI config or CLI. + + Returns: + Dictionary with collection metadata + """ + # Get parameters from DAG run config + params = context["params"] + contract = params.get("contract", "0x323c03c48660fE31186fa82c289b0766d331Ce21") + prefix = params.get("prefix", "open_logs") + from_block = params.get("from_block") + to_block = params.get("to_block") + data_dir = "/opt/airflow/.data" + + # Ensure data directory exists + Path(data_dir).mkdir(parents=True, exist_ok=True) + + # Auto-detect starting block if not provided + if from_block is None: + latest_block = get_latest_fetched_block(prefix, data_dir) + + if latest_block is None: + raise ValueError( + f"No existing data found for prefix '{prefix}'. " + "Please provide from_block for initial data collection." 
+ ) + + # Start from the next block after the latest fetched + from_block = latest_block + print(f"Auto-detected starting block: {from_block}") + + print(f"=== Incremental Log Collection ===") + print(f"Contract: {contract}") + print(f"Prefix: {prefix}") + print(f"From block: {from_block}") + print(f"To block: {to_block if to_block else 'latest'}") + + # Collect data to temporary file + temp_file = Path(data_dir) / "logs.parquet" + asyncio.run( + collect_logs( + contract=contract, + from_block=from_block, + to_block=to_block, + data_dir=data_dir, + ) + ) + + # Check if file was created (HyperSync doesn't create file if no data) + if not temp_file.exists(): + print( + f"No new data found in block range {from_block} - {to_block if to_block else 'latest'}" + ) + print("✓ Incremental collection complete (no new data)") + return { + "status": "no_data", + "from_block": from_block, + "to_block": from_block, + "output_file": None, + } + + # Read actual block range from the collected data + try: + df = pl.scan_parquet(temp_file) + actual_from_block = df.select(pl.col("block_number").min()).collect().item() + actual_to_block = df.select(pl.col("block_number").max()).collect().item() + record_count = df.select(pl.len()).collect().item() + + print(f"Collected {record_count} records") + print(f"Actual block range: {actual_from_block} - {actual_to_block}") + + # Rename with today's date and actual block range + today = datetime.now().strftime("%y%m%d") + output_file = ( + Path(data_dir) + / f"{prefix}_{today}_{actual_from_block}_{actual_to_block}.parquet" + ) + temp_file.rename(output_file) + + print(f"✓ Saved to: {output_file}") + + return { + "status": "success", + "from_block": actual_from_block, + "to_block": actual_to_block, + "output_file": str(output_file), + "record_count": record_count, + } + + except Exception as e: + print(f"Failed to process collected data: {e}") + raise + + @task.bash + def load_to_snowflake(collection_result: dict, **context) -> str: + """ + Load parquet file directly to Snowflake table using load_file.py script. 
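+
+    Sketch of the command this task renders with the default params
+    (the parquet filename is illustrative):
+
+        cd /opt/airflow && \
+        python scripts/load_file.py \
+            -f /opt/airflow/.data/open_logs_260101_22269355_22500000.parquet \
+            -c snowflake -s raw -t open_logs -w append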
+ + Args: + collection_result: Result dict from collect_logs_incremental containing output_file path + + Returns: + Bash command to execute + """ + params = context["params"] + snowflake_schema = params.get("snowflake_schema", "raw") + snowflake_table = params.get("snowflake_table", "open_logs") + snowflake_write_disposition = params.get( + "snowflake_write_disposition", "append" + ) + + # Extract file path from collection result + file_path = collection_result.get("output_file") + + if not file_path: + print("No new data collected, skipping Snowflake load") + return "echo 'No new data to load'" + + return f""" + cd /opt/airflow && \ + python scripts/load_file.py \ + -f {file_path} \ + -c snowflake \ + -s {snowflake_schema} \ + -t {snowflake_table} \ + -w {snowflake_write_disposition} \ + """ + + # Define task dependencies: Collect → Load + collection_result = collect_logs_incremental() + load_to_snowflake(collection_result) + + +# Create DAG instance +collect_and_load_logs_dag() diff --git a/airflow/dags/collect_logs_data.py b/airflow/dags/collect_logs_data.py deleted file mode 100644 index 6621bee..0000000 --- a/airflow/dags/collect_logs_data.py +++ /dev/null @@ -1,123 +0,0 @@ -import hypersync -import asyncio -from dotenv import load_dotenv -import os, logging -import argparse -import polars as pl - -load_dotenv() - -logging.basicConfig(level=logging.INFO) -logger = logging.getLogger(__name__) - - -def get_client(): - return hypersync.HypersyncClient( - hypersync.ClientConfig( - url="https://eth.hypersync.xyz", - bearer_token=os.getenv("HYPERSYNC_BEARER_TOKEN"), - ) - ) - - -async def collect_logs(contract: str, from_block: int, to_block: int | None = None): - """ - Collect historical logs data for a contract and save to Parquet. - - Args: - contract: Contract address to query - from_block: Starting block number - to_block: Ending block number (None for latest) - """ - client = get_client() - - # Build query - query = hypersync.preset_query_logs( - contract, from_block=from_block, to_block=to_block - ) - - logger.info(f"Collecting logs for contract: {contract}") - logger.info(f"From block: {from_block}") - logger.info(f"To block: {to_block if to_block else 'latest'}") - - # Run the query - automatically paginated - # The query returns when it reaches some limit (time, response size etc.) 
- # There is a next_block field on the response so we can continue until - # res.next_block equals res.archive_height or query.to_block - await client.collect_parquet( - query=query, path=".data", config=hypersync.StreamConfig() - ) - - logger.info("Collection complete!") - - -def main(): - parser = argparse.ArgumentParser( - description="Collect historical blockchain logs data using HyperSync" - ) - parser.add_argument( - "--contract", - "-c", - required=True, - help="Contract address to query (e.g., 0x...)", - ) - parser.add_argument( - "--from-block", "-f", type=int, required=True, help="Starting block number" - ) - parser.add_argument( - "--to-block", - "-t", - type=int, - default=None, - help="Ending block number (default: None for latest)", - ) - parser.add_argument( - "--contract-name", - "-n", - type=str, - default=None, - help="Contract name to use in the filename", - ) - - args = parser.parse_args() - - asyncio.run( - collect_logs( - contract=args.contract, from_block=args.from_block, to_block=args.to_block - ) - ) - - # Rename parquet file with block range suffix - source_path = f".data/logs.parquet" - # if args.contract_name: - # source_path = f".data/{args.contract_name}.parquet" - - try: - # Read the actual to_block from the parquet file - to_block_actual = ( - pl.scan_parquet(source_path) - .select(pl.col("block_number").max()) - .collect() - .item() - ) - - # Construct new filename with block range - dest_path = ( - f".data/{args.contract_name}_{args.from_block}_{to_block_actual}.parquet" - ) - if args.contract_name: - dest_path = f".data/{args.contract_name}_logs_{args.from_block}_{to_block_actual}.parquet" - - os.rename(source_path, dest_path) - logger.info(f"Renamed file to: {dest_path}") - - except FileNotFoundError: - logger.error(f"Parquet file not found: {source_path}") - raise - except Exception as e: - logger.error(f"Failed to rename parquet file: {e}") - raise - - -if __name__ == "__main__": - main() diff --git a/airflow/dags/dbt_sf.py b/airflow/dags/dbt_sf.py index ae56a84..3e604c4 100644 --- a/airflow/dags/dbt_sf.py +++ b/airflow/dags/dbt_sf.py @@ -41,23 +41,23 @@ def get_dbt_snowflake_env_vars(): "SNOWFLAKE_WAREHOUSE": extra.get("warehouse", ""), "SNOWFLAKE_DATABASE": extra.get("database", ""), "SNOWFLAKE_SCHEMA": conn.schema or "", - "SNOWFLAKE_PRIVATE_KEY_file": private_key, - "DBT_PROFILES_DIR": "/opt/airflow/dbt_sf", - "DBT_PROJECT_DIR": "/opt/airflow/dbt_sf", + "SNOWFLAKE_PRIVATE_KEY_FILE": private_key, + "DBT_PROFILES_DIR": "/opt/airflow/dbt_project", + "DBT_PROJECT_DIR": "/opt/airflow/dbt_project", "PATH": "/home/airflow/.local/bin:" + os.environ.get("PATH", ""), } @dag( - dag_id="capstone_dbt_orchestration", - schedule="0 2 * * *", # Daily at 2 AM + dag_id="dbt_sf", + schedule=None, # Disabled - triggered by orchestrate_load_and_dbt DAG start_date=pendulum.datetime(2024, 1, 1, tz="UTC"), catchup=False, tags=["capstone", "dbt", "orchestration"], max_active_runs=1, description="Simple dbt orchestration using bash commands - matches capstone implementation", ) -def capstone_dbt_orchestration(): +def dbt_sf(): """ ### Capstone dbt Orchestration DAG @@ -73,11 +73,11 @@ def dbt_build() -> str: Run dbt build using bash command. This matches exactly what's in the capstone project. 
""" - return "cd dbt_project && dbt build" + return "dbt build" # Execute the task dbt_build() # Create the DAG instance -capstone_dbt_orchestration() +dbt_sf() diff --git a/airflow/dags/hello.py b/airflow/dags/hello.py deleted file mode 100644 index cf1ba92..0000000 --- a/airflow/dags/hello.py +++ /dev/null @@ -1,37 +0,0 @@ -""" -### Capstone dbt Orchestration DAG -A simple dbt orchestration DAG that matches the actual capstone implementation. -Uses @task.bash with dbt build command, following the capstone pattern. - -This DAG demonstrates: -- Simple dbt integration using bash commands -- Environment variable configuration from Airflow connections -- The recommended 1 task = 1 dbt pipeline approach -""" - -from airflow.decorators import dag, task -import pendulum - - -@dag( - dag_id="hello_world", - schedule="0 2 * * *", # Daily at 2 AM - start_date=pendulum.datetime(2024, 1, 1, tz="UTC"), - catchup=False, - tags=["capstone", "dbt", "orchestration"], - max_active_runs=1, - description="Hello World DAG", -) -@task -def hello_world(): - """ - ### Hello World DAG - - This DAG prints "Hello World" to the console. - """ - - print("Hello World") - - -# Create the DAG instance -hello_world() diff --git a/airflow/dags/orchestrate_load_and_dbt.py b/airflow/dags/orchestrate_load_and_dbt.py new file mode 100644 index 0000000..e89336e --- /dev/null +++ b/airflow/dags/orchestrate_load_and_dbt.py @@ -0,0 +1,64 @@ +""" +Master Orchestration DAG: Log Collection → dbt Transformation + +This DAG orchestrates the complete data pipeline: +1. Trigger and wait for collect_and_load_logs_dag to complete +2. Trigger dbt_sf transformation + +Uses TriggerDagRunOperator to chain DAG execution while keeping them modular. + +IMPORTANT: This is the ONLY DAG that should be scheduled. +- collect_and_load_logs_dag: schedule=None (triggered by this DAG) +- dbt_sf: schedule=None (triggered by this DAG) +""" + +import pendulum +from airflow.decorators import dag +from airflow.operators.trigger_dagrun import TriggerDagRunOperator + + +@dag( + dag_id="orchestrate_load_and_dbt", + schedule="0 0 * * *", # Daily at 2 AM UTC + start_date=pendulum.datetime(2024, 1, 1, tz="UTC"), + catchup=False, + tags=["orchestration", "master", "logs", "dbt"], + max_active_runs=1, + description="Master DAG: Orchestrates log collection followed by dbt transformation", +) +def orchestrate_load_and_dbt(): + """ + ### Master Pipeline Orchestration + + This DAG ensures that: + 1. Log collection and loading completes successfully + 2. dbt transformations run on the freshly loaded data + + The pipeline stops if log collection fails, preventing dbt from running on stale data. 
+ """ + + # Step 1: Trigger log collection DAG and wait for completion + trigger_log_collection = TriggerDagRunOperator( + task_id="trigger_log_collection", + trigger_dag_id="collect_and_load_logs_dag", + wait_for_completion=True, # Blocks until DAG completes + poke_interval=30, # Check every 30 seconds + reset_dag_run=True, # Reset if already running + ) + + # Step 2: Trigger dbt transformation DAG + trigger_dbt_transformation = TriggerDagRunOperator( + task_id="trigger_dbt_transformation", + trigger_dag_id="dbt_sf", + wait_for_completion=True, + poke_interval=30, + reset_dag_run=True, + ) + + # Define pipeline flow: Log Collection → dbt Transform + # No sensor needed - TriggerDagRunOperator already waits for completion + trigger_log_collection >> trigger_dbt_transformation + + +# Create DAG instance +orchestrate_load_and_dbt() diff --git a/airflow/requirements-airflow.txt b/airflow/requirements-airflow.txt deleted file mode 100644 index 99f2ef1..0000000 --- a/airflow/requirements-airflow.txt +++ /dev/null @@ -1,786 +0,0 @@ -# This file was autogenerated by uv via the following command: -# uv export --no-dev --format requirements-txt -agate==1.9.1 \ - --hash=sha256:1cf329510b3dde07c4ad1740b7587c9c679abc3dcd92bb1107eabc10c2e03c50 \ - --hash=sha256:bc60880c2ee59636a2a80cd8603d63f995be64526abf3cbba12f00767bcd5b3d - # via - # dbt-adapters - # dbt-common - # dbt-core - # dbt-snowflake -annotated-types==0.7.0 \ - --hash=sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53 \ - --hash=sha256:aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89 - # via pydantic -asn1crypto==1.5.1 \ - --hash=sha256:13ae38502be632115abf8a24cbe5f4da52e3b5231990aff31123c805306ccb9c \ - --hash=sha256:db4e40728b728508912cbb3d44f19ce188f218e9eba635821bb4b68564f8fd67 - # via snowflake-connector-python -attrs==25.3.0 \ - --hash=sha256:427318ce031701fea540783410126f03899a97ffc6f61596ad581ac2e40e3bc3 \ - --hash=sha256:75d7cefc7fb576747b2c81b4442d4d4a1ce0900973527c011d1030fd3bf4af1b - # via - # jsonschema - # referencing -babel==2.17.0 \ - --hash=sha256:0c54cffb19f690cdcc52a3b50bcbf71e07a808d1c80d549f2459b9d2cf0afb9d \ - --hash=sha256:4d0b53093fdfb4b21c92b5213dba5a1b23885afa8383709427046b21c366e5f2 - # via agate -boto3==1.40.21 \ - --hash=sha256:3772fb828864d3b7046c8bdf2f4860aaca4a79f25b7b060206c6a5f4944ea7f9 \ - --hash=sha256:876ccc0b25517b992bd27976282510773a11ebc771aa5b836a238ea426c82187 - # via snowflake-connector-python -botocore==1.40.21 \ - --hash=sha256:574ecf9b68c1721650024a27e00e0080b6f141c281ebfce49e0d302969270ef4 \ - --hash=sha256:f77e9c199df0252b14ea739a9ac99723940f6bde90f4c2e7802701553a62827b - # via - # boto3 - # s3transfer - # snowflake-connector-python -certifi==2025.1.31 \ - --hash=sha256:3d5da6925056f6f18f119200434a4780a94263f10d1c21d032a6f6b2baa20651 \ - --hash=sha256:ca78db4565a652026a4db2bcdf68f2fb589ea80d0be70e03929ed730746b84fe - # via - # dbt-snowflake - # requests - # snowflake-connector-python -cffi==1.17.1 \ - --hash=sha256:0984a4925a435b1da406122d4d7968dd861c1385afe3b45ba82b750f229811e2 \ - --hash=sha256:1257bdabf294dceb59f5e70c64a3e2f462c30c7ad68092d01bbbfb1c16b1ba36 \ - --hash=sha256:1c39c6016c32bc48dd54561950ebd6836e1670f2ae46128f67cf49e789c52824 \ - --hash=sha256:386c8bf53c502fff58903061338ce4f4950cbdcb23e2902d86c0f722b786bbe3 \ - --hash=sha256:3edc8d958eb099c634dace3c7e16560ae474aa3803a5df240542b305d14e14ed \ - --hash=sha256:4ceb10419a9adf4460ea14cfd6bc43d08701f0835e979bf821052f1805850fe8 \ - 
--hash=sha256:51392eae71afec0d0c8fb1a53b204dbb3bcabcb3c9b807eedf3e1e6ccf2de903 \ - --hash=sha256:706510fe141c86a69c8ddc029c7910003a17353970cff3b904ff0686a5927683 \ - --hash=sha256:72e72408cad3d5419375fc87d289076ee319835bdfa2caad331e377589aebba9 \ - --hash=sha256:733e99bc2df47476e3848417c5a4540522f234dfd4ef3ab7fafdf555b082ec0c \ - --hash=sha256:805b4371bf7197c329fcb3ead37e710d1bca9da5d583f5073b799d5c5bd1eee4 \ - --hash=sha256:a08d7e755f8ed21095a310a693525137cfe756ce62d066e53f502a83dc550f65 \ - --hash=sha256:b62ce867176a75d03a665bad002af8e6d54644fad99a3c70905c543130e39d93 \ - --hash=sha256:c59d6e989d07460165cc5ad3c61f9fd8f1b4796eacbd81cee78957842b834af4 \ - --hash=sha256:d01b12eeeb4427d3110de311e1774046ad344f5b1a7403101878976ecd7a10f3 \ - --hash=sha256:d63afe322132c194cf832bfec0dc69a99fb9bb6bbd550f161a49e9e855cc78ff \ - --hash=sha256:da95af8214998d77a98cc14e3a3bd00aa191526343078b530ceb0bd710fb48a5 \ - --hash=sha256:dd398dbc6773384a17fe0d3e7eeb8d1a21c2200473ee6806bb5e6a8e62bb73dd \ - --hash=sha256:de55b766c7aa2e2a3092c51e0483d700341182f08e67c63630d5b6f200bb28e5 \ - --hash=sha256:e03eab0a8677fa80d646b5ddece1cbeaf556c313dcfac435ba11f107ba117b5d \ - --hash=sha256:f3a2b4222ce6b60e2e8b337bb9596923045681d71e5a082783484d845390938e \ - --hash=sha256:f6a16c31041f09ead72d69f583767292f750d24913dadacf5756b966aacb3f1a \ - --hash=sha256:f79fc4fc25f1c8698ff97788206bb3c2598949bfe0fef03d299eb1b5356ada99 - # via - # cryptography - # snowflake-connector-python -charset-normalizer==3.4.3 \ - --hash=sha256:027b776c26d38b7f15b26a5da1044f376455fb3766df8fc38563b4efbc515154 \ - --hash=sha256:0cacf8f7297b0c4fcb74227692ca46b4a5852f8f4f24b3c766dd94a1075c4884 \ - --hash=sha256:14c2a87c65b351109f6abfc424cab3927b3bdece6f706e4d12faaf3d52ee5efe \ - --hash=sha256:1606f4a55c0fd363d754049cdf400175ee96c992b1f8018b993941f221221c5f \ - --hash=sha256:18343b2d246dc6761a249ba1fb13f9ee9a2bcd95decc767319506056ea4ad4dc \ - --hash=sha256:18b97b8404387b96cdbd30ad660f6407799126d26a39ca65729162fd810a99aa \ - --hash=sha256:1bb60174149316da1c35fa5233681f7c0f9f514509b8e399ab70fea5f17e45c9 \ - --hash=sha256:2001a39612b241dae17b4687898843f254f8748b796a2e16f1051a17078d991d \ - --hash=sha256:30a96e1e1f865f78b030d65241c1ee850cdf422d869e9028e2fc1d5e4db73b92 \ - --hash=sha256:30d006f98569de3459c2fc1f2acde170b7b2bd265dc1943e87e1a4efe1b67c31 \ - --hash=sha256:320e8e66157cc4e247d9ddca8e21f427efc7a04bbd0ac8a9faf56583fa543f9f \ - --hash=sha256:3cd35b7e8aedeb9e34c41385fda4f73ba609e561faedfae0a9e75e44ac558a15 \ - --hash=sha256:3cfb2aad70f2c6debfbcb717f23b7eb55febc0bb23dcffc0f076009da10c6392 \ - --hash=sha256:416175faf02e4b0810f1f38bcb54682878a4af94059a1cd63b8747244420801f \ - --hash=sha256:41d1fc408ff5fdfb910200ec0e74abc40387bccb3252f3f27c0676731df2b2c8 \ - --hash=sha256:42e5088973e56e31e4fa58eb6bd709e42fc03799c11c42929592889a2e54c491 \ - --hash=sha256:53cd68b185d98dde4ad8990e56a58dea83a4162161b1ea9272e5c9182ce415e0 \ - --hash=sha256:6aab0f181c486f973bc7262a97f5aca3ee7e1437011ef0c2ec04b5a11d16c927 \ - --hash=sha256:6fb70de56f1859a3f71261cbe41005f56a7842cc348d3aeb26237560bfa5e0ce \ - --hash=sha256:6fce4b8500244f6fcb71465d4a4930d132ba9ab8e71a7859e6a5d59851068d14 \ - --hash=sha256:73dc19b562516fc9bcf6e5d6e596df0b4eb98d87e4f79f3ae71840e6ed21361c \ - --hash=sha256:86df271bf921c2ee3818f0522e9a5b8092ca2ad8b065ece5d7d9d0e9f4849bcc \ - --hash=sha256:8dcfc373f888e4fb39a7bc57e93e3b845e7f462dacc008d9749568b1c4ece096 \ - --hash=sha256:b89bc04de1d83006373429975f8ef9e7932534b8cc9ca582e4db7d20d91816db \ - 
--hash=sha256:bd28b817ea8c70215401f657edef3a8aa83c29d447fb0b622c35403780ba11d5 \ - --hash=sha256:c6dbd0ccdda3a2ba7c2ecd9d77b37f3b5831687d8dc1b6ca5f56a4880cc7b7ce \ - --hash=sha256:c6fd51128a41297f5409deab284fecbe5305ebd7e5a1f959bee1c054622b7018 \ - --hash=sha256:cc34f233c9e71701040d772aa7490318673aa7164a0efe3172b2981218c26d93 \ - --hash=sha256:ccf600859c183d70eb47e05a44cd80a4ce77394d1ac0f79dbd2dd90a69a3a049 \ - --hash=sha256:ce571ab16d890d23b5c278547ba694193a45011ff86a9162a71307ed9f86759a \ - --hash=sha256:cf1ebb7d78e1ad8ec2a8c4732c7be2e736f6e5123a4146c5b89c9d1f585f8cef \ - --hash=sha256:d716a916938e03231e86e43782ca7878fb602a125a91e7acb8b5112e2e96ac16 \ - --hash=sha256:e28e334d3ff134e88989d90ba04b47d84382a828c061d0d1027b1b12a62b39b1 \ - --hash=sha256:fb6fecfd65564f208cbf0fba07f107fb661bcd1a7c389edbced3f7a493f70e37 \ - --hash=sha256:fdabf8315679312cfa71302f9bd509ded4f2f263fb5b765cf1433b39106c3cc9 - # via - # requests - # snowflake-connector-python -click==8.2.1 \ - --hash=sha256:27c491cc05d968d271d5a1db13e3b5a184636d9d930f148c50b038f0d0646202 \ - --hash=sha256:61a3265b914e850b85317d0b3109c7f8cd35a670f963866005d6ef1d5175a12b - # via - # dbt-core - # dbt-semantic-interfaces -colorama==0.4.6 \ - --hash=sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44 \ - --hash=sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6 - # via - # click - # dbt-common -cryptography==45.0.7 \ - --hash=sha256:16ede8a4f7929b4b7ff3642eba2bf79aa1d71f24ab6ee443935c0d269b6bc513 \ - --hash=sha256:18fcf70f243fe07252dcb1b268a687f2358025ce32f9f88028ca5c364b123ef5 \ - --hash=sha256:1993a1bb7e4eccfb922b6cd414f072e08ff5816702a0bdb8941c247a6b1b287c \ - --hash=sha256:3808e6b2e5f0b46d981c24d79648e5c25c35e59902ea4391a0dcb3e667bf7443 \ - --hash=sha256:3994c809c17fc570c2af12c9b840d7cea85a9fd3e5c0e0491f4fa3c029216d59 \ - --hash=sha256:3be4f21c6245930688bd9e162829480de027f8bf962ede33d4f8ba7d67a00cee \ - --hash=sha256:465ccac9d70115cd4de7186e60cfe989de73f7bb23e8a7aa45af18f7412e75bf \ - --hash=sha256:48c41a44ef8b8c2e80ca4527ee81daa4c527df3ecbc9423c41a420a9559d0e27 \ - --hash=sha256:4b1654dfc64ea479c242508eb8c724044f1e964a47d1d1cacc5132292d851971 \ - --hash=sha256:4bd3e5c4b9682bc112d634f2c6ccc6736ed3635fc3319ac2bb11d768cc5a00d8 \ - --hash=sha256:577470e39e60a6cd7780793202e63536026d9b8641de011ed9d8174da9ca5339 \ - --hash=sha256:67285f8a611b0ebc0857ced2081e30302909f571a46bfa7a3cc0ad303fe015c6 \ - --hash=sha256:7285a89df4900ed3bfaad5679b1e668cb4b38a8de1ccbfc84b05f34512da0a90 \ - --hash=sha256:81823935e2f8d476707e85a78a405953a03ef7b7b4f55f93f7c2d9680e5e0691 \ - --hash=sha256:8978132287a9d3ad6b54fcd1e08548033cc09dc6aacacb6c004c73c3eb5d3ac3 \ - --hash=sha256:a24ee598d10befaec178efdff6054bc4d7e883f615bfbcd08126a0f4931c83a6 \ - --hash=sha256:b04f85ac3a90c227b6e5890acb0edbaf3140938dbecf07bff618bf3638578cf1 \ - --hash=sha256:b6a0e535baec27b528cb07a119f321ac024592388c5681a5ced167ae98e9fff3 \ - --hash=sha256:bef32a5e327bd8e5af915d3416ffefdbe65ed975b646b3805be81b23580b57b8 \ - --hash=sha256:bfb4c801f65dd61cedfc61a83732327fafbac55a47282e6f26f073ca7a41c3b2 \ - --hash=sha256:ce7a453385e4c4693985b4a4a3533e041558851eae061a58a5405363b098fcd3 \ - --hash=sha256:dad43797959a74103cb59c5dac71409f9c27d34c8a05921341fb64ea8ccb1dd4 \ - --hash=sha256:dd342f085542f6eb894ca00ef70236ea46070c8a13824c6bde0dfdcd36065b9b \ - --hash=sha256:f3df7b3d0f91b88b2106031fd995802a2e9ae13e02c36c1fc075b43f420f3a17 \ - --hash=sha256:fa26fa54c0a9384c27fcdc905a2fb7d60ac6e47d14bc2692145f2b3b1e2cfdbd - # via - # pyopenssl - # secretstorage - # 
snowflake-connector-python -daff==1.4.2 \ - --hash=sha256:47f0391eda7e2b5011f7ccac006b9178accb465bcb94a2c9f284257fff5d2686 \ - --hash=sha256:88981a21d065e4378b5c4bd40b975dbfdea9b7ff540071f3bb5e20cc8b3590b5 - # via dbt-core -dbt-adapters==1.16.5 \ - --hash=sha256:3803c60bcf16bc1cbfdec1b20cee290072d861edbe93b97b3e92a935c57e21d3 \ - --hash=sha256:f9128745a0592e0047b37acc99deb12a7aa250f8780969e998b09f7219c03d8c - # via - # dbt-core - # dbt-snowflake -dbt-common==1.28.0 \ - --hash=sha256:1c533bdc4327cc4890e033822102fd2319c98448a6de5f5a03c7bdd7d50f7ba0 \ - --hash=sha256:31f42029cccfdfe072d36abe649a479bad20b86ef711fe32de72411b9acd45c9 - # via - # dbt-adapters - # dbt-core - # dbt-snowflake -dbt-core==1.10.10 \ - --hash=sha256:973fd5b7c3b3744fcda323f393f67728f4b09fda50ddd215d257c9c4c9be648e \ - --hash=sha256:d6003bf56e4cdee4088e6d51f50328976590e82b5e45d142a3bc26dec3ad2e4b - # via dbt-snowflake -dbt-extractor==0.6.0 \ - --hash=sha256:05bcfab7ebd70296ceb31742e8333ba66a2c939de44e61a7088bebafa939aaf6 \ - --hash=sha256:080fd1edf123926ed97929c65a75874d0fea687ccd5d3ebbc9e81b339f099604 \ - --hash=sha256:1b9ed7b15df983a735f87773f6765db8458680c02fcebbf89df4e238503c0e08 \ - --hash=sha256:311f0d3a4994751c541a4fa303d205727ba90e90c85286c03d3d9284e2bf0bd4 \ - --hash=sha256:369dcc3499f160256756585783f1308868076d5a65d0a051348d22da8b90e67d \ - --hash=sha256:4b6b1e70dde78cb904ca7a8958c2c803e77779b6ce108f4ea7ac479f5700db89 \ - --hash=sha256:71b3f8897138cc6698d313b9a3d0450fd021937ff5463269ee18ed415541781b \ - --hash=sha256:868af715a6328d7317ce6e4db238f850f660fef13fb36b7ab4cf9163ed5f54ff \ - --hash=sha256:a5cb810edc60c0486f78cc29739ebda70c81b10a1686861e78addc9f91fcd7de \ - --hash=sha256:a79a570fdcb672505ac2bdc12360a2a7aec622ef604d8c607225854ff862518c \ - --hash=sha256:aecfa43f7e6f139e76d47e4e1d7b189655ae19a8cf697686230bacb89a94ae74 \ - --hash=sha256:af451633390ac19669d3bde6c79822e657d32f5d903b3388bb00d56333fd52d5 \ - --hash=sha256:c1fd2b083a75e80b13e9874dc9699bfdfddf3baa9b6a8dea48de06d51a082733 \ - --hash=sha256:caeaba8d8c813f8e32d586c12615c0c7d6b99bee4f1be845312e80ef731de164 \ - --hash=sha256:d6cf08ec793b8bc2bd6e260ef818230ae68a4f71436fa489f08d7db1a52e2ffe \ - --hash=sha256:dcf14ed245de8df269815ff4c4f555fa72d2621f4fff37c023b8c99d0e421b4f - # via dbt-core -dbt-protos==1.0.348 \ - --hash=sha256:5d89171b64058170b39a80fc69abbd344170ac637c5dee90e7b3cbd7b1a43fe4 \ - --hash=sha256:d8834c685c323b1533c1c7fdb5db5636cffab2f1a58515caba54b3dc13a3578b - # via - # dbt-adapters - # dbt-common - # dbt-core -dbt-semantic-interfaces==0.9.0 \ - --hash=sha256:1b54c06ba89190a47a7f0563360930a0cce869e55b484ca09d261ade0e319155 \ - --hash=sha256:5c921257dce8bb51c9ffb5479f2bdd959e16ebfb98ee833de6daa70788c47271 - # via dbt-core -dbt-snowflake==1.10.0 \ - --hash=sha256:6391fb0139bc6e7b65e18685e65f436624611eddaadbf5da202a7e3d1f72c08c \ - --hash=sha256:faa2eaa0155d2d104f058524886064c2a4d3dafdd876dc41471b8ab4119b172b - # via fa-dae2-lab-m03w02 -deepdiff==7.0.1 \ - --hash=sha256:260c16f052d4badbf60351b4f77e8390bee03a0b516246f6839bc813fb429ddf \ - --hash=sha256:447760081918216aa4fd4ca78a4b6a848b81307b2ea94c810255334b759e1dc3 - # via dbt-common -filelock==3.19.1 \ - --hash=sha256:66eda1888b0171c998b35be2bcc0f6d75c388a7ce20c3f3f37aa8e96c2dddf58 \ - --hash=sha256:d38e30481def20772f5baf097c122c3babc4fcdb7e14e57049eb9d88c6dc017d - # via snowflake-connector-python -idna==3.10 \ - --hash=sha256:12f65c9b470abda6dc35cf8e63cc574b1c52b11df2c86030af0ac09b01b13ea9 \ - --hash=sha256:946d195a0d259cbba61165e88e65941f16e9b36ea6ddb97f00452bae8b1287d3 - # via - # 
requests - # snowflake-connector-python -importlib-metadata==8.7.0 \ - --hash=sha256:d13b81ad223b890aa16c5471f2ac3056cf76c5f10f82d6f9292f0b415f389000 \ - --hash=sha256:e5dd1551894c77868a30651cef00984d50e1002d06942a7101d34870c5f02afd - # via dbt-semantic-interfaces -isodate==0.6.1 \ - --hash=sha256:0751eece944162659049d35f4f549ed815792b38793f07cf73381c1c87cbed96 \ - --hash=sha256:48c5881de7e8b0a0d648cb024c8062dc84e7b840ed81e864c7614fd3c127bde9 - # via - # agate - # dbt-common -jaraco-classes==3.4.0 \ - --hash=sha256:47a024b51d0239c0dd8c8540c6c7f484be3b8fcf0b2d85c13825780d3b3f3acd \ - --hash=sha256:f662826b6bed8cace05e7ff873ce0f9283b5c924470fe664fff1c2f00f581790 - # via keyring -jaraco-context==6.0.1 \ - --hash=sha256:9bae4ea555cf0b14938dc0aee7c9f32ed303aa20a3b73e7dc80111628792d1b3 \ - --hash=sha256:f797fc481b490edb305122c9181830a3a5b76d84ef6d1aef2fb9b47ab956f9e4 - # via keyring -jaraco-functools==4.3.0 \ - --hash=sha256:227ff8ed6f7b8f62c56deff101545fa7543cf2c8e7b82a7c2116e672f29c26e8 \ - --hash=sha256:cfd13ad0dd2c47a3600b439ef72d8615d482cedcff1632930d6f28924d92f294 - # via keyring -jeepney==0.9.0 ; sys_platform == 'linux' \ - --hash=sha256:97e5714520c16fc0a45695e5365a2e11b81ea79bba796e26f9f1d178cb182683 \ - --hash=sha256:cf0e9e845622b81e4a28df94c40345400256ec608d0e55bb8a3feaa9163f5732 - # via - # keyring - # secretstorage -jinja2==3.1.6 \ - --hash=sha256:0137fb05990d35f1275a587e9aee6d56da821fc83491a0fb838183be43f66d6d \ - --hash=sha256:85ece4451f492d0c13c5dd7c13a64681a86afae63a5f347908daf103ce6d2f67 - # via - # dbt-common - # dbt-core - # dbt-semantic-interfaces -jmespath==1.0.1 \ - --hash=sha256:02e2e4cc71b5bcab88332eebf907519190dd9e6e82107fa7f83b1003a6252980 \ - --hash=sha256:90261b206d6defd58fdd5e85f478bf633a2901798906be2ad389150c5c60edbe - # via - # boto3 - # botocore -jsonschema==4.25.1 \ - --hash=sha256:3fba0169e345c7175110351d456342c364814cfcf3b964ba4587f22915230a63 \ - --hash=sha256:e4a9655ce0da0c0b67a085847e00a3a51449e1157f4f75e9fb5aa545e122eb85 - # via - # dbt-common - # dbt-core - # dbt-semantic-interfaces -jsonschema-specifications==2025.4.1 \ - --hash=sha256:4653bffbd6584f7de83a67e0d620ef16900b390ddc7939d56684d6c81e33f1af \ - --hash=sha256:630159c9f4dbea161a6a2205c3011cc4f18ff381b189fff48bb39b9bf26ae608 - # via jsonschema -kafka-python==2.2.15 \ - --hash=sha256:84c0993cd4f7f2f01e92d8104ea9bdf631aff72fc5e6ea62eb3bdf1d56528fc3 \ - --hash=sha256:e0f480a45f3814cb0eb705b8b4f61069e1be61dae0d8c69d0f1f2da33eea1bd5 - # via fa-dae2-lab-m03w02 -keyring==25.6.0 \ - --hash=sha256:0b39998aa941431eb3d9b0d4b2460bc773b9df6fed7621c2dfb291a7e0187a66 \ - --hash=sha256:552a3f7af126ece7ed5c89753650eec89c7eaae8617d0aa4d9ad2b75111266bd - # via snowflake-connector-python -leather==0.4.0 \ - --hash=sha256:18290bc93749ae39039af5e31e871fcfad74d26c4c3ea28ea4f681f4571b3a2b \ - --hash=sha256:f964bec2086f3153a6c16e707f20cb718f811f57af116075f4c0f4805c608b95 - # via agate -markupsafe==3.0.2 \ - --hash=sha256:0f4ca02bea9a23221c0182836703cbf8930c5e9454bacce27e767509fa286a30 \ - --hash=sha256:131a3c7689c85f5ad20f9f6fb1b866f402c445b220c19fe4308c0b147ccd2ad9 \ - --hash=sha256:15ab75ef81add55874e7ab7055e9c397312385bd9ced94920f2802310c930396 \ - --hash=sha256:1c99d261bd2d5f6b59325c92c73df481e05e57f19837bdca8413b9eac4bd8028 \ - --hash=sha256:2181e67807fc2fa785d0592dc2d6206c019b9502410671cc905d132a92866557 \ - --hash=sha256:3d79d162e7be8f996986c064d1c7c817f6df3a77fe3d6859f6f9e7be4b8c213a \ - --hash=sha256:444dcda765c8a838eaae23112db52f1efaf750daddb2d9ca300bcae1039adc5c \ - 
--hash=sha256:4aa4e5faecf353ed117801a068ebab7b7e09ffb6e1d5e412dc852e0da018126c \ - --hash=sha256:52305740fe773d09cffb16f8ed0427942901f00adedac82ec8b67752f58a1b22 \ - --hash=sha256:569511d3b58c8791ab4c2e1285575265991e6d8f8700c7be0e88f86cb0672094 \ - --hash=sha256:6381026f158fdb7c72a168278597a5e3a5222e83ea18f543112b2662a9b699c5 \ - --hash=sha256:846ade7b71e3536c4e56b386c2a47adf5741d2d8b94ec9dc3e92e5e1ee1e2225 \ - --hash=sha256:88416bd1e65dcea10bc7569faacb2c20ce071dd1f87539ca2ab364bf6231393c \ - --hash=sha256:8e06879fc22a25ca47312fbe7c8264eb0b662f6db27cb2d3bbbc74b1df4b9b87 \ - --hash=sha256:9778bd8ab0a994ebf6f84c2b949e65736d5575320a17ae8984a77fab08db94cf \ - --hash=sha256:a904af0a6162c73e3edcb969eeeb53a63ceeb5d8cf642fade7d39e7963a22ddb \ - --hash=sha256:ad10d3ded218f1039f11a75f8091880239651b52e9bb592ca27de44eed242a48 \ - --hash=sha256:b5a6b3ada725cea8a5e634536b1b01c30bcdcd7f9c6fff4151548d5bf6b3a36c \ - --hash=sha256:ba8062ed2cf21c07a9e295d5b8a2a5ce678b913b45fdf68c32d95d6c1291e0b6 \ - --hash=sha256:ba9527cdd4c926ed0760bc301f6728ef34d841f405abf9d4f959c478421e4efd \ - --hash=sha256:bcf3e58998965654fdaff38e58584d8937aa3096ab5354d493c77d1fdd66d7a1 \ - --hash=sha256:c0ef13eaeee5b615fb07c9a7dadb38eac06a0608b41570d8ade51c56539e509d \ - --hash=sha256:cabc348d87e913db6ab4aa100f01b08f481097838bdddf7c7a84b7575b7309ca \ - --hash=sha256:cdb82a876c47801bb54a690c5ae105a46b392ac6099881cdfb9f6e95e4014c6a \ - --hash=sha256:d16a81a06776313e817c951135cf7340a3e91e8c1ff2fac444cfd75fffa04afe \ - --hash=sha256:e17c96c14e19278594aa4841ec148115f9c7615a47382ecb6b82bd8fea3ab0c8 \ - --hash=sha256:e444a31f8db13eb18ada366ab3cf45fd4b31e4db1236a4448f68778c1d1a5a2f \ - --hash=sha256:e6a2a455bd412959b57a172ce6328d2dd1f01cb2135efda2e4576e8a23fa3b0f \ - --hash=sha256:ee55d3edf80167e48ea11a923c7386f4669df67d7994554387f84e7d8b0a2bf0 \ - --hash=sha256:f3818cb119498c0678015754eba762e0d61e5b52d34c8b13d770f0719f7b1d79 \ - --hash=sha256:f8b3d067f2e40fe93e1ccdd6b2e1d16c43140e76f02fb1319a05cf2b79d99430 - # via jinja2 -mashumaro==3.14 \ - --hash=sha256:5ef6f2b963892cbe9a4ceb3441dfbea37f8c3412523f25d42e9b3a7186555f1d \ - --hash=sha256:c12a649599a8f7b1a0b35d18f12e678423c3066189f7bc7bd8dd431c5c8132c3 - # via - # dbt-adapters - # dbt-common - # dbt-core -more-itertools==10.7.0 \ - --hash=sha256:9fddd5403be01a94b204faadcff459ec3568cf110265d3c54323e1e866ad29d3 \ - --hash=sha256:d43980384673cb07d2f7d2d918c616b30c659c089ee23953f601d6609c67510e - # via - # dbt-semantic-interfaces - # jaraco-classes - # jaraco-functools -msgpack==1.1.1 \ - --hash=sha256:196a736f0526a03653d829d7d4c5500a97eea3648aebfd4b6743875f28aa2af8 \ - --hash=sha256:33be9ab121df9b6b461ff91baac6f2731f83d9b27ed948c5b9d1978ae28bf157 \ - --hash=sha256:3765afa6bd4832fc11c3749be4ba4b69a0e8d7b728f78e68120a157a4c5d41f0 \ - --hash=sha256:4147151acabb9caed4e474c3344181e91ff7a388b888f1e19ea04f7e73dc7ad5 \ - --hash=sha256:4df2311b0ce24f06ba253fda361f938dfecd7b961576f9be3f3fbd60e87130ac \ - --hash=sha256:4fd6b577e4541676e0cc9ddc1709d25014d3ad9a66caa19962c4f5de30fc09ef \ - --hash=sha256:500e85823a27d6d9bba1d057c871b4210c1dd6fb01fbb764e37e4e8847376323 \ - --hash=sha256:5692095123007180dca3e788bb4c399cc26626da51629a31d40207cb262e67f4 \ - --hash=sha256:6d489fba546295983abd142812bda76b57e33d0b9f5d5b71c09a583285506f69 \ - --hash=sha256:6f64ae8fe7ffba251fecb8408540c34ee9df1c26674c50c4544d72dbf792e5ce \ - --hash=sha256:77b79ce34a2bdab2594f490c8e80dd62a02d650b91a75159a63ec413b8d104cd \ - --hash=sha256:870b9a626280c86cff9c576ec0d9cbcc54a1e5ebda9cd26dab12baf41fee218c \ - 
--hash=sha256:8ddb2bcfd1a8b9e431c8d6f4f7db0773084e107730ecf3472f1dfe9ad583f3d9 \ - --hash=sha256:9d592d06e3cc2f537ceeeb23d38799c6ad83255289bb84c2e5792e5a8dea268a \ - --hash=sha256:a494554874691720ba5891c9b0b39474ba43ffb1aaf32a5dac874effb1619e1a \ - --hash=sha256:ae497b11f4c21558d95de9f64fff7053544f4d1a17731c866143ed6bb4591238 \ - --hash=sha256:b1ce7f41670c5a69e1389420436f41385b1aa2504c3b0c30620764b15dded2e7 \ - --hash=sha256:bb29aaa613c0a1c40d1af111abf025f1732cab333f96f285d6a93b934738a68a \ - --hash=sha256:cb643284ab0ed26f6957d969fe0dd8bb17beb567beb8998140b5e38a90974f6c \ - --hash=sha256:d275a9e3c81b1093c060c3837e580c37f47c51eca031f7b5fb76f7b8470f5f9b \ - --hash=sha256:e4141c5a32b5e37905b5940aacbc59739f036930367d7acce7a64e4dec1f5e0b - # via mashumaro -networkx==3.5 \ - --hash=sha256:0030d386a9a06dee3565298b4a734b68589749a544acbb6c412dc9e2489ec6ec \ - --hash=sha256:d4c6f9cf81f52d69230866796b82afbccdec3db7ae4fbd1b65ea750feed50037 - # via dbt-core -ordered-set==4.1.0 \ - --hash=sha256:046e1132c71fcf3330438a539928932caf51ddbc582496833e23de611de14562 \ - --hash=sha256:694a8e44c87657c59292ede72891eb91d34131f6531463aab3009191c77364a8 - # via deepdiff -packaging==25.0 \ - --hash=sha256:29572ef2b1f17581046b3a2227d5c611fb25ec70ca1ba8554b24b0e69331a484 \ - --hash=sha256:d443872c98d677bf60f6a1f2f8c1cb748e8fe762d2bf9d3148b5599295b0fc4f - # via - # dbt-core - # snowflake-connector-python -parsedatetime==2.6 \ - --hash=sha256:4cb368fbb18a0b7231f4d76119165451c8d2e35951455dfee97c62a87b04d455 \ - --hash=sha256:cb96edd7016872f58479e35879294258c71437195760746faffedb692aef000b - # via agate -pathspec==0.12.1 \ - --hash=sha256:a0d503e138a4c123b27490a4f7beda6a01c6f288df0e4a8b79c7eb0dc7b4cc08 \ - --hash=sha256:a482d51503a1ab33b1c67a6c3813a26953dbdc71c31dacaef9a838c4e29f5712 - # via - # dbt-common - # dbt-core -pendulum==3.1.0 \ - --hash=sha256:006758e2125da2e624493324dfd5d7d1b02b0c44bc39358e18bf0f66d0767f5f \ - --hash=sha256:1ce26a608e1f7387cd393fba2a129507c4900958d4f47b90757ec17656856571 \ - --hash=sha256:20f74aa8029a42e327bfc150472e0e4d2358fa5d795f70460160ba81b94b6945 \ - --hash=sha256:2404a6a54c80252ea393291f0b7f35525a61abae3d795407f34e118a8f133a18 \ - --hash=sha256:28658b0baf4b30eb31d096a375983cfed033e60c0a7bbe94fa23f06cd779b50b \ - --hash=sha256:350cabb23bf1aec7c7694b915d3030bff53a2ad4aeabc8c8c0d807c8194113d6 \ - --hash=sha256:42959341e843077c41d47420f28c3631de054abd64da83f9b956519b5c7a06a7 \ - --hash=sha256:4dfd53e7583ccae138be86d6c0a0b324c7547df2afcec1876943c4d481cf9608 \ - --hash=sha256:5553ac27be05e997ec26d7f004cf72788f4ce11fe60bb80dda604a64055b29d0 \ - --hash=sha256:66f96303560f41d097bee7d2dc98ffca716fbb3a832c4b3062034c2d45865015 \ - --hash=sha256:6a6e06a28f3a7d696546347805536f6f38be458cb79de4f80754430696bea9e6 \ - --hash=sha256:7378084fe54faab4ee481897a00b710876f2e901ded6221671e827a253e643f2 \ - --hash=sha256:7e68d6a51880708084afd8958af42dc8c5e819a70a6c6ae903b1c4bfc61e0f25 \ - --hash=sha256:8539db7ae2c8da430ac2515079e288948c8ebf7eb1edd3e8281b5cdf433040d6 \ - --hash=sha256:94751c52f6b7c306734d1044c2c6067a474237e1e5afa2f665d1fbcbbbcf24b3 \ - --hash=sha256:9e3f1e5da39a7ea7119efda1dd96b529748c1566f8a983412d0908455d606942 \ - --hash=sha256:b114dcb99ce511cb8f5495c7b6f0056b2c3dba444ef1ea6e48030d7371bd531a \ - --hash=sha256:cf6229e5ee70c2660148523f46c472e677654d0097bec010d6730f08312a4931 \ - --hash=sha256:d06999790d9ee9962a1627e469f98568bf7ad1085553fa3c30ed08b3944a14d7 \ - --hash=sha256:e9af1e5eeddb4ebbe1b1c9afb9fd8077d73416ade42dd61264b3f3b87742e0bb \ - 
--hash=sha256:f8dee234ca6142bf0514368d01a72945a44685aaa2fc4c14c98d09da9437b620 \ - --hash=sha256:f9178c2a8e291758ade1e8dd6371b1d26d08371b4c7730a6e9a3ef8b16ebae0f - # via fa-dae2-lab-m03w02 -platformdirs==4.4.0 \ - --hash=sha256:abd01743f24e5287cd7a5db3752faf1a2d65353f38ec26d98e25a6db65958c85 \ - --hash=sha256:ca753cf4d81dc309bc67b0ea38fd15dc97bc30ce419a7f58d13eb3bf14c4febf - # via snowflake-connector-python -protobuf==6.31.1 ; python_full_version < '3.14' \ - --hash=sha256:426f59d2964864a1a366254fa703b8632dcec0790d8862d30034d8245e1cd447 \ - --hash=sha256:4ee898bf66f7a8b0bd21bce523814e6fbd8c6add948045ce958b73af7e8878c6 \ - --hash=sha256:6f1227473dc43d44ed644425268eb7c2e488ae245d51c6866d19fe158e207402 \ - --hash=sha256:720a6c7e6b77288b85063569baae8536671b39f15cc22037ec7045658d80489e \ - --hash=sha256:7fa17d5a29c2e04b7d90e5e32388b8bfd0e7107cd8e616feef7ed3fa6bdab5c9 \ - --hash=sha256:a40fc12b84c154884d7d4c4ebd675d5b3b5283e155f324049ae396b95ddebc39 \ - --hash=sha256:d8cac4c982f0b957a4dc73a80e2ea24fab08e679c0de9deb835f4a12d69aca9a - # via - # dbt-adapters - # dbt-common - # dbt-core - # dbt-protos -protobuf==6.32.0 ; python_full_version >= '3.14' \ - --hash=sha256:501fe6372fd1c8ea2a30b4d9be8f87955a64d6be9c88a973996cef5ef6f0abf1 \ - --hash=sha256:75a2aab2bd1aeb1f5dc7c5f33bcb11d82ea8c055c9becbb41c26a8c43fd7092c \ - --hash=sha256:84f9e3c1ff6fb0308dbacb0950d8aa90694b0d0ee68e75719cb044b7078fe741 \ - --hash=sha256:a81439049127067fc49ec1d36e25c6ee1d1a2b7be930675f919258d03c04e7d2 \ - --hash=sha256:a8bdbb2f009cfc22a36d031f22a625a38b615b5e19e558a7b756b3279723e68e \ - --hash=sha256:ba377e5b67b908c8f3072a57b63e2c6a4cbd18aea4ed98d2584350dbf46f2783 \ - --hash=sha256:d52691e5bee6c860fff9a1c86ad26a13afbeb4b168cd4445c922b7e2cf85aaf0 - # via - # dbt-adapters - # dbt-common - # dbt-core - # dbt-protos -psycopg==3.2.9 \ - --hash=sha256:01a8dadccdaac2123c916208c96e06631641c0566b22005493f09663c7a8d3b6 \ - --hash=sha256:2fbb46fcd17bc81f993f28c47f1ebea38d66ae97cc2dbc3cad73b37cefbff700 - # via fa-dae2-lab-m03w02 -psycopg-binary==3.2.9 ; implementation_name != 'pypy' \ - --hash=sha256:08bf9d5eabba160dd4f6ad247cf12f229cc19d2458511cab2eb9647f42fa6795 \ - --hash=sha256:0e8aeefebe752f46e3c4b769e53f1d4ad71208fe1150975ef7662c22cca80fab \ - --hash=sha256:14f64d1ac6942ff089fc7e926440f7a5ced062e2ed0949d7d2d680dc5c00e2d4 \ - --hash=sha256:1b2cf018168cad87580e67bdde38ff5e51511112f1ce6ce9a8336871f465c19a \ - --hash=sha256:2290bc146a1b6a9730350f695e8b670e1d1feb8446597bed0bbe7c3c30e0abcb \ - --hash=sha256:25ab464bfba8c401f5536d5aa95f0ca1dd8257b5202eede04019b4415f491351 \ - --hash=sha256:52e239cd66c4158e412318fbe028cd94b0ef21b0707f56dcb4bdc250ee58fd40 \ - --hash=sha256:5be8292d07a3ab828dc95b5ee6b69ca0a5b2e579a577b39671f4f5b47116dfd2 \ - --hash=sha256:61d0a6ceed8f08c75a395bc28cb648a81cf8dee75ba4650093ad1a24a51c8724 \ - --hash=sha256:6a76b4722a529390683c0304501f238b365a46b1e5fb6b7249dbc0ad6fea51a0 \ - --hash=sha256:72691a1615ebb42da8b636c5ca9f2b71f266be9e172f66209a361c175b7842c5 \ - --hash=sha256:76eddaf7fef1d0994e3d536ad48aa75034663d3a07f6f7e3e601105ae73aeff6 \ - --hash=sha256:778588ca9897b6c6bab39b0d3034efff4c5438f5e3bd52fda3914175498202f9 \ - --hash=sha256:7a838852e5afb6b4126f93eb409516a8c02a49b788f4df8b6469a40c2157fa21 \ - --hash=sha256:7fc2915949e5c1ea27a851f7a472a7da7d0a40d679f0a31e42f1022f3c562e87 \ - --hash=sha256:96a551e4683f1c307cfc3d9a05fec62c00a7264f320c9962a67a543e3ce0d8ff \ - --hash=sha256:98bbe35b5ad24a782c7bf267596638d78aa0e87abc7837bdac5b2a2ab954179e \ - 
--hash=sha256:a1fa38a4687b14f517f049477178093c39c2a10fdcced21116f47c017516498f \ - --hash=sha256:ad280bbd409bf598683dda82232f5215cfc5f2b1bf0854e409b4d0c44a113b1d \ - --hash=sha256:b7e4e4dd177a8665c9ce86bc9caae2ab3aa9360b7ce7ec01827ea1baea9ff748 \ - --hash=sha256:be7d650a434921a6b1ebe3fff324dbc2364393eb29d7672e638ce3e21076974e \ - --hash=sha256:f0d5b3af045a187aedbd7ed5fc513bd933a97aaff78e61c3745b330792c4345b - # via psycopg -pycparser==2.22 \ - --hash=sha256:491c8be9c040f5390f5bf44a5b07752bd07f56edf992381b05c701439eec10f6 \ - --hash=sha256:c3702b6d3dd8c7abc1afa565d7e63d53a1d0bd86cdc24edd75470f4de499cfcc - # via cffi -pydantic==2.11.7 \ - --hash=sha256:d989c3c6cb79469287b1569f7447a17848c998458d49ebe294e975b9baf0f0db \ - --hash=sha256:dde5df002701f6de26248661f6835bbe296a47bf73990135c7d07ce741b9623b - # via - # dbt-core - # dbt-semantic-interfaces -pydantic-core==2.33.2 \ - --hash=sha256:04a1a413977ab517154eebb2d326da71638271477d6ad87a769102f7c2488c56 \ - --hash=sha256:0a9f2c9dd19656823cb8250b0724ee9c60a82f3cdf68a080979d13092a3b0fef \ - --hash=sha256:0fb2d542b4d66f9470e8065c5469ec676978d625a8b7a363f07d9a501a9cb36a \ - --hash=sha256:1082dd3e2d7109ad8b7da48e1d4710c8d06c253cbc4a27c1cff4fbcaa97a9e3f \ - --hash=sha256:1ea40a64d23faa25e62a70ad163571c0b342b8bf66d5fa612ac0dec4f069d916 \ - --hash=sha256:2b0a451c263b01acebe51895bfb0e1cc842a5c666efe06cdf13846c7418caa9a \ - --hash=sha256:3c6db6e52c6d70aa0d00d45cdb9b40f0433b96380071ea80b09277dba021ddf7 \ - --hash=sha256:4e61206137cbc65e6d5256e1166f88331d3b6238e082d9f74613b9b765fb9025 \ - --hash=sha256:52fb90784e0a242bb96ec53f42196a17278855b0f31ac7c3cc6f5c1ec4811849 \ - --hash=sha256:572c7e6c8bb4774d2ac88929e3d1f12bc45714ae5ee6d9a788a9fb35e60bb04b \ - --hash=sha256:5c92edd15cd58b3c2d34873597a1e20f13094f59cf88068adb18947df5455b4e \ - --hash=sha256:5f483cfb75ff703095c59e365360cb73e00185e01aaea067cd19acffd2ab20ea \ - --hash=sha256:61c18fba8e5e9db3ab908620af374db0ac1baa69f0f32df4f61ae23f15e586ac \ - --hash=sha256:65132b7b4a1c0beded5e057324b7e16e10910c106d43675d9bd87d4f38dde162 \ - --hash=sha256:7cb8bc3605c29176e1b105350d2e6474142d7c1bd1d9327c4a9bdb46bf827acc \ - --hash=sha256:8f57a69461af2a5fa6e6bbd7a5f60d3b7e6cebb687f55106933188e79ad155c1 \ - --hash=sha256:95237e53bb015f67b63c91af7518a62a8660376a6a0db19b89acc77a4d6199f5 \ - --hash=sha256:96081f1605125ba0855dfda83f6f3df5ec90c61195421ba72223de35ccfb2f88 \ - --hash=sha256:9cb1da0f5a471435a7bc7e439b8a728e8b61e59784b2af70d7c169f8dd8ae290 \ - --hash=sha256:9fdac5d6ffa1b5a83bca06ffe7583f5576555e6c8b3a91fbd25ea7780f825f7d \ - --hash=sha256:a7ec89dc587667f22b6a0b6579c249fca9026ce7c333fc142ba42411fa243cdc \ - --hash=sha256:c083a3bdd5a93dfe480f1125926afcdbf2917ae714bdb80b36d34318b2bec5d9 \ - --hash=sha256:c2fc0a768ef76c15ab9238afa6da7f69895bb5d1ee83aeea2e3509af4472d0b9 \ - --hash=sha256:c52b02ad8b4e2cf14ca7b3d918f3eb0ee91e63b3167c32591e57c4317e134f8f \ - --hash=sha256:c8e7af2f4e0194c22b5b37205bfb293d166a7344a5b0d0eaccebc376546d77d5 \ - --hash=sha256:cca3868ddfaccfbc4bfb1d608e2ccaaebe0ae628e1416aeb9c4d88c001bb45ab \ - --hash=sha256:db4b41f9bd95fbe5acd76d89920336ba96f03e149097365afe1cb092fceb89a1 \ - --hash=sha256:e80b087132752f6b3d714f041ccf74403799d3b23a72722ea2e6ba2e892555b9 \ - --hash=sha256:eb8c529b2819c37140eb51b914153063d27ed88e3bdc31b71198a198e921e011 \ - --hash=sha256:f517ca031dfc037a9c07e748cefd8d96235088b83b4f4ba8939105d20fa1dcd6 \ - --hash=sha256:f941635f2a3d96b2973e867144fde513665c87f13fe0e193c158ac51bfaaa7b2 \ - --hash=sha256:fa854f5cf7e33842a892e5c73f45327760bc7bc516339fda888c75ae60edaeb6 - # via pydantic 
-pyjwt==2.10.1 \ - --hash=sha256:3cc5772eb20009233caf06e9d8a0577824723b44e6648ee0a2aedb6cf9381953 \ - --hash=sha256:dcdd193e30abefd5debf142f9adfcdd2b58004e644f25406ffaebd50bd98dacb - # via snowflake-connector-python -pyopenssl==25.1.0 \ - --hash=sha256:2b11f239acc47ac2e5aca04fd7fa829800aeee22a2eb30d744572a157bd8a1ab \ - --hash=sha256:8d031884482e0c67ee92bf9a4d8cceb08d92aba7136432ffb0703c5280fc205b - # via snowflake-connector-python -python-dateutil==2.9.0.post0 \ - --hash=sha256:37dd54208da7e1cd875388217d5e00ebd4179249f90fb72437e91a35459a0ad3 \ - --hash=sha256:a8b2bc7bffae282281c8140a97d3aa9c14da0b136dfe83f850eea9a5f7470427 - # via - # botocore - # dbt-common - # dbt-semantic-interfaces - # pendulum -python-slugify==8.0.4 \ - --hash=sha256:276540b79961052b66b7d116620b36518847f52d5fd9e3a70164fc8c50faa6b8 \ - --hash=sha256:59202371d1d05b54a9e7720c5e038f928f45daaffe41dd10822f3907b937c856 - # via agate -pytimeparse==1.1.8 \ - --hash=sha256:04b7be6cc8bd9f5647a6325444926c3ac34ee6bc7e69da4367ba282f076036bd \ - --hash=sha256:e86136477be924d7e670646a98561957e8ca7308d44841e21f5ddea757556a0a - # via agate -pytz==2025.2 \ - --hash=sha256:360b9e3dbb49a209c21ad61809c7fb453643e048b38924c765813546746e81c3 \ - --hash=sha256:5ddf76296dd8c44c26eb8f4b6f35488f3ccbf6fbbd7adee0b7262d43f0ec2f00 - # via - # dbt-adapters - # dbt-core - # snowflake-connector-python -pywin32-ctypes==0.2.3 ; sys_platform == 'win32' \ - --hash=sha256:8a1513379d709975552d202d942d9837758905c8d01eb82b8bcc30918929e7b8 \ - --hash=sha256:d162dc04946d704503b2edc4d55f3dba5c1d539ead017afa00142c38b9885755 - # via keyring -pyyaml==6.0.2 ; python_full_version < '3.14' \ - --hash=sha256:0833f8694549e586547b576dcfaba4a6b55b9e96098b36cdc7ebefe667dfed48 \ - --hash=sha256:0ffe8360bab4910ef1b9e87fb812d8bc0a308b0d0eef8c8f44e0254ab3b07133 \ - --hash=sha256:17e311b6c678207928d649faa7cb0d7b4c26a0ba73d41e99c4fff6b6c3276484 \ - --hash=sha256:1f71ea527786de97d1a0cc0eacd1defc0985dcf6b3f17bb77dcfc8c34bec4dc5 \ - --hash=sha256:41e4e3953a79407c794916fa277a82531dd93aad34e29c2a514c2c0c5fe971cc \ - --hash=sha256:50187695423ffe49e2deacb8cd10510bc361faac997de9efef88badc3bb9e2d1 \ - --hash=sha256:68ccc6023a3400877818152ad9a1033e3db8625d899c72eacb5a668902e4d652 \ - --hash=sha256:70b189594dbe54f75ab3a1acec5f1e3faa7e8cf2f1e08d9b561cb41b845f69d5 \ - --hash=sha256:7e7401d0de89a9a855c839bc697c079a4af81cf878373abd7dc625847d25cbd8 \ - --hash=sha256:80bab7bfc629882493af4aa31a4cfa43a4c57c83813253626916b8c7ada83476 \ - --hash=sha256:8388ee1976c416731879ac16da0aff3f63b286ffdd57cdeb95f3f2e085687563 \ - --hash=sha256:8b9c7197f7cb2738065c481a0461e50ad02f18c78cd75775628afb4d7137fb3b \ - --hash=sha256:9b22676e8097e9e22e36d6b7bda33190d0d400f345f23d4065d48f4ca7ae0425 \ - --hash=sha256:bc2fa7c6b47d6bc618dd7fb02ef6fdedb1090ec036abab80d4681424b84c1183 \ - --hash=sha256:c70c95198c015b85feafc136515252a261a84561b7b1d51e3384e0655ddf25ab \ - --hash=sha256:ce826d6ef20b1bc864f0a68340c8b3287705cae2f8b4b1d932177dcc76721725 \ - --hash=sha256:d584d9ec91ad65861cc08d42e834324ef890a082e591037abe114850ff7bbc3e \ - --hash=sha256:ef6107725bd54b262d6dedcc2af448a266975032bc85ef0172c5f059da6325b4 \ - --hash=sha256:efdca5630322a10774e8e98e1af481aad470dd62c3170801852d752aa7a783ba - # via - # dbt-core - # dbt-semantic-interfaces -pyyaml==6.0.3 ; python_full_version >= '3.14' \ - --hash=sha256:00c4bdeba853cc34e7dd471f16b4114f4162dc03e6b7afcc2128711f0eca823c \ - --hash=sha256:02893d100e99e03eda1c8fd5c441d8c60103fd175728e23e431db1b589cf5ab3 \ - 
--hash=sha256:0f29edc409a6392443abf94b9cf89ce99889a1dd5376d94316ae5145dfedd5d6 \ - --hash=sha256:16249ee61e95f858e83976573de0f5b2893b3677ba71c9dd36b9cf8be9ac6d65 \ - --hash=sha256:2283a07e2c21a2aa78d9c4442724ec1eb15f5e42a723b99cb3d822d48f5f7ad1 \ - --hash=sha256:34d5fcd24b8445fadc33f9cf348c1047101756fd760b4dacb5c3e99755703310 \ - --hash=sha256:41715c910c881bc081f1e8872880d3c650acf13dfa8214bad49ed4cede7c34ea \ - --hash=sha256:4a2e8cebe2ff6ab7d1050ecd59c25d4c8bd7e6f400f5f82b96557ac0abafd0ac \ - --hash=sha256:4ad1906908f2f5ae4e5a8ddfce73c320c2a1429ec52eafd27138b7f1cbe341c9 \ - --hash=sha256:501a031947e3a9025ed4405a168e6ef5ae3126c59f90ce0cd6f2bfc477be31b7 \ - --hash=sha256:5190d403f121660ce8d1d2c1bb2ef1bd05b5f68533fc5c2ea899bd15f4399b35 \ - --hash=sha256:5498cd1645aa724a7c71c8f378eb29ebe23da2fc0d7a08071d89469bf1d2defb \ - --hash=sha256:5fcd34e47f6e0b794d17de1b4ff496c00986e1c83f7ab2fb8fcfe9616ff7477b \ - --hash=sha256:5fdec68f91a0c6739b380c83b951e2c72ac0197ace422360e6d5a959d8d97b2c \ - --hash=sha256:64386e5e707d03a7e172c0701abfb7e10f0fb753ee1d773128192742712a98fd \ - --hash=sha256:66e1674c3ef6f541c35191caae2d429b967b99e02040f5ba928632d9a7f0f065 \ - --hash=sha256:6adc77889b628398debc7b65c073bcb99c4a0237b248cacaf3fe8a557563ef6c \ - --hash=sha256:79005a0d97d5ddabfeeea4cf676af11e647e41d81c9a7722a193022accdb6b7c \ - --hash=sha256:7c6610def4f163542a622a73fb39f534f8c101d690126992300bf3207eab9764 \ - --hash=sha256:7f047e29dcae44602496db43be01ad42fc6f1cc0d8cd6c83d342306c32270196 \ - --hash=sha256:8d1fab6bb153a416f9aeb4b8763bc0f22a5586065f86f7664fc23339fc1c1fac \ - --hash=sha256:8da9669d359f02c0b91ccc01cac4a67f16afec0dac22c2ad09f46bee0697eba8 \ - --hash=sha256:8dc52c23056b9ddd46818a57b78404882310fb473d63f17b07d5c40421e47f8e \ - --hash=sha256:9149cad251584d5fb4981be1ecde53a1ca46c891a79788c0df828d2f166bda28 \ - --hash=sha256:93dda82c9c22deb0a405ea4dc5f2d0cda384168e466364dec6255b293923b2f3 \ - --hash=sha256:96b533f0e99f6579b3d4d4995707cf36df9100d67e0c8303a0c55b27b5f99bc5 \ - --hash=sha256:a33284e20b78bd4a18c8c2282d549d10bc8408a2a7ff57653c0cf0b9be0afce5 \ - --hash=sha256:a80cb027f6b349846a3bf6d73b5e95e782175e52f22108cfa17876aaeff93702 \ - --hash=sha256:b3bc83488de33889877a0f2543ade9f70c67d66d9ebb4ac959502e12de895788 \ - --hash=sha256:ba1cc08a7ccde2d2ec775841541641e4548226580ab850948cbfda66a1befcdc \ - --hash=sha256:c1ff362665ae507275af2853520967820d9124984e0f7466736aea23d8611fba \ - --hash=sha256:c458b6d084f9b935061bc36216e8a69a7e293a2f1e68bf956dcd9e6cbcd143f5 \ - --hash=sha256:d0eae10f8159e8fdad514efdc92d74fd8d682c933a6dd088030f3834bc8e6b26 \ - --hash=sha256:d76623373421df22fb4cf8817020cbb7ef15c725b9d5e45f17e189bfc384190f \ - --hash=sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b \ - --hash=sha256:eda16858a3cab07b80edaf74336ece1f986ba330fdb8ee0d6c0d68fe82bc96be \ - --hash=sha256:ee2922902c45ae8ccada2c5b501ab86c36525b883eff4255313a253a3160861c \ - --hash=sha256:f7057c9a337546edc7973c0d3ba84ddcdf0daa14533c2065749c9075001090e6 \ - --hash=sha256:fc09d0aa354569bc501d4e787133afc08552722d3ab34836a80547331bb5d4a0 - # via - # dbt-core - # dbt-semantic-interfaces -referencing==0.36.2 \ - --hash=sha256:df2e89862cd09deabbdba16944cc3f10feb6b3e6f18e902f7cc25609a34775aa \ - --hash=sha256:e8699adbbf8b5c7de96d8ffa0eb5c158b3beafce084968e2ea8bb08c6794dcd0 - # via - # jsonschema - # jsonschema-specifications -requests==2.32.5 \ - --hash=sha256:2462f94637a34fd532264295e186976db0f5d453d1cdd31473c85a6a161affb6 \ - --hash=sha256:dbba0bac56e100853db0ea71b82b4dfd5fe2bf6d3754a8893c3af500cec7d7cf - # via - # 
dbt-common - # dbt-core - # snowflake-connector-python - # snowplow-tracker -rpds-py==0.27.1 \ - --hash=sha256:08f1e20bccf73b08d12d804d6e1c22ca5530e71659e6673bce31a6bb71c1e73f \ - --hash=sha256:0b08d152555acf1f455154d498ca855618c1378ec810646fcd7c76416ac6dc60 \ - --hash=sha256:0dc5dceeaefcc96dc192e3a80bbe1d6c410c469e97bdd47494a7d930987f18b2 \ - --hash=sha256:12ed005216a51b1d6e2b02a7bd31885fe317e45897de81d86dcce7d74618ffff \ - --hash=sha256:134fae0e36022edad8290a6661edf40c023562964efea0cc0ec7f5d392d2aaef \ - --hash=sha256:13e608ac9f50a0ed4faec0e90ece76ae33b34c0e8656e3dceb9a7db994c692cd \ - --hash=sha256:1441811a96eadca93c517d08df75de45e5ffe68aa3089924f963c782c4b898cf \ - --hash=sha256:15d3b4d83582d10c601f481eca29c3f138d44c92187d197aff663a269197c02d \ - --hash=sha256:16323f674c089b0360674a4abd28d5042947d54ba620f72514d69be4ff64845e \ - --hash=sha256:1b207d881a9aef7ba753d69c123a35d96ca7cb808056998f6b9e8747321f03b8 \ - --hash=sha256:2643400120f55c8a96f7c9d858f7be0c88d383cd4653ae2cf0d0c88f668073e5 \ - --hash=sha256:26a1c73171d10b7acccbded82bf6a586ab8203601e565badc74bbbf8bc5a10f8 \ - --hash=sha256:2efe4eb1d01b7f5f1939f4ef30ecea6c6b3521eec451fb93191bf84b2a522418 \ - --hash=sha256:3020724ade63fe320a972e2ffd93b5623227e684315adce194941167fee02688 \ - --hash=sha256:33aa65b97826a0e885ef6e278fbd934e98cdcfed80b63946025f01e2f5b29502 \ - --hash=sha256:3ce0cac322b0d69b63c9cdb895ee1b65805ec9ffad37639f291dd79467bee675 \ - --hash=sha256:41e532bbdcb57c92ba3be62c42e9f096431b4cf478da9bc3bc6ce5c38ab7ba7a \ - --hash=sha256:42a89282d711711d0a62d6f57d81aa43a1368686c45bc1c46b7f079d55692734 \ - --hash=sha256:466bfe65bd932da36ff279ddd92de56b042f2266d752719beb97b08526268ec5 \ - --hash=sha256:47162fdab9407ec3f160805ac3e154df042e577dd53341745fc7fb3f625e6d92 \ - --hash=sha256:4ed2e16abbc982a169d30d1a420274a709949e2cbdef119fe2ec9d870b42f274 \ - --hash=sha256:4f75e4bd8ab8db624e02c8e2fc4063021b58becdbe6df793a8111d9343aec1e3 \ - --hash=sha256:55266dafa22e672f5a4f65019015f90336ed31c6383bd53f5e7826d21a0e0b83 \ - --hash=sha256:62f85b665cedab1a503747617393573995dac4600ff51869d69ad2f39eb5e817 \ - --hash=sha256:639fd5efec029f99b79ae47e5d7e00ad8a773da899b6309f6786ecaf22948c48 \ - --hash=sha256:6567d2bb951e21232c2f660c24cf3470bb96de56cdcb3f071a83feeaff8a2772 \ - --hash=sha256:67ce7620704745881a3d4b0ada80ab4d99df390838839921f99e63c474f82cf2 \ - --hash=sha256:6f5b7bd8e219ed50299e58551a410b64daafb5017d54bbe822e003856f06a802 \ - --hash=sha256:7ba32c16b064267b22f1850a34051121d423b6f7338a12b9459550eb2096e7ec \ - --hash=sha256:7ee6521b9baf06085f62ba9c7a3e5becffbc32480d2f1b351559c001c38ce4c1 \ - --hash=sha256:80c60cfb5310677bd67cb1e85a1e8eb52e12529545441b43e6f14d90b878775a \ - --hash=sha256:819064fa048ba01b6dadc5116f3ac48610435ac9a0058bbde98e569f9e785c39 \ - --hash=sha256:84f7d509870098de0e864cad0102711c1e24e9b1a50ee713b65928adb22269e4 \ - --hash=sha256:8ee50c3e41739886606388ba3ab3ee2aae9f35fb23f833091833255a31740797 \ - --hash=sha256:92022bbbad0d4426e616815b16bc4127f83c9a74940e1ccf3cfe0b387aba0228 \ - --hash=sha256:9a1f4814b65eacac94a00fc9a526e3fdafd78e439469644032032d0d63de4881 \ - --hash=sha256:9d992ac10eb86d9b6f369647b6a3f412fc0075cfd5d799530e84d335e440a002 \ - --hash=sha256:a512c8263249a9d68cac08b05dd59d2b3f2061d99b322813cbcc14c3c7421998 \ - --hash=sha256:a6e57b0abfe7cc513450fcf529eb486b6e4d3f8aee83e92eb5f1ef848218d456 \ - --hash=sha256:a75f305c9b013289121ec0f1181931975df78738cdf650093e6b86d74aa7d8dd \ - --hash=sha256:a9e960fc78fecd1100539f14132425e1d5fe44ecb9239f8f27f079962021523e \ - 
--hash=sha256:acb9aafccaae278f449d9c713b64a9e68662e7799dbd5859e2c6b3c67b56d334 \ - --hash=sha256:ae2775c1973e3c30316892737b91f9283f9908e3cc7625b9331271eaaed7dc90 \ - --hash=sha256:ae92443798a40a92dc5f0b01d8a7c93adde0c4dc965310a29ae7c64d72b9fad2 \ - --hash=sha256:b6dfb0e058adb12d8b1d1b25f686e94ffa65d9995a5157afe99743bf7369d62b \ - --hash=sha256:b7fb801aa7f845ddf601c49630deeeccde7ce10065561d92729bfe81bd21fb33 \ - --hash=sha256:ba81d2b56b6d4911ce735aad0a1d4495e808b8ee4dc58715998741a26874e7c2 \ - --hash=sha256:bf876e79763eecf3e7356f157540d6a093cef395b65514f17a356f62af6cc136 \ - --hash=sha256:c1476d6f29eb81aa4151c9a31219b03f1f798dc43d8af1250a870735516a1212 \ - --hash=sha256:c46c9dd2403b66a2a3b9720ec4b74d4ab49d4fabf9f03dfdce2d42af913fe8d0 \ - --hash=sha256:cf9931f14223de59551ab9d38ed18d92f14f055a5f78c1d8ad6493f735021bbb \ - --hash=sha256:d5fa0ee122dc09e23607a28e6d7b150da16c662e66409bbe85230e4c85bb528a \ - --hash=sha256:d76f9cc8665acdc0c9177043746775aa7babbf479b5520b78ae4002d889f5c21 \ - --hash=sha256:d78827d7ac08627ea2c8e02c9e5b41180ea5ea1f747e9db0915e3adf36b62dcf \ - --hash=sha256:d9199717881f13c32c4046a15f024971a3b78ad4ea029e8da6b86e5aa9cf4594 \ - --hash=sha256:dce51c828941973a5684d458214d3a36fcd28da3e1875d659388f4f9f12cc33e \ - --hash=sha256:dd2135527aa40f061350c3f8f89da2644de26cd73e4de458e79606384f4f68e7 \ - --hash=sha256:dfbfac137d2a3d0725758cd141f878bf4329ba25e34979797c89474a89a8a3a3 \ - --hash=sha256:e48af21883ded2b3e9eb48cb7880ad8598b31ab752ff3be6457001d78f416723 \ - --hash=sha256:e4b9fcfbc021633863a37e92571d6f91851fa656f0180246e84cbd8b3f6b329b \ - --hash=sha256:e5c20f33fd10485b80f65e800bbe5f6785af510b9f4056c5a3c612ebc83ba6cb \ - --hash=sha256:eb11a4f1b2b63337cfd3b4d110af778a59aae51c81d195768e353d8b52f88081 \ - --hash=sha256:ed090ccd235f6fa8bb5861684567f0a83e04f52dfc2e5c05f2e4b1309fcf85e7 \ - --hash=sha256:ed10dc32829e7d222b7d3b93136d25a406ba9788f6a7ebf6809092da1f4d279d \ - --hash=sha256:ee4308f409a40e50593c7e3bb8cbe0b4d4c66d1674a316324f0c2f5383b486f9 \ - --hash=sha256:f149826d742b406579466283769a8ea448eed82a789af0ed17b0cd5770433444 \ - --hash=sha256:f2729615f9d430af0ae6b36cf042cb55c0936408d543fb691e1a9e36648fd35a \ - --hash=sha256:f39f58a27cc6e59f432b568ed8429c7e1641324fbe38131de852cd77b2d534b0 \ - --hash=sha256:f9025faafc62ed0b75a53e541895ca272815bec18abe2249ff6501c8f2e12b83 \ - --hash=sha256:faf8d146f3d476abfee026c4ae3bdd9ca14236ae4e4c310cbd1cf75ba33d24a3 \ - --hash=sha256:fb89bec23fddc489e5d78b550a7b773557c9ab58b7946154a10a6f7a214a48b2 \ - --hash=sha256:fe0dd05afb46597b9a2e11c351e5e4283c741237e7f617ffb3252780cca9336a \ - --hash=sha256:fecc80cb2a90e28af8a9b366edacf33d7a91cbfe4c2c4544ea1246e949cfebeb \ - --hash=sha256:fed467af29776f6556250c9ed85ea5a4dd121ab56a5f8b206e3e7a4c551e48ec - # via - # jsonschema - # referencing -s3transfer==0.13.1 \ - --hash=sha256:a981aa7429be23fe6dfc13e80e4020057cbab622b08c0315288758d67cabc724 \ - --hash=sha256:c3fdba22ba1bd367922f27ec8032d6a1cf5f10c934fb5d68cf60fd5a23d936cf - # via boto3 -secretstorage==3.3.3 ; sys_platform == 'linux' \ - --hash=sha256:2403533ef369eca6d2ba81718576c5e0f564d5cca1b58f73a8b23e7d4eeebd77 \ - --hash=sha256:f356e6628222568e3af06f2eba8df495efa13b3b63081dafd4f7d9a7b7bc9f99 - # via keyring -six==1.17.0 \ - --hash=sha256:4721f391ed90541fddacab5acf947aa0d3dc7d27b2e1e8eda2be8970586c3274 \ - --hash=sha256:ff70335d468e7eb6ec65b95b99d3a2836546063f63acc5171de367e834932a81 - # via - # isodate - # python-dateutil -snowflake-connector-python==3.17.2 \ - --hash=sha256:0708de0f64e3c6789fdf0d75f22c55eb0215b60b125a5d7d18e48a1d5b2e7def \ - 
--hash=sha256:1691f5f7ff508b1fefc491f0fb85524165681bb29242f508731515057a1b4f9d \ - --hash=sha256:1a668e5e9ae04ec0ab0f7cdd450ef3757283f37856655ae7e971ee645a25b4b3 \ - --hash=sha256:360f21c576847636c3560a2607c5388f9e57e607a3558f5f5a188a71c82108e1 \ - --hash=sha256:4934a4f552876592696ab8def7e2d33163f99bb059bac4b90b5cfce165aa6499 \ - --hash=sha256:63a57cb67d14c7da6b91b8db0db3a92dfa5cf5082c388006d2f9f480c2df0234 \ - --hash=sha256:70f7cf3dcfcceea99d4b9d4fb8ef67788790832cbeceb222fbdf5595da52103b \ - --hash=sha256:9e05c9d55b234c8a903d96397b9eb4accbddf7d2fcfac2e27239f0f49d16f475 \ - --hash=sha256:c6f59c47e43bf889fd5a2ead8ffeacd447cd792d711f010c91b23b4804c67f6b \ - --hash=sha256:e21a68c5fb04a16f48fd5146ab7a72d17b2d3d09c9b537f64fffa99316abd670 \ - --hash=sha256:ed6a05c55238d076ad186e27889a72fe888f02e9c8c341127f828439cbf1e42f - # via - # dbt-snowflake - # fa-dae2-lab-m03w02 -snowplow-tracker==1.1.0 \ - --hash=sha256:24ea32ddac9cca547421bf9ab162f5f33c00711c6ef118ad5f78093cee962224 \ - --hash=sha256:95d8fdc8bd542fd12a0b9a076852239cbaf0599eda8721deaf5f93f7138fe755 - # via dbt-core -sortedcontainers==2.4.0 \ - --hash=sha256:25caa5a06cc30b6b83d11423433f65d1f9d76c4c6a0c90e3379eaa43b9bfdb88 \ - --hash=sha256:a163dcaede0f1c021485e957a39245190e74249897e2ae4b2aa38595db237ee0 - # via snowflake-connector-python -sqlparse==0.5.3 \ - --hash=sha256:09f67787f56a0b16ecdbde1bfc7f5d9c3371ca683cfeaa8e6ff60b4807ec9272 \ - --hash=sha256:cf2196ed3418f3ba5de6af7e82c694a9fbdbfecccdfc72e281548517081f16ca - # via dbt-core -text-unidecode==1.3 \ - --hash=sha256:1311f10e8b895935241623731c2ba64f4c455287888b18189350b67134a822e8 \ - --hash=sha256:bad6603bb14d279193107714b288be206cac565dfa49aa5b105294dd5c4aab93 - # via python-slugify -tomlkit==0.13.3 \ - --hash=sha256:430cf247ee57df2b94ee3fbe588e71d362a941ebb545dec29b53961d61add2a1 \ - --hash=sha256:c89c649d79ee40629a9fda55f8ace8c6a1b42deb912b2a8fd8d942ddadb606b0 - # via snowflake-connector-python -typing-extensions==4.15.0 \ - --hash=sha256:0cea48d173cc12fa28ecabc3b837ea3cf6f38c6d1136f85cbaaf598984861466 \ - --hash=sha256:f0fa19c6845758ab08074a0cfa8b7aecb71c999ca73d62883bc25cc018c4e548 - # via - # dbt-adapters - # dbt-common - # dbt-core - # dbt-semantic-interfaces - # mashumaro - # psycopg - # pydantic - # pydantic-core - # pyopenssl - # referencing - # snowflake-connector-python - # snowplow-tracker - # typing-inspection -typing-inspection==0.4.1 \ - --hash=sha256:389055682238f53b04f7badcb49b989835495a96700ced5dab2d8feae4b26f51 \ - --hash=sha256:6ae134cc0203c33377d43188d4064e9b357dba58cff3185f22924610e70a9d28 - # via pydantic -tzdata==2025.2 \ - --hash=sha256:1a403fada01ff9221ca8044d701868fa132215d84beb92242d9acd2147f667a8 \ - --hash=sha256:b60a638fcc0daffadf82fe0f57e53d06bdec2f36c4df66280ae79bce6bd6f2b9 - # via - # agate - # pendulum - # psycopg -urllib3==2.5.0 \ - --hash=sha256:3fc47733c7e419d4bc3f6b3dc2b4f890bb743906a30d56ba4a5bfa4bbff92760 \ - --hash=sha256:e6b01673c0fa6a13e374b50871808eb3bf7046c4b125b216f6bf1cc604cff0dc - # via - # botocore - # requests -zipp==3.23.0 \ - --hash=sha256:071652d6115ed432f5ce1d34c336c0adfd6a884660d1e9712a256d3d3bd4b14e \ - --hash=sha256:a07157588a12518c9d4034df3fbbee09c814741a33ff63c05fa29d26a2404166 - # via importlib-metadata \ No newline at end of file diff --git a/airflow/requirements.txt b/airflow/requirements.txt index 99f2ef1..627903f 100644 --- a/airflow/requirements.txt +++ b/airflow/requirements.txt @@ -214,6 +214,37 @@ filelock==3.19.1 \ --hash=sha256:66eda1888b0171c998b35be2bcc0f6d75c388a7ce20c3f3f37aa8e96c2dddf58 \ 
--hash=sha256:d38e30481def20772f5baf097c122c3babc4fcdb7e14e57049eb9d88c6dc017d # via snowflake-connector-python +hypersync==0.8.5 \ + --hash=sha256:19523bc2d7f1d49f60f8bceb9650de086dbcd0542c3d475783f61f5cd2963d29 \ + --hash=sha256:1a259acbbf84d3b6ee8b5700ae54335c15217b4e5191c494d292dea1ac8989ff \ + --hash=sha256:1f60eede2996000f751e187b18e15d5f59eca28a6590178da149181abd731c0f \ + --hash=sha256:317d9933425134bd6a07c0a3a41f168ff46b20cf9372624ea52bb93f7e9892da \ + --hash=sha256:342d27f6ca24fa17d93d3cc44f400f4765709bff186520c1b17450582329d45c \ + --hash=sha256:3a31b61135df6568f77bcce2302373d1cda186e8c3345aeebf863c854acdcc47 \ + --hash=sha256:3b00ac24e4091f378ad5afcc0f320b0790e60d3bdd604d6587aa89189b7c65d1 \ + --hash=sha256:473a7bc74abf03ea4e30934170e80882c81b68afcf8922f2622f0f340e2d6d42 \ + --hash=sha256:4813b1f33861582bd711f74ec66ad893e302ae24805584c2ff7b243a21fb1ebf \ + --hash=sha256:4f0f1ccca10bd87bbbed6d355a93106369f7d15f5c333799d2ccd9628e9ec27e \ + --hash=sha256:59cc577f00654a1dd8d1dc4a94ec32af8b84a9f707712e439a7df178c667cb4f \ + --hash=sha256:5d96826dd86244a9b6cbd9eab2e9a91ae82d02a2a0ca284e0a5462673fc1367f \ + --hash=sha256:6601b916a5769989b97d824acb5b0c114e5cc830574353c7ae63a4f0c31a91d6 \ + --hash=sha256:6e4b288500b519949a58da5b55334ab3da908b9b5e2fbf218ea43e0501c08c7a \ + --hash=sha256:7139491cf4ef7ba6a72a0ef8f8ed8674eb16f83067186092f5d50f3d28664b41 \ + --hash=sha256:716a7211e4287fe1155a336a22b14d5d41a4cb38b22a56fa57ab781e3fe7c251 \ + --hash=sha256:7673c6db771a8a1f9dbecc7b87218431ba7931719201a4cd23165e9610563be8 \ + --hash=sha256:80d9c4d9cd8b062092b7106fff36539986815c9f7b79debdb9cc8a8f2545a686 \ + --hash=sha256:828e50339244ec6cb5e61a6404df8757af139c8d000f1fc3a480ac171c313a68 \ + --hash=sha256:8826e2512ba20e525dde88ae1f67a9c9361dbe44eead4e9ded93b312c8ccdc01 \ + --hash=sha256:8e51e34b4f0d76c4a07a7c0ad0e54872a63c83de8bd7fde9157e8954429c7d21 \ + --hash=sha256:a28058f67cbbae69a4a554c1f1331d7ba4ac1716274e3e17787af4fb11e4c6fd \ + --hash=sha256:a339433a723b42e497149f894ffce923cf2f621c0e26cc70c88dbd7fd809759f \ + --hash=sha256:b53a12b4ac3aa573841107e74651692ea71e34f5a4f28e1581d8daf48f87568c \ + --hash=sha256:be25b58e6298b93d5675691d359e326929aa33a92d2849f73b5c0f281d1fa839 \ + --hash=sha256:c85cc6f570f10a23f99c857ec87758068e5739a9e249daa088ad799b0151a388 \ + --hash=sha256:f73ad0b8a4e505a9e4978f4117ccf0caa30dbaddcdb85cdebca395b9d4b597c9 \ + --hash=sha256:f8526e11f1b66f083e1ad42e71294f5092795f5132423d2c93a9559c211030b5 \ + --hash=sha256:fe39604525e8fb137584d99707effe06a3f547942c9396f08d1b1da6ba69a35e + # via stables-analytics idna==3.10 \ --hash=sha256:12f65c9b470abda6dc35cf8e63cc574b1c52b11df2c86030af0ac09b01b13ea9 \ --hash=sha256:946d195a0d259cbba61165e88e65941f16e9b36ea6ddb97f00452bae8b1287d3 @@ -406,6 +437,15 @@ platformdirs==4.4.0 \ --hash=sha256:abd01743f24e5287cd7a5db3752faf1a2d65353f38ec26d98e25a6db65958c85 \ --hash=sha256:ca753cf4d81dc309bc67b0ea38fd15dc97bc30ce419a7f58d13eb3bf14c4febf # via snowflake-connector-python +polars==1.33.1 \ + --hash=sha256:094a37d06789286649f654f229ec4efb9376630645ba8963b70cb9c0b008b3e1 \ + --hash=sha256:29200b89c9a461e6f06fc1660bc9c848407640ee30fe0e5ef4947cfd49d55337 \ + --hash=sha256:3881c444b0f14778ba94232f077a709d435977879c1b7d7bd566b55bd1830bb5 \ + --hash=sha256:444940646e76342abaa47f126c70e3e40b56e8e02a9e89e5c5d1c24b086db58a \ + --hash=sha256:c3cfddb3b78eae01a218222bdba8048529fef7e14889a71e33a5198644427642 \ + --hash=sha256:c9781c704432a2276a185ee25898aa427f39a904fbe8fde4ae779596cdbd7a9e \ + 
--hash=sha256:fa3fdc34eab52a71498264d6ff9b0aa6955eb4b0ae8add5d3cb43e4b84644007 + # via stables-analytics protobuf==6.31.1 ; python_full_version < '3.14' \ --hash=sha256:426f59d2964864a1a366254fa703b8632dcec0790d8862d30034d8245e1cd447 \ --hash=sha256:4ee898bf66f7a8b0bd21bce523814e6fbd8c6add948045ce958b73af7e8878c6 \ @@ -730,6 +770,10 @@ snowplow-tracker==1.1.0 \ --hash=sha256:24ea32ddac9cca547421bf9ab162f5f33c00711c6ef118ad5f78093cee962224 \ --hash=sha256:95d8fdc8bd542fd12a0b9a076852239cbaf0599eda8721deaf5f93f7138fe755 # via dbt-core +strenum==0.4.15 \ + --hash=sha256:878fb5ab705442070e4dd1929bb5e2249511c0bcf2b0eeacf3bcd80875c82eff \ + --hash=sha256:a30cda4af7cc6b5bf52c8055bc4bf4b2b6b14a93b574626da33df53cf7740659 + # via hypersync sortedcontainers==2.4.0 \ --hash=sha256:25caa5a06cc30b6b83d11423433f65d1f9d76c4c6a0c90e3379eaa43b9bfdb88 \ --hash=sha256:a163dcaede0f1c021485e957a39245190e74249897e2ae4b2aa38595db237ee0 @@ -783,4 +827,5 @@ urllib3==2.5.0 \ zipp==3.23.0 \ --hash=sha256:071652d6115ed432f5ce1d34c336c0adfd6a884660d1e9712a256d3d3bd4b14e \ --hash=sha256:a07157588a12518c9d4034df3fbbee09c814741a33ff63c05fa29d26a2404166 - # via importlib-metadata \ No newline at end of file + # via importlib-metadata + diff --git a/dbt_project/models/01_staging/models.yml b/dbt_project/models/01_staging/models.yml index 1c335b3..b61e1ac 100644 --- a/dbt_project/models/01_staging/models.yml +++ b/dbt_project/models/01_staging/models.yml @@ -34,13 +34,6 @@ models: description: "Second indexed parameter (hex string with 0x prefix)" - name: event_data description: "Non-indexed event parameters (hex string with 0x prefix)" - - name: _dlt_load_id - description: "DLT load identifier" - - name: _dlt_id - description: "DLT unique row identifier" - tests: - - unique - - not_null - name: stg_event_signature description: "Staging model for event signature lookup table" @@ -54,10 +47,3 @@ models: description: "Human-readable event name (e.g., Transfer, Approval)" - name: event_signature description: "Full event signature (e.g., Transfer(address,address,uint256))" - - name: _dlt_load_id - description: "DLT load identifier" - - name: _dlt_id - description: "DLT unique row identifier" - tests: - - unique - - not_null diff --git a/dbt_project/models/01_staging/sources.yml b/dbt_project/models/01_staging/sources.yml index c847d43..c8fb8e6 100644 --- a/dbt_project/models/01_staging/sources.yml +++ b/dbt_project/models/01_staging/sources.yml @@ -56,11 +56,4 @@ sources: - name: event_name description: "Human-readable event name (e.g., Transfer, Approval)" - name: signature - description: "Full event signature (e.g., Transfer(address,address,uint256))" - - name: _dlt_load_id - description: "DLT load identifier" - - name: _dlt_id - description: "DLT unique row identifier" - tests: - - unique - - not_null + description: "Full event signature (e.g., Transfer(address,address,uint256))" \ No newline at end of file diff --git a/dbt_project/models/02_intermediate/models.yml b/dbt_project/models/02_intermediate/models.yml index e5f4cc6..8168447 100644 --- a/dbt_project/models/02_intermediate/models.yml +++ b/dbt_project/models/02_intermediate/models.yml @@ -48,8 +48,6 @@ models: - not_null - name: amount_raw description: "Raw transfer amount as integer (before decimal adjustment)" - tests: - - not_null - name: amount description: "Transfer amount converted to decimal (assuming 18 decimals standard ERC20)" - name: event_name diff --git a/dbt_project/models/03_mart/models.yml b/dbt_project/models/03_mart/models.yml index c49952b..b656deb 
100644 --- a/dbt_project/models/03_mart/models.yml +++ b/dbt_project/models/03_mart/models.yml @@ -56,12 +56,8 @@ models: - not_null - name: amount_raw description: "Raw minted amount as integer (before decimal adjustment)" - tests: - - not_null - name: minted_amount description: "Minted amount adjusted for decimals (assumes 18 decimals)" - tests: - - not_null - name: event_name description: "Event name (always 'Transfer')" - name: transaction_index @@ -102,12 +98,8 @@ models: - not_null - name: amount_raw description: "Raw burned amount as integer (before decimal adjustment)" - tests: - - not_null - name: burned_amount description: "Burned amount adjusted for decimals (assumes 18 decimals)" - tests: - - not_null - name: event_name description: "Event name (always 'Transfer')" - name: transaction_index diff --git a/dbt_project/profiles.yml b/dbt_project/profiles.yml index cbd57e9..c8d9f77 100644 --- a/dbt_project/profiles.yml +++ b/dbt_project/profiles.yml @@ -14,7 +14,7 @@ stables_analytics: type: snowflake account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}" user: "{{ env_var('SNOWFLAKE_USER') }}" - private_key_path: "{{ env_var('SNOWFLAKE_PRIVATE_KEY_FILE') }}" + private_key: "{{ env_var('SNOWFLAKE_PRIVATE_KEY_FILE') }}" role: "{{ env_var('SNOWFLAKE_ROLE') }}" database: "{{ env_var('SNOWFLAKE_DATABASE') }}" warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}" diff --git a/docker-compose.airflow.yml b/docker-compose.airflow.yml index 0e5f2ba..241f4da 100644 --- a/docker-compose.airflow.yml +++ b/docker-compose.airflow.yml @@ -15,8 +15,8 @@ x-airflow-common: &airflow-common AIRFLOW__CORE__EXECUTION_API_SERVER_URL: "http://airflow-apiserver:8282/execution/" AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: "true" # Timezone Configuration - AIRFLOW__CORE__DEFAULT_TIMEZONE: Asia/Bangkok - TZ: Asia/Bangkok + AIRFLOW__CORE__DEFAULT_TIMEZONE: UTC + TZ: UTC _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-} AIRFLOW_CONFIG: "/opt/airflow/config/airflow.cfg" volumes: @@ -27,6 +27,10 @@ x-airflow-common: &airflow-common - ./airflow/requirements.txt:/requirements.txt - ./.secret:/opt/airflow/.secret - ./dbt_project:/opt/airflow/dbt_project + - ./scripts:/opt/airflow/scripts + - ./.env:/opt/airflow/.env + - ./.data:/opt/airflow/.data + - ./.ssh:/opt/airflow/.ssh user: "${AIRFLOW_UID:-50000}:0" networks: - lab_kafka_network diff --git a/scripts/kafka/consumer.py b/scripts/kafka/consumer.py index ef8db11..88ea3f5 100644 --- a/scripts/kafka/consumer.py +++ b/scripts/kafka/consumer.py @@ -16,6 +16,24 @@ from scripts.utils.logger import logger +def transfer_alert(transfer: TransferData) -> None: + """ + Dummy alert function to print transfer details. + + TODO: Replace with meaningful alert logic (e.g., threshold checks, notifications). 
+ """ + logger.info( + f"🚨 TRANSFER ALERT 🚨\n" + f" ID: {transfer.id}\n" + f" Block: {transfer.block_number}\n" + f" Timestamp: {transfer.timestamp}\n" + f" Contract: {transfer.contract_address}\n" + f" From: {transfer.from_address}\n" + f" To: {transfer.to_address}\n" + f" Value: {int(transfer.value) / 10 ** 18}" + ) + + def main() -> None: """Main consumer loop - consumes from Kafka and loads to PostgreSQL.""" load_dotenv() @@ -39,7 +57,7 @@ def main() -> None: logger.info( f"Consumer config: group_id={consumer_config['group.id']}, " f"auto_commit={consumer_config['enable.auto.commit']}, " - f"batch_size={kafka_config.batch_size}" + f"mode=immediate (no batching)" ) if kafka_config.max_records: logger.info(f"Max records limit: {kafka_config.max_records}") @@ -50,35 +68,22 @@ def main() -> None: rows_written = 0 messages_processed = 0 - batch: list[TransferData] = [] - - def flush_batch() -> int: - """Flush current batch to database and return number of rows written.""" - nonlocal batch - if not batch: - return 0 - - inserted = batch_insert_transfers(dsn, batch) - batch.clear() - return inserted try: while True: # Check max records limit - if kafka_config.max_records and messages_processed >= kafka_config.max_records: + if ( + kafka_config.max_records + and messages_processed >= kafka_config.max_records + ): logger.info( - f"Reached max records limit ({kafka_config.max_records}). " - "Flushing final batch and exiting." + f"Reached max records limit ({kafka_config.max_records}). Exiting." ) - rows_written += flush_batch() break msg = consumer.poll(1.0) if msg is None: - # No message available - flush batch if it exists - if batch: - rows_written += flush_batch() - logger.debug("Flushed batch during idle period") + # No message available continue if msg.error(): @@ -92,7 +97,7 @@ def flush_batch() -> int: # Parse event with data model event = TransferEvent(**event_dict) - # Add to batch + # Create transfer data transfer_data = TransferData( id=event.id, block_number=event.block_number, @@ -102,11 +107,16 @@ def flush_batch() -> int: to_address=event.to_address, value=event.value, ) - batch.append(transfer_data) - # Flush batch when it reaches the configured size - if len(batch) >= kafka_config.batch_size: - rows_written += flush_batch() + # Trigger alert for this transfer + transfer_alert(transfer_data) + + # Insert immediately (no batching) + inserted = batch_insert_transfers(dsn, [transfer_data]) + rows_written += inserted + + # Log progress every 100 messages + if messages_processed % 100 == 0: logger.info( f"Progress: {messages_processed} messages processed, " f"{rows_written} rows written" @@ -118,8 +128,7 @@ def flush_batch() -> int: logger.error(f"Error processing message: {e}", exc_info=True) except KeyboardInterrupt: - logger.info("Shutdown requested. Flushing final batch...") - rows_written += flush_batch() + logger.info("Shutdown requested...") logger.info( f"Final stats: {messages_processed} messages, {rows_written} rows written" ) diff --git a/scripts/raw_data/README.md b/scripts/raw_data/README.md new file mode 100644 index 0000000..b1838f8 --- /dev/null +++ b/scripts/raw_data/README.md @@ -0,0 +1,222 @@ +# Raw Data Collection Scripts + +This directory contains scripts for collecting blockchain logs data using HyperSync. + +## Scripts Overview + +### 1. `collect_logs_data.py` (One-time/Historical) +Use this for **initial data collection** or **one-time historical queries**. 
+ +**Example:** +```bash +# Collect logs from block 22269355 to 24107494 +uv run python scripts/raw_data/collect_logs_data.py \ + --contract 0x1234567890123456789012345678901234567890 \ + --from-block 22269355 \ + --to-block 24107494 \ + --contract-name open_logs +``` + +**Output:** `.data/open_logs_22269355_24107494.parquet` + +--- + +### 2. `collect_logs_incremental.py` (Daily/Scheduled) ⭐ +Use this for **daily/scheduled incremental collection** (e.g., via Airflow or cron). + +**Key Features:** +- ✅ Auto-detects the latest fetched block from existing parquet files +- ✅ Fetches only new data from `latest_block + 1` to current block +- ✅ No duplicate data collection +- ✅ Designed for automation + +**Example: Daily Collection (Auto-detect)** +```bash +# First run (requires --from-block) +uv run python scripts/raw_data/collect_logs_incremental.py \ + --contract 0x1234567890123456789012345678901234567890 \ + --prefix open_logs \ + --from-block 22269355 + +# Output: .data/open_logs_22269355_24107494.parquet + +# Second run (auto-detects from_block = 24107495) +uv run python scripts/raw_data/collect_logs_incremental.py \ + --contract 0x1234567890123456789012345678901234567890 \ + --prefix open_logs + +# Output: .data/open_logs_24107495_24500000.parquet +``` + +**Manual Override:** +```bash +# Specify exact block range (bypass auto-detection) +uv run python scripts/raw_data/collect_logs_incremental.py \ + --contract 0x1234567890123456789012345678901234567890 \ + --prefix open_logs \ + --from-block 24000000 \ + --to-block 24500000 +``` + +--- + +### 3. `get_events.py` (ABI Event Extraction) +Extract event signatures and topic0 hashes from a contract's ABI. + +**Example:** +```bash +uv run python scripts/raw_data/get_events.py \ + --chainid 1 \ + --contract 0x1234567890123456789012345678901234567890 \ + --name open_logs +``` + +**Output:** +- `.data/open_logs_events.json` (full event data with inputs) +- `.data/open_logs_events.csv` (topic0, event_name, signature) + +--- + +## Airflow DAG Integration + +The logic from `collect_logs_incremental.py` has been implemented as an Airflow DAG in [airflow/dags/collect_load_logs.py](../../airflow/dags/collect_load_logs.py). + +### DAG: `collect_and_load_logs_dag` + +**Purpose:** End-to-end ETL pipeline - Incrementally collect blockchain logs from HyperSync and load to Snowflake + +**Schedule:** Every 6 hours + +**Configuration:** +- Contract: `0x323c03c48660fE31186fa82c289b0766d331Ce21` +- Prefix: `open_logs` +- Data directory: `/opt/airflow/.data` +- Target table: `raw.open_logs` (Snowflake) +- Auto-detects `from_block` from existing files + +**Tasks:** +1. **collect_logs_incremental** - Collects new logs from HyperSync +2. 
**load_to_snowflake** - Loads the collected parquet file to Snowflake + +**Workflow:** +``` +Collect Logs → Load to Snowflake +``` + +**Output Filename Format:** +`{prefix}_{today}_{actual_from_block}_{actual_to_block}.parquet` + +Example: `open_logs_2026-01-01_24107495_24150000.parquet` + +**Key Features:** +- Auto-detects the latest fetched block from existing parquet files +- Collects new logs from that block to the current chain block +- Saves results with today's date and actual block range +- Automatically loads collected data to Snowflake +- Skips Snowflake load if no new data is collected +- Single DAG for complete ETL pipeline + +**Load Command Executed:** +```bash +python scripts/load_file.py \ + -f /opt/airflow/.data/open_logs_2026-01-01_24107495_24150000.parquet \ + -c snowflake \ + -s raw \ + -t open_logs \ + -w append +``` + +### File Naming Convention + +**Standalone Script Format:** +`{prefix}_{fromBlock}_{toBlock}.parquet` + +Example: `open_logs_22269355_24107494.parquet` + +**Airflow DAG Format:** +`{prefix}_{date}_{fromBlock}_{toBlock}.parquet` + +Example: `open_logs_2026-01-01_24107495_24150000.parquet` + +The date component helps with: +- Organizing files by collection date +- Easier troubleshooting and auditing +- Identifying daily incremental runs + +--- + +## Function: `get_latest_fetched_block()` + +Both scripts include this helper function to find the latest block from existing parquet files. + +**Usage:** +```python +from collect_logs_incremental import get_latest_fetched_block + +# Get latest block for 'open_logs' prefix +latest_block = get_latest_fetched_block('open_logs') +# Returns: 24107494 (from open_logs_22269355_24107494.parquet) + +# Custom data directory +latest_block = get_latest_fetched_block('open_logs', data_dir='.data/raw') +``` + +**How it works:** +1. Searches for files matching `{prefix}_*.parquet` +2. Extracts block numbers using regex: `{prefix}_{from_block}_{to_block}.parquet` +3. Returns the maximum `to_block` value +4. Returns `None` if no matching files exist + +--- + +## Environment Variables + +Required in `.env`: +```bash +HYPERSYNC_BEARER_TOKEN=your_token_here +``` + +Optional (for `get_events.py`): +```bash +ETHERSCAN_API_KEY=your_etherscan_api_key +``` + +--- + +## Workflow Recommendation + +**Initial Setup:** +```bash +# 1. Collect historical data (one-time) +uv run python scripts/raw_data/collect_logs_data.py \ + --contract 0xYourContract \ + --from-block 22000000 \ + --to-block 24000000 \ + --contract-name my_logs + +# 2. 
Setup scheduled incremental collection +# Add to Airflow DAG or cron job +``` + +**Daily Operations:** +```bash +# Runs automatically via Airflow/cron +# Collects only new data since last run +uv run python scripts/raw_data/collect_logs_incremental.py \ + --contract 0xYourContract \ + --prefix my_logs +``` + +--- + +## Troubleshooting + +**Error: "No existing data found for prefix"** +- Solution: Run with `--from-block` for the first collection + +**Error: "HYPERSYNC_BEARER_TOKEN environment variable is required"** +- Solution: Add `HYPERSYNC_BEARER_TOKEN=...` to `.env` file + +**Duplicate data concerns** +- The incremental script always starts from `latest_block + 1` +- No overlapping block ranges between runs diff --git a/scripts/raw_data/collect_logs_data.py b/scripts/raw_data/collect_logs_data.py index 6621bee..d11037a 100644 --- a/scripts/raw_data/collect_logs_data.py +++ b/scripts/raw_data/collect_logs_data.py @@ -4,6 +4,8 @@ import os, logging import argparse import polars as pl +from pathlib import Path +import re load_dotenv() @@ -11,6 +13,60 @@ logger = logging.getLogger(__name__) +def get_latest_fetched_block(prefix: str, data_dir: str = ".data") -> int | None: + """ + Get the latest (maximum) block number from parquet files matching a prefix. + + Args: + prefix: Prefix to search for (e.g., 'open_logs') + data_dir: Directory containing parquet files (default: '.data') + + Returns: + Maximum block number found, or None if no matching files exist + + Example: + >>> get_latest_fetched_block('open_logs') + 24107494 # from open_logs_22269355_24107494.parquet + """ + data_path = Path(data_dir) + + if not data_path.exists(): + logger.warning(f"Data directory does not exist: {data_dir}") + return None + + # Find all parquet files matching the prefix + pattern = f"{prefix}_*.parquet" + matching_files = list(data_path.glob(pattern)) + + if not matching_files: + logger.info(f"No parquet files found with prefix: {prefix}") + return None + + # Extract block numbers from filenames + # Expected format: prefix_fromBlock_toBlock.parquet + max_block = None + block_pattern = re.compile(rf"{re.escape(prefix)}_(\d+)_(\d+)\.parquet") + + for file in matching_files: + match = block_pattern.match(file.name) + if match: + from_block = int(match.group(1)) + to_block = int(match.group(2)) + + # Track the maximum to_block + if max_block is None or to_block > max_block: + max_block = to_block + + logger.debug(f"Found file: {file.name} (blocks {from_block}-{to_block})") + + if max_block: + logger.info(f"Latest fetched block for '{prefix}': {max_block}") + else: + logger.warning(f"No valid block numbers found in filenames for prefix: {prefix}") + + return max_block + + def get_client(): return hypersync.HypersyncClient( hypersync.ClientConfig( diff --git a/scripts/raw_data/collect_logs_incremental.py b/scripts/raw_data/collect_logs_incremental.py new file mode 100644 index 0000000..66eac04 --- /dev/null +++ b/scripts/raw_data/collect_logs_incremental.py @@ -0,0 +1,267 @@ +import hypersync +import asyncio +from dotenv import load_dotenv +import os +import logging +import argparse +import polars as pl +from pathlib import Path +import re +from datetime import datetime + +load_dotenv() + +logging.basicConfig( + level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s" +) +logger = logging.getLogger(__name__) + + +def get_latest_fetched_block(prefix: str, data_dir: str = ".data") -> int | None: + """ + Get the latest (maximum) block number from parquet files matching a prefix. 
+ + Args: + prefix: Prefix to search for (e.g., 'open_logs') + data_dir: Directory containing parquet files (default: '.data') + + Returns: + Maximum block number found, or None if no matching files exist + + Example: + >>> get_latest_fetched_block('open_logs') + 24107494 # from open_logs_22269355_24107494.parquet + """ + data_path = Path(data_dir) + + if not data_path.exists(): + logger.warning(f"Data directory does not exist: {data_dir}") + return None + + # Find all parquet files matching the prefix + pattern = f"{prefix}_*.parquet" + matching_files = list(data_path.glob(pattern)) + + if not matching_files: + logger.info(f"No parquet files found with prefix: {prefix}") + return None + + # Extract block numbers from filenames + # Expected format: prefix_fromBlock_toBlock.parquet + max_block = None + block_pattern = re.compile(rf"{re.escape(prefix)}_(\d+)_(\d+)\.parquet") + + for file in matching_files: + match = block_pattern.match(file.name) + if match: + from_block = int(match.group(1)) + to_block = int(match.group(2)) + + # Track the maximum to_block + if max_block is None or to_block > max_block: + max_block = to_block + + logger.debug(f"Found file: {file.name} (blocks {from_block}-{to_block})") + + if max_block: + logger.info(f"Latest fetched block for '{prefix}': {max_block}") + else: + logger.warning( + f"No valid block numbers found in filenames for prefix: {prefix}" + ) + + return max_block + + +def get_client(): + """Initialize HyperSync client with bearer token authentication.""" + bearer_token = os.getenv("HYPERSYNC_BEARER_TOKEN") + if not bearer_token: + raise ValueError("HYPERSYNC_BEARER_TOKEN environment variable is required") + + return hypersync.HypersyncClient( + hypersync.ClientConfig( + url="https://eth.hypersync.xyz", + bearer_token=bearer_token, + ) + ) + + +async def collect_logs(contract: str, from_block: int, to_block: int | None = None): + """ + Collect historical logs data for a contract and save to Parquet. + + Args: + contract: Contract address to query + from_block: Starting block number + to_block: Ending block number (None for latest) + """ + client = get_client() + + # Build query + query = hypersync.preset_query_logs( + contract, from_block=from_block, to_block=to_block + ) + + logger.info(f"Collecting logs for contract: {contract}") + logger.info(f"From block: {from_block}") + logger.info(f"To block: {to_block if to_block else 'latest'}") + + await client.collect_parquet( + query=query, path=".data", config=hypersync.StreamConfig() + ) + + logger.info("Collection complete!") + + +async def collect_logs_incremental( + contract: str, + prefix: str, + data_dir: str = ".data", + to_block: int | None = None, + from_block: int | None = None, +) -> tuple[int, int, str]: + """ + Collect logs incrementally from the latest fetched block to the current block. + Designed for daily/scheduled execution. 
+ +    Args: +        contract: Contract address to query +        prefix: Filename prefix (e.g., 'open_logs') +        data_dir: Directory for data files (default: '.data') +        to_block: Ending block (None for latest chain block) +        from_block: Starting block (None to auto-detect from existing files) + +    Returns: +        Tuple of (from_block, to_block, output_file_path) + +    Raises: +        ValueError: If no previous data exists and from_block is not provided +    """ +    client = get_client() + +    # Auto-detect starting block if not provided +    if from_block is None: +        latest_block = get_latest_fetched_block(prefix, data_dir) + +        if latest_block is None: +            raise ValueError( +                f"No existing data found for prefix '{prefix}'. " +                "Please provide --from-block for initial data collection." +            ) + +        # Start from the next block after the latest fetched +        from_block = latest_block + 1 +        logger.info(f"Auto-detected starting block: {from_block}") + +    logger.info(f"=== Incremental Log Collection ===") +    logger.info(f"Contract: {contract}") +    logger.info(f"Prefix: {prefix}") +    logger.info(f"From block: {from_block}") +    logger.info(f"To block: {to_block if to_block else 'latest'}") + +    # Collect data to temporary file +    temp_file = Path(data_dir) / "logs.parquet" +    await collect_logs(contract=contract, from_block=from_block, to_block=to_block) + +    # Check if file was created (HyperSync doesn't create file if no data) +    if not temp_file.exists(): +        logger.warning( +            f"No new data found in block range {from_block} - {to_block if to_block else 'latest'}" +        ) +        logger.info("✓ Incremental collection complete (no new data)") +        return from_block, from_block, None + +    # Read actual block range from the collected data +    try: +        df = pl.scan_parquet(temp_file) +        actual_from_block = df.select(pl.col("block_number").min()).collect().item() +        actual_to_block = df.select(pl.col("block_number").max()).collect().item() +        record_count = df.select(pl.len()).collect().item() + +        logger.info(f"Collected {record_count} records") +        logger.info(f"Actual block range: {actual_from_block} - {actual_to_block}") + +        # Rename with actual block range +        output_file = ( +            Path(data_dir) / f"{prefix}_{actual_from_block}_{actual_to_block}.parquet" +        ) +        temp_file.rename(output_file) + +        logger.info(f"✓ Saved to: {output_file}") + +        return actual_from_block, actual_to_block, str(output_file) + +    except Exception as e: +        logger.error(f"Failed to process collected data: {e}") +        raise + + +def main(): +    parser = argparse.ArgumentParser( +        description="Incrementally collect blockchain logs for daily/scheduled execution" +    ) +    parser.add_argument( +        "--contract", +        "-c", +        required=True, +        help="Contract address to query (e.g., 0x...)", +    ) +    parser.add_argument( +        "--prefix", +        "-p", +        required=True, +        help="Filename prefix for parquet files (e.g., 'open_logs')", +    ) +    parser.add_argument( +        "--data-dir", +        "-d", +        default=".data", +        help="Directory for data files (default: '.data')", +    ) +    parser.add_argument( +        "--from-block", +        "-f", +        type=int, +        default=None, +        help="Starting block (default: auto-detect from existing files)", +    ) +    parser.add_argument( +        "--to-block", +        "-t", +        type=int, +        default=None, +        help="Ending block (default: latest chain block)", +    ) + +    args = parser.parse_args() + +    try: +        from_block, to_block, output_file = asyncio.run( +            collect_logs_incremental( +                contract=args.contract, +                prefix=args.prefix, +                data_dir=args.data_dir, +                from_block=args.from_block, +                to_block=args.to_block, +            ) +        ) + +        logger.info("=== Collection Summary ===") +        logger.info(f"Block range: 
{from_block} → {to_block}") +        if output_file: +            logger.info(f"Output file: {output_file}") +            logger.info("✓ Incremental collection complete!") +        else: +            logger.info("No new data collected") + +    except ValueError as e: +        logger.error(f"Configuration error: {e}") +        exit(1) +    except Exception as e: +        logger.error(f"Collection failed: {e}", exc_info=True) +        exit(1) + + +if __name__ == "__main__": +    main() diff --git a/scripts/setup_airflow_snowflake_connection.sh b/scripts/setup_airflow_snowflake_connection.sh new file mode 100755 index 0000000..4975d4d --- /dev/null +++ b/scripts/setup_airflow_snowflake_connection.sh @@ -0,0 +1,76 @@ +#!/bin/bash + +# Script to set up Airflow Snowflake connection using .env variables +# Run this script from the project root directory + +set -e + +# Check if .env file exists +if [ ! -f ".env" ]; then +    echo "Error: .env file not found in current directory" +    echo "Please create a .env file with your Snowflake credentials" +    exit 1 +fi + +# Source the .env file +source .env + +# Check if Airflow is running +if ! docker-compose -f docker-compose.airflow.yml ps | grep -q "airflow-scheduler"; then +    echo "Error: Airflow is not running. Start it first: docker-compose -f docker-compose.airflow.yml up -d" +    exit 1 +fi + +# Check if required Snowflake variables are set +if [ -z "$SNOWFLAKE_ACCOUNT" ] || [ -z "$SNOWFLAKE_USER" ]; then +    echo "Error: Required Snowflake environment variables are not set in .env" +    echo "Please set: SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PRIVATE_KEY_FILE" +    exit 1 +fi + +# Check if private key file exists +if [ ! -f "$SNOWFLAKE_PRIVATE_KEY_FILE" ]; then +    echo "Error: Private key file not found at $SNOWFLAKE_PRIVATE_KEY_FILE" +    echo "Please check SNOWFLAKE_PRIVATE_KEY_FILE in .env" +    exit 1 +fi + +echo "Setting up Snowflake connection for account: $SNOWFLAKE_ACCOUNT" + +# Read and base64 encode the private key +PRIVATE_KEY_CONTENT=$(cat "$SNOWFLAKE_PRIVATE_KEY_FILE" | base64) + +# Build the extra JSON with all Snowflake parameters +EXTRA_JSON=$(cat <<EOF +{ +  "account": "$SNOWFLAKE_ACCOUNT", +  "warehouse": "$SNOWFLAKE_WAREHOUSE", +  "database": "$SNOWFLAKE_DATABASE", +  "role": "$SNOWFLAKE_ROLE", +  "private_key_content": "$PRIVATE_KEY_CONTENT" +} +EOF +) + +# Remove existing connection if it exists +docker-compose -f docker-compose.airflow.yml exec -T airflow-scheduler airflow connections delete 'snowflake_default' 2>/dev/null || true + +# Add new connection +docker-compose -f docker-compose.airflow.yml exec -T airflow-scheduler airflow connections add 'snowflake_default' \ +    --conn-type 'snowflake' \ +    --conn-login "$SNOWFLAKE_USER" \ +    --conn-schema "$SNOWFLAKE_SCHEMA" \ +    --conn-extra "$EXTRA_JSON" + +echo "✅ Snowflake connection setup complete" +echo "" +echo "Connection details:" +echo "  Account: $SNOWFLAKE_ACCOUNT" +echo "  User: $SNOWFLAKE_USER" +echo "  Role: $SNOWFLAKE_ROLE" +echo "  Warehouse: $SNOWFLAKE_WAREHOUSE" +echo "  Database: $SNOWFLAKE_DATABASE" +echo "  Schema: $SNOWFLAKE_SCHEMA" +echo "" +echo "You can verify the connection in Airflow UI:" +echo "  http://localhost:8080/connection/list/"