This repo contains example DAGs that were used in an Astronomer webinar on DAG writing best practices.
The first example highlights DAG design principles. The hypothetical use case: for a list of states, execute queries that select data for today's and yesterday's dates. Once all queries have completed successfully, send an email notification.
- `bad-example-1.py` implements this use case using bad DAG writing practices
- `good-example-1.py` implements this use case using DAG writing best practices
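The core idea in the good version of this example is to generate one task per state in a loop rather than copy-pasting near-identical tasks. A minimal sketch of that pattern, with the Airflow operators omitted and hypothetical table/column names (`{{ ds }}` and `{{ yesterday_ds }}` are real Airflow template macros rendered at runtime):

```python
# Sketch of the dynamic task-generation idea behind good-example-1.py.
# The state list and the users table are hypothetical; in the real DAG
# each query would be passed to an operator inside a loop over STATES.

STATES = ["CA", "NY", "TX", "WA"]  # hypothetical list of states

def build_query(state: str) -> str:
    # "{{ ds }}" / "{{ yesterday_ds }}" are Airflow macros for today's
    # and yesterday's dates; they are left untouched here so Airflow
    # can render them when the task runs.
    return (
        "SELECT * FROM users "
        f"WHERE state = '{state}' "
        "AND created_date IN ('{{ ds }}', '{{ yesterday_ds }}');"
    )

# One templated query per state, built in a loop instead of copy-pasted.
queries = {state: build_query(state) for state in STATES}
```

Generating tasks this way means adding a state is a one-line change to the list, not a new hand-written task.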
The second example highlights the concept of using Airflow as an orchestrator, not an execution framework. The hypothetical use case is a basic ETL: refresh a materialized view in a Postgres database, perform multiple transformations on that data, then load it into another table. The data is relatively large (measured in gigabytes).
- `bad-example-2.py` implements this use case using bad DAG writing practices
- `good-example-2.py` implements this use case using DAG writing best practices
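The principle behind the good version of this example is that the heavy lifting happens inside Postgres: each step is a SQL statement the database runs in place, so the gigabytes of data never pass through the Airflow worker. A sketch of that shape, with hypothetical view and table names:

```python
# Sketch of the "orchestrate, don't execute" idea behind good-example-2.py.
# All names (sales_mv, sales_staging, sales_summary) are hypothetical.

refresh = "REFRESH MATERIALIZED VIEW sales_mv;"
transform = (
    "CREATE TABLE sales_staging AS "
    "SELECT region, SUM(amount) AS total "
    "FROM sales_mv GROUP BY region;"
)
load = "INSERT INTO sales_summary SELECT * FROM sales_staging;"

# In the real DAG each statement would be its own task (e.g. a Postgres
# operator), chained so Airflow only coordinates ordering and retries:
# refresh >> transform >> load
pipeline = [refresh, transform, load]
```

The bad version would instead pull the data into the worker with Python and transform it there, which is exactly the execution work Airflow is not meant to do.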
The easiest way to run these example DAGs is to use the Astronomer CLI to get an Airflow instance up and running locally:
- Install the Astronomer CLI
- Clone this repo somewhere locally and navigate to it in your terminal
- Initialize an Astronomer project by running `astro dev init`
- Start Airflow locally by running `astro dev start`
- Navigate to `localhost:8080` in your browser and you should see the example DAGs there
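The steps above can be collected into a single shell session (the clone URL and directory name are placeholders for this repo; the `astro` commands require the Astronomer CLI to be installed):

```shell
# Clone the repo and start a local Airflow instance with the Astronomer CLI.
git clone <repo-url>        # placeholder: URL of this repo
cd <repo-dir>               # placeholder: cloned directory
astro dev init              # initialize an Astronomer project
astro dev start             # start Airflow locally in Docker
# Then open http://localhost:8080 in a browser to see the example DAGs.
```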