ELT-Bench

ELT-Bench is the first comprehensive, end-to-end benchmark designed to evaluate AI agents on automating ELT pipelines.

Environment Setup

Install Docker and Conda

Ensure that both Docker and Conda are installed on your machine.
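As a quick sanity check before continuing, the following minimal sketch reports which of the required tools cannot be found on your PATH. It assumes the executables are named docker and conda; the helper name missing_tools is hypothetical and not part of ELT-Bench. A tool that is installed but not on PATH (for example, Conda before shell initialization) would also be reported as missing.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # "docker" and "conda" are the assumed executable names.
    missing = missing_tools(["docker", "conda"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Docker and Conda were both found on PATH.")
```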

Install Airbyte

You can deploy Airbyte Open Source by following the official documentation.
Note: You may need to add sudo before abctl commands.

Set up Airbyte

Navigate to http://localhost:8000/ in your web browser. Set your username. To retrieve your password, execute:
```
# Prefix the command with sudo if your setup requires it.
abctl local credentials
```
In the Airbyte UI, go to Builder > Import a YAML and upload the YAML file located at ./setup/elt_bench.yaml. Click the Publish button, type "ignore warnings" to confirm, and publish it to your workspace.
Next, go to Sources > Custom > ELT Bench and retrieve the Workspace ID and Definition ID from the URL:
```
http://localhost:8000/workspaces/<workspace_id>/source/new-source/<api_definition_id>
```
Update the file ./setup/airbyte/airbyte_credentials.json by filling in the following information: username, password, workspace ID, and API definition ID.
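If you prefer not to copy the two IDs by hand, a minimal sketch like the one below can pull them out of the URL shown in your browser. It assumes only the path layout given above (workspaces/<workspace_id>/source/new-source/<api_definition_id>); the function name extract_ids and the sample IDs are hypothetical. It does not write airbyte_credentials.json, since the key names in that file should be taken from the file itself.

```python
from urllib.parse import urlparse


def extract_ids(url):
    """Return (workspace_id, api_definition_id) parsed from an Airbyte source URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path segments:
    # ["workspaces", <workspace_id>, "source", "new-source", <api_definition_id>]
    if (
        len(parts) != 5
        or parts[0] != "workspaces"
        or parts[2:4] != ["source", "new-source"]
    ):
        raise ValueError(f"Unexpected URL layout: {url}")
    return parts[1], parts[4]


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    example = "http://localhost:8000/workspaces/ws-123/source/new-source/def-456"
    workspace_id, definition_id = extract_ids(example)
    print("Workspace ID:", workspace_id)
    print("API definition ID:", definition_id)
```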

Install psql

To insert data into PostgreSQL without installing the full PostgreSQL database server, use the psql command-line tool. Refer to the installation instructions to install psql on your machine, then confirm the installation by running:
```
psql --version
```

Set up data destination - Snowflake

Refer to the example in ./setup/destination/setup.sql. Copy all the contents into a Snowflake worksheet and execute "Run all" to create the necessary credentials.
Fill in the required values in ./setup/destination/snowflake_credential to ensure Airbyte can successfully connect to Snowflake.

Run ELT setup

Run the setup script, which creates Docker containers for the various sources, downloads both the source data and the ground-truth results used for evaluation, and inserts the data.
```
cd ./setup
bash elt_setup.sh
```

Running agents

To evaluate Spider-Agent and SWE-agent on ELT-Bench, follow the instructions in the agents folder, which contains detailed steps for running each agent.

Evaluation

To evaluate the performance of an agent, use the following commands:
```
cd evaluation
python eva.py --folder folder_name
```
Replace folder_name with the name you want to give the evaluation results. The newly created results folder will be located under ./evaluation/agent_results.
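When evaluating several runs, the command above can be wrapped in a small script. The sketch below is one possible way to do that, assuming it is launched from the repository root and that eva.py accepts only the --folder argument shown above. The helper names and the example folder names are hypothetical.

```python
import subprocess


def build_eval_command(folder_name):
    """Build the eva.py invocation for one results folder name."""
    if not folder_name or "/" in folder_name:
        raise ValueError("folder_name must be a plain, non-empty folder name")
    return ["python", "eva.py", "--folder", folder_name]


def run_evaluations(folder_names, evaluation_dir="evaluation"):
    """Run eva.py once per folder name from inside the evaluation directory."""
    for name in folder_names:
        # check=True stops at the first failing evaluation.
        subprocess.run(build_eval_command(name), cwd=evaluation_dir, check=True)


# Example usage with hypothetical folder names:
# run_evaluations(["spider_agent_run", "swe_agent_run"])
```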

uiuc-kang-lab/ELT-Bench
