This repository contains the code base used in the paper "What Drives Learned Optimizer Performance? A Systematic Evaluation", submitted for review to the EDBT/ICDT 2026 Joint Conference.
The repository is organized as follows:
```
.
├── benchmark_scripts/       # Scripts to populate the database after startup (Setup)
├── installation_scripts/    # Scripts to configure the database (Setup)
├── workloads/               # Base workloads used for training and evaluation (Base Workloads)
├── experiments/             # Experiment inputs, results, and analysis notebooks (Experiments)
│   ├── experimentID/
│   │   └── benchmark/
│   │       └── runID/
│   │           └── queryID/ # Individual query results, e.g., experiment1/job/run1/29c
│   └── *.ipynb              # Analysis notebooks with figures (Results - Figures)
├── preprocessing/           # Utility scripts: workload processing, SQL parsing, etc. (Utility Scripts)
├── optimizers/              # Integrated learned query optimizers (Learned Query Optimizers)
│   ├── balsa/
│   ├── BaoForPostgreSQL/
│   ├── BASE/
│   ├── FASTgres-PVLDBv16/
│   ├── Lero-on-PostgreSQL/
│   ├── LOGER/
│   └── Neo/
├── docker_instructions.md   # Guide for building and managing Docker environments (Setup)
└── citations.md             # Bibliographic references for workloads and related work (Citations)
```
This repository serves as a testbed for evaluating learned query optimizers (LQOs). Our experimental environment consisted of two main components:
- CPU-only server: hosted multiple Docker environments (each with 40 GB of shared memory) running PostgreSQL v12.5 instances, which served as the execution backends for the learned query optimizers.
- GPU server: used to train and run each LQO implementation.
For most users, it is not necessary to replicate this multi-server configuration. A single machine with Docker installed and GPU access is sufficient for experimentation.
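Once a container is running, a quick connectivity check such as the following can confirm that the PostgreSQL backend is reachable. This is a minimal sketch: the host, port, credentials, and database name are placeholders and should be adjusted to your own Docker setup.

```python
# Sanity check: confirm the PostgreSQL backend is reachable.
# Connection parameters below are placeholders for your own setup.
import psycopg2

conn = psycopg2.connect(
    host="localhost", port=5432,
    dbname="imdb", user="postgres", password="postgres",
)
with conn.cursor() as cur:
    cur.execute("SHOW server_version;")
    version = cur.fetchone()[0]
    print(f"Connected to PostgreSQL {version}")  # paper setup used v12.5
conn.close()
```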
Refer to the setup guide for detailed instructions on building, configuring, and managing Docker containers. Each optimizer may require specific dependencies or database configurations; please consult the corresponding README files listed below.
This repository integrates multiple state-of-the-art learned query optimizers. Setup instructions, additional notes, and guidance on loading checkpoints for each experiment are provided below:
- NEO: See optimizers/balsa/README.md (Note: NEO is executed through the Balsa codebase, as the official NEO implementation was not publicly released.)
- LOGER: See optimizers/LOGER/README.md
- FASTgres: See optimizers/FASTgres-PVLDBv16/README.md
- LERO: See optimizers/Lero-on-PostgreSQL/README.md (Note: LERO requires non-default PostgreSQL configurations; refer to its README for setup details.)
Our evaluation framework examines several key aspects of learned query optimizer performance. Each experiment directory includes documentation describing its objectives, methodology, and any special database configurations required.
- (E1) End-to-End Performance & Value Model Fidelity
- (E2) Sensitivity & Execution Stability
- (E3) Learning Trajectory & Convergence
- (E4) Internal Decision-Making & Plan Representation
- (E5) Generalization to Novel Conditions
We provide pretrained model checkpoints for each experiment and optimizer in the following Hugging Face repository: 👉 LQO Evaluation Suite Models
Please refer to each optimizer’s documentation for detailed instructions on which model checkpoint to load for each experiment.
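As an illustration, checkpoints can be fetched programmatically with the huggingface_hub library. The repo_id below is a placeholder for the repository linked above, and the filter pattern is only an example of restricting the download to one experiment's checkpoints.

```python
# Sketch: download pretrained checkpoints from Hugging Face.
# repo_id is a placeholder; substitute the repository linked above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="<org>/<lqo-evaluation-suite-models>",  # placeholder
    allow_patterns=["experiment1/*"],  # example: fetch only E1 checkpoints
)
print(f"Checkpoints downloaded to {local_dir}")
```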
The results of each experiment, along with the figures reported in the publication, are available in the Jupyter notebooks under the experiments/ directory (experiments/*.ipynb).
The learned query optimizers (LQOs) were trained and evaluated using the following workloads.
- Join Order Benchmark (JOB)
- TPC-H
- TPC-DS
- JOB-Synthetic
- JOB-Dynamic
- JOB-Extended
- JOB-Light
- Star Schema Benchmark (SSB)
- SSB – After Schema Transformation
- STATS
- STATS – Sampled
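For reference, the snippet below sketches one way to read a workload into memory, assuming the common layout of one .sql file per query (e.g., a file named after the query ID, such as 29c in the directory tree above); consult each workload directory for its actual structure.

```python
# Sketch: load a workload into memory, assuming one .sql file per query.
from pathlib import Path

def load_workload(workload_dir: str) -> dict[str, str]:
    """Map query IDs (file stems, e.g., '29c') to their SQL text."""
    return {
        p.stem: p.read_text()
        for p in sorted(Path(workload_dir).glob("*.sql"))
    }

queries = load_workload("workloads/job")  # path is an example
print(f"Loaded {len(queries)} queries")
```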
The preprocessing directory contains a standalone collection of utility tools that support the experiments. These include:
- An SQL parser
- Scripts to transform workloads into directories compatible with our experiment structure
- Programs to calculate workload distributions (as used in Experiment E5)
- PostgreSQL warm-up calls (one possible approach is sketched below)
- Additional miscellaneous utilities
For more details, refer to the preprocessing README.
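As an illustration of the warm-up step, one common approach is PostgreSQL's pg_prewarm extension, sketched below; the actual scripts in preprocessing/ may implement warm-up differently, and the table names and connection parameters here are placeholders.

```python
# Illustration only: warm the buffer cache with pg_prewarm.
# The repository's own warm-up scripts may use a different strategy.
import psycopg2

WARMUP_TABLES = ["title", "movie_info", "cast_info"]  # example JOB tables

conn = psycopg2.connect(host="localhost", dbname="imdb",
                        user="postgres", password="postgres")
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS pg_prewarm;")
    for table in WARMUP_TABLES:
        cur.execute("SELECT pg_prewarm(%s);", (table,))
conn.commit()
conn.close()
```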
We thank the authors of the prior works for making their research publicly available. For full citation details, please refer to the citations file.