This repository contains the code base used in the paper "What Drives Learned Optimizer Performance? A Systematic Evaluation", submitted for review to the EDBT/ICDT 2026 Joint Conference.
The repository is organized as follows:
```
.
├── benchmark_scripts/       # Scripts to populate the database after startup (Setup)
├── installation_scripts/    # Scripts to configure the database (Setup)
├── workloads/               # Base workloads used for training and evaluation (Base Workloads)
├── experiments/             # Experiment inputs, results, and analysis notebooks (Experiments)
│   ├── experimentID/
│   │   └── benchmark/
│   │       └── runID/
│   │           └── queryID/ # Individual query results, e.g., experiment1/job/run1/29c
│   └── *.ipynb              # Analysis notebooks with figures (Results - Figures)
├── preprocessing/           # Utility scripts: workload processing, SQL parsing, etc. (Utility Scripts)
├── optimizers/              # Integrated learned query optimizers (Learned Query Optimizers)
│   ├── balsa/
│   ├── BaoForPostgreSQL/
│   ├── BASE/
│   ├── FASTgres-PVLDBv16/
│   ├── Lero-on-PostgreSQL/
│   ├── LOGER/
│   └── Neo/
├── docker_instructions.md   # Guide for building and managing Docker environments (Setup)
└── citations.md             # Bibliographic references for workloads and related work (Citations)
```
This repository serves as a testbed for evaluating learned query optimizers (LQOs). Our experimental environment consisted of two main components:
- CPU-only server: hosted multiple Docker environments (each with 40 GB of shared memory) running PostgreSQL v12.5 instances, which served as the execution backends for the learned query optimizers.
- GPU server: used to train and run each LQO implementation.
For most users, it is not necessary to replicate this multi-server configuration. A single machine with Docker installed and GPU access is sufficient for experimentation.
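Once a container is running, a quick connectivity check such as the following can confirm that the PostgreSQL backend is reachable. This is a minimal sketch: the host, port, credentials, and database name are placeholders and should be adjusted to your own Docker setup.

```python
# Sanity check: confirm the PostgreSQL backend is reachable.
# Connection parameters below are placeholders for your own setup.
import psycopg2

conn = psycopg2.connect(
    host="localhost", port=5432,
    dbname="imdb", user="postgres", password="postgres",
)
with conn.cursor() as cur:
    cur.execute("SHOW server_version;")
    version = cur.fetchone()[0]
    print(f"Connected to PostgreSQL {version}")  # paper setup used v12.5
conn.close()
```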
Refer to the setup guide for detailed instructions on building, configuring, and managing Docker containers. Each optimizer may require specific dependencies or database configurations; please consult the corresponding README files listed below.
This repository integrates multiple state-of-the-art learned query optimizers. Setup instructions, additional notes, and guidance on loading checkpoints for each experiment are provided below:
- NEO: See optimizers/balsa/README.md (Note: NEO is executed through the Balsa codebase, as the official NEO implementation was not publicly released.)
- LOGER: See optimizers/LOGER/README.md
- FASTgres: See optimizers/FASTgres-PVLDBv16/README.md
- LERO: See optimizers/Lero-on-PostgreSQL/README.md (Note: LERO requires non-default PostgreSQL configurations; refer to its README for setup details.)
Our evaluation framework examines several key aspects of learned query optimizer performance. Each experiment directory includes documentation describing its objectives, methodology, and any special database configurations required.
- (E1) End-to-End Performance & Value Model Fidelity
- (E2) Sensitivity & Execution Stability
- (E3) Learning Trajectory & Convergence
- (E4) Internal Decision-Making & Plan Representation
- (E5) Generalization to Novel Conditions
We provide pretrained model checkpoints for each experiment and optimizer in the following Hugging Face repository: 👉 LQO Evaluation Suite Models
Please refer to each optimizer’s documentation for detailed instructions on which model checkpoint to load for each experiment.
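As an illustration, checkpoints can be fetched programmatically with the huggingface_hub library. The repo_id below is a placeholder for the repository linked above, and the filter pattern is only an example of restricting the download to one experiment's checkpoints.

```python
# Sketch: download pretrained checkpoints from Hugging Face.
# repo_id is a placeholder; substitute the repository linked above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="<org>/<lqo-evaluation-suite-models>",  # placeholder
    allow_patterns=["experiment1/*"],  # example: fetch only E1 checkpoints
)
print(f"Checkpoints downloaded to {local_dir}")
```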
The results of each experiment, along with the figures reported in the publication, are available in the Jupyter notebooks under the experiments/ directory (experiments/*.ipynb).
The learned query optimizers (LQOs) were trained and evaluated using the following workloads.
- Join Order Benchmark (JOB)
- TPC-H
- TPC-DS
- JOB-Synthetic
- JOB-Dynamic
- JOB-Extended
- JOB-Light
- Star Schema Benchmark (SSB)
- SSB – After Schema Transformation
- STATS
- STATS – Sampled
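For reference, the snippet below sketches one way to read a workload into memory, assuming the common layout of one .sql file per query (e.g., a file named after the query ID, such as 29c in the directory tree above); consult each workload directory for its actual structure.

```python
# Sketch: load a workload into memory, assuming one .sql file per query.
from pathlib import Path

def load_workload(workload_dir: str) -> dict[str, str]:
    """Map query IDs (file stems, e.g., '29c') to their SQL text."""
    return {
        p.stem: p.read_text()
        for p in sorted(Path(workload_dir).glob("*.sql"))
    }

queries = load_workload("workloads/job")  # path is an example
print(f"Loaded {len(queries)} queries")
```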
The preprocessing directory contains a standalone collection of utility tools that support the experiments. These include:
- An SQL parser
- Scripts to transform workloads into directories compatible with our experiment structure
- Programs to calculate workload distributions (as used in Experiment E5)
- PostgreSQL warm-up calls (one possible approach is sketched below)
- Additional miscellaneous utilities
For more details, refer to the preprocessing README.
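As an illustration of the warm-up step, one common approach is PostgreSQL's pg_prewarm extension, sketched below; the actual scripts in preprocessing/ may implement warm-up differently, and the table names and connection parameters here are placeholders.

```python
# Illustration only: warm the buffer cache with pg_prewarm.
# The repository's own warm-up scripts may use a different strategy.
import psycopg2

WARMUP_TABLES = ["title", "movie_info", "cast_info"]  # example JOB tables

conn = psycopg2.connect(host="localhost", dbname="imdb",
                        user="postgres", password="postgres")
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS pg_prewarm;")
    for table in WARMUP_TABLES:
        cur.execute("SELECT pg_prewarm(%s);", (table,))
conn.commit()
conn.close()
```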
We thank the authors of the prior works for making their research publicly available. For full citation details, please refer to the citations file.