Codebase for "Structure Detection for Contextual Reinforcement Learning (Under Review)"
SD-MBTL/
├── data/ # Data
├── figures/ # Figures
├── tables/ # Tables
├── algs.py # Algorithms
├── utils.py # Utility functions
├── main.py # Main function
├── environment.yml # Environment file
├── README.md # This file
├── plot_figure.ipynb # Jupyter notebook for plotting figures
├── s-run-main-synt.sh # Shell script for running main.py with 3d synthetic data
├── s-run-main-synt5d.sh # Shell script for running main.py with 5d synthetic data
├── s-run-main-synt7d.sh # Shell script for running main.py with 7d synthetic data
├── s-run-main.sh # Shell script for running main.py with real-world data
└── LICENSE # License
conda env create -f environment.yml
We consider two types of tasks: synthetic and real-world. The synthetic tasks are generated using the gen_synthetic_data.ipynb notebook, while the real-world tasks are generated using four environments under envs/.
The gen_synthetic_data.ipynb notebook generates the synthetic data and saves it in the data/ directory. Run the notebook to generate the data.
The real-world tasks are generated using the following environments:
You can download the synthetic and real-world datasets from the following links:
After downloading the data, unzip it and place the contents in the data/ directory.
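The unzip step above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `extract_to_data` and the archive file name in the usage comment are hypothetical, since the actual download names are not given here.

```python
import zipfile
from pathlib import Path


def extract_to_data(archive: str, data_dir: str = "data") -> list[str]:
    """Extract a downloaded zip archive into data_dir and return its member names."""
    target = Path(data_dir)
    # Create data/ if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return zf.namelist()


# Example usage (hypothetical archive name, replace with the file you downloaded):
# extract_to_data("downloaded_data.zip")
```

If the archive already contains a top-level data/ folder, you may want to pass the repository root as `data_dir` instead, so the files do not end up in data/data/.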
Run SD-MBTL and baselines in synthetic tasks (CartPole, BipedalWalker, IntersectionZoo, and CyclesGym)
3-dimensional tasks:
bash s-run-main-synt.sh
5-dimensional tasks:
bash s-run-main-synt5d.sh
7-dimensional tasks:
bash s-run-main-synt7d.sh
Run SD-MBTL and baselines in real-world tasks (CartPole, BipedalWalker, IntersectionZoo, and CyclesGym)
bash s-run-main.sh
You can find the results in the tables and figures directories. The results are generated by running the plot_figure.ipynb notebook.
[Figures: 3-dimensional, 5-dimensional, and 7-dimensional results]

The aggregated performance scales each MBTL-based algorithm’s performance between 0 and 1—reflecting how much it outperforms the Random baseline and how closely it approaches the Myopic Oracle—averaged across four benchmarks.
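One plausible reading of this aggregation is a min-max style rescaling in which the Random baseline maps to 0 and the Myopic Oracle maps to 1, followed by a mean over benchmarks. The sketch below illustrates that reading only; the exact formula is an assumption, the function names are hypothetical, and the numbers are made up for illustration rather than taken from the paper.

```python
from statistics import mean


def normalized_score(alg: float, random_baseline: float, oracle: float) -> float:
    """Rescale a score so Random maps to 0 and the Myopic Oracle maps to 1 (assumed formula)."""
    span = oracle - random_baseline
    if span == 0:
        raise ValueError("oracle and random baseline must differ")
    return (alg - random_baseline) / span


def aggregated_performance(results: dict[str, dict[str, float]], alg: str) -> float:
    """Average the normalized score of `alg` over all benchmarks in `results`."""
    return mean(
        normalized_score(r[alg], r["Random"], r["Myopic Oracle"])
        for r in results.values()
    )


# Made-up numbers purely for illustration; these are not results from the paper.
example = {
    "CartPole": {"Random": 0.25, "Myopic Oracle": 0.75, "SD-MBTL": 0.5},
    "BipedalWalker": {"Random": 0.0, "Myopic Oracle": 1.0, "SD-MBTL": 0.75},
}
print(aggregated_performance(example, "SD-MBTL"))
```

Under this assumed formula, an algorithm that does worse than Random or better than the Myopic Oracle on some benchmark would fall outside the 0 to 1 range; the sketch does not clip such values.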

This project is licensed under the MIT License. See the LICENSE file for details. Each benchmark environment is licensed under its own license. Please refer to the respective repositories for more information.
This codebase is built upon the following repositories:
Coming soon
This is an anonymous codebase for the paper "Structure Detection for Contextual Reinforcement Learning (Under Review)". The code is not intended for public release. If you are interested in the code or have any questions, please contact the authors after the review process is completed.
