Meta-gradient incentive design

This repository contains the code for the experiments in the paper "Adaptive Incentive Design with Multi-Agent Meta-Gradient Reinforcement Learning", published at AAMAS 2022. Implementations of the baselines are included.

Setup

$ python3.6 -m venv <name of your venv>
$ source <venv>/bin/activate
$ pip install --upgrade pip
$ git clone https://github.com/011235813/metagradient-incentive-design.git
$ cd metagradient-incentive-design && pip install -e .
$ pip install -r requirements.txt
Clone and pip install Sequential Social Dilemma, which is a fork of the original open-source implementation.
Clone and pip install AI Economist, which is a fork of the original open-source implementation.
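
Before installing the two forks, it can help to confirm that the virtual environment from the first steps is actually active. The following is a minimal sketch that uses only the Python standard library; the helper names `in_virtualenv` and `python_matches` are illustrative and are not part of this repository.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside an environment created by venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


def python_matches(major, minor):
    """Return True when the running interpreter has the given major.minor version."""
    return sys.version_info[:2] == (major, minor)


print("virtual environment active:", in_virtualenv())
print("running Python 3.6:", python_matches(3, 6))
```

If the first line prints False, the `pip install` commands above would install into the system interpreter rather than the venv.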

Navigation

alg/ - Implementation of MetaGrad and dual-RL baselines
configs/ - Experiment configuration files. Hyperparameters are specified here.
env/ - Implementation of 1) Escape Room game, 2) wrapper around the SSD environment, 3) wrapper around the Gather-Trade-Build scenario of the Foundation environment
results/ - Training results are stored in subfolders here. Each independent training run creates a subfolder that contains the final TensorFlow model and the reward log files. For example, training MetaGrad without curriculum on the 15x15 GTB map of Foundation would create results/foundation/15x15_nocurr_m1 (the exact name depends on configurable strings in the config files).
utils/ - Utility methods
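
To illustrate how a run's output folder could be derived from configuration strings, here is a small sketch. The parameter names `exp_name` and `dir_name` are assumptions made for this example; check the files in configs/ for the actual keys used by the training scripts.

```python
from pathlib import PurePosixPath


def results_dir(exp_name, dir_name, root="results"):
    """Join the results root, experiment name, and run name into one path.

    exp_name and dir_name are hypothetical stand-ins for the configurable
    strings mentioned above; the real config keys may be named differently.
    """
    return str(PurePosixPath(root) / exp_name / dir_name)


# Reproduces the example path given in the description of results/.
print(results_dir("foundation", "15x15_nocurr_m1"))
# -> results/foundation/15x15_nocurr_m1
```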

Examples

Train MetaGrad on Escape Room

Set config values in configs/config_er_pg.py
cd into the alg folder
Execute the training script: $ python train_er.py pg
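
The steps above can also be scripted, which is convenient when launching several runs. The sketch below is an assumption-laden wrapper, not part of this repository: `training_command` and `run_training` are hypothetical helpers, and `alg_dir` must be set to wherever the alg folder lives in your checkout.

```python
import subprocess
import sys


def training_command(script, alg):
    """Build the command line for a training script, using the current interpreter."""
    return [sys.executable, script, alg]


def run_training(script, alg, alg_dir):
    """Run a training script from inside the alg folder.

    check=True makes subprocess.run raise CalledProcessError if the
    script exits with a non-zero status.
    """
    return subprocess.run(training_command(script, alg), cwd=alg_dir, check=True)


# Example call (not executed here); adjust alg_dir to your checkout:
# run_training("train_er.py", "pg", alg_dir="alg")
```

The same helper would work for the other training scripts in this README by passing their script name and argument.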

Train MetaGrad on Cleanup

Set config values in configs/config_ssd.py
cd into the alg folder
Execute the training script: $ python train_ssd.py ac

Train MetaGrad on GTB

Training without curriculum

Set config values in configs/config_foundation_ppo.py
cd into the alg folder
Execute the training script: $ python train_foundation.py ppo

Training with curriculum, i.e. starting from a policy pretrained on a free-market scenario

Set config values in configs/config_foundation_ppo_curriculum.py
The pretrained model is located at results/foundation/15x15_phase1_free_market/model.ckpt
cd into the alg folder
Execute the training script: $ python train_foundation.py curr
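
Curriculum training depends on the pretrained model being present, so a quick check before launching can save a failed run. Note that model.ckpt is typically a checkpoint prefix rather than a single file: TensorFlow usually writes several files that share it (for example an index file and one or more data shards). The sketch below simply lists files that start with the prefix; `checkpoint_files` is a hypothetical helper, and the demonstration uses a temporary directory rather than the real results folder.

```python
import glob
import os
import tempfile

# Path taken from the step above; relative to the repository layout.
PRETRAINED_PREFIX = "results/foundation/15x15_phase1_free_market/model.ckpt"


def checkpoint_files(prefix):
    """List files on disk whose names start with the checkpoint prefix."""
    # glob.escape guards against special characters in the path itself.
    return sorted(glob.glob(glob.escape(prefix) + "*"))


# Self-contained demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "model.ckpt")
    before = checkpoint_files(prefix)      # nothing written yet
    open(prefix + ".index", "w").close()   # create a stand-in checkpoint file
    after = checkpoint_files(prefix)
    demo_found = [os.path.basename(p) for p in after]

print(before, demo_found)
```

In practice, calling `checkpoint_files(PRETRAINED_PREFIX)` from the directory that contains results/ and getting an empty list would indicate that the pretrained model is missing or that the working directory is wrong.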

Citation

@inproceedings{yang2022adaptive,
  title={Adaptive Incentive Design with Multi-Agent Meta-Gradient Reinforcement Learning},
  author={Yang, Jiachen and Wang, Ethan and Trivedi, Rakshit and Zhao, Tuo and Zha, Hongyuan},
  booktitle={Proceedings of the 21st International Conference on Autonomous Agents and MultiAgent Systems},
  pages={1436--1445},
  year={2022}
}

License

See LICENSE.

SPDX-License-Identifier: MIT
