This repository provides the code for the experiments in the paper CM3 [1], published at ICLR 2020. It contains the main algorithm and baselines, as well as the three simulated Markov games on which the algorithms were evaluated.
- All experiments were run on Ubuntu 16.04
- Python 3.6
- TensorFlow 1.10
- SUMO
- pygame: `sudo apt-get install python-pygame`
- OpenAI Gym 0.12.1
- `alg`: Implementation of algorithms and config files. `config.json` is the main config file. `config_particle_*.json` specifies various instances of the cooperative navigation task. `config_sumo_stage{1,2}.json` specifies agent initial/goal lane configurations for SUMO. `config_checkers_stage{1,2}.json` specifies parameters of the Checkers game.
- `env`: Python wrappers/definitions of the simulation environments.
- `env_sumo`: XML files that define the road and traffic for the underlying SUMO simulator.
- `log`: Each experiment run will create a subfolder that contains the reward values logged during the training or test run.
- `saved`: Each experiment run will create a subfolder that contains trained TensorFlow models.
There are three simulations, selected by the `experiment` field in `alg/config.json` (a config sketch follows the list below).
- Cooperative navigation: particles must move to individual target locations while avoiding collisions.
  - Environment code located in `env/multiagent-particle-envs/`
- SUMO
  - Stage 1: single agent on empty road. Corresponds to setting `"stage" : 1`
  - Stage 2: two agents on empty road. Corresponds to setting `"stage" : 2`
  - Python wrappers located in `env/`. Entry point is `env/multicar_simple.py`
  - SUMO topology and traffic defined in `env_sumo/simple/`
- Checkers: two agents cooperate to collect rewards while avoiding penalties in a checkered map.
  - Implemented in `env/checkers.py`
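As a minimal illustration of how the simulation is selected, the relevant part of `alg/config.json` might look like the following. The `experiment` and `stage` field names come from this README; the surrounding structure (and treating them as top-level keys) is an assumption rather than the file's exact schema.

```json
{
    "experiment": "sumo",
    "stage": 1
}
```

Setting `experiment` to `"particle"`, `"sumo"`, or `"checkers"` selects the corresponding simulation described above; `stage` applies to the SUMO curriculum.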
- Cooperative navigation: run `pip install -e .` inside `env/multiagent-particle-envs/`
- SUMO: install SUMO and add the following to your `.bashrc`:
  - `export PYTHONPATH=$PYTHONPATH:path/to/sumo`
  - `export PYTHONPATH=$PYTHONPATH:path/to/sumo/tools`
  - `export SUMO_HOME="path/to/sumo"`
- Checkers: None required
Cooperative navigation
- In `config.json`, set `experiment: "particle"`. `particle_config` should be one of `config_particle_stage1.json`, `config_particle_stage2_antipodal.json`, `config_particle_stage2_cross.json`, `config_particle_stage2_merge.json` (an illustrative sketch follows below).
- Inside `alg/`, execute `python train_onpolicy.py`
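For illustration only, the relevant fields might look like this (the field names `experiment` and `particle_config` are taken from this README; the enclosing structure is assumed, not the file's exact schema):

```json
{
    "experiment": "particle",
    "particle_config": "config_particle_stage1.json"
}
```

Swapping `particle_config` to one of the Stage 2 files listed above selects the antipodal, cross, or merge task instead.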
SUMO
- In `config.json`, set `experiment: "sumo"`. `port`: if multiple SUMO experiments are run in parallel, each experiment must have its own unique port number (see the sketch after this list).
- Inside `alg/`, execute `python train_offpolicy.py --env ../env_sumo/simple/merge.sumocfg`
- Include the option `--gui` to show the SUMO GUI while training (at the cost of increased runtime)
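A rough sketch of the SUMO-related settings (again, `experiment`, `stage`, and `port` are the field names mentioned in this README; the structure and the example port value are assumptions):

```json
{
    "experiment": "sumo",
    "stage": 1,
    "port": 8813
}
```

The port value `8813` is only an example; any number works as long as each concurrently running SUMO experiment uses a distinct port.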
Checkers
- In `config.json`, set `experiment: "checkers"`
- Inside `alg/`, execute `python train_offpolicy.py`
Key fields in `config.json` (illustrative Stage 1 and Stage 2 sketches follow the list):
- `stage`: either 1 or 2
- `dir_restore`: for Stage 2 of CM3, this must equal the string used for `dir_name` when Stage 1 was run.
- `use_alg_credit`: 1 for CM3
- `use_Q_credit`: 1 for CM3; 0 for the ablation that uses a value function baseline.
- `train_from_nothing`: 1 for Stage 1 of CM3, or for the ablation that omits the curriculum; 0 to allow restoring a trained Stage 1 model.
- `model_name`: when training Stage 2 and restoring a Stage 1 model, this must be the name of the Stage 1 model.
- `prob_random`: 1.0 for Stage 1, 0.2 for Stage 2. Not applicable for Checkers.
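To make the two-stage curriculum concrete, hypothetical Stage 1 and Stage 2 settings could look like the sketches below. Only fields named above are shown; the `dir_name` and `model_name` values are placeholders, and the overall structure of `config.json` is an assumption.

Stage 1 (train from scratch, fully random exploration):

```json
{
    "stage": 1,
    "dir_name": "sumo_stage1",
    "train_from_nothing": 1,
    "use_alg_credit": 1,
    "use_Q_credit": 1,
    "prob_random": 1.0
}
```

Stage 2 (restore the trained Stage 1 model):

```json
{
    "stage": 2,
    "dir_name": "sumo_stage2",
    "dir_restore": "sumo_stage1",
    "model_name": "model_stage1",
    "train_from_nothing": 0,
    "use_alg_credit": 1,
    "use_Q_credit": 1,
    "prob_random": 0.2
}
```

Note that `dir_restore` in Stage 2 matches the `dir_name` used in Stage 1, as the curriculum requires.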
@inproceedings{yang2019cm3,
title={CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning},
author={Yang, Jiachen and Nakhaei, Alireza and Isele, David and Fujimura, Kikuo and Zha, Hongyuan},
booktitle={International Conference on Learning Representations},
year={2019}
}
See LICENSE.
SPDX-License-Identifier: MIT