Commit cde3a3a

Initial setup
0 parents  commit cde3a3a

76 files changed

Lines changed: 4177 additions & 0 deletions

.gitignore

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
.idea/

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2022 Vlad

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
# Multi-agent Path Finding using Reinforcement Learning


![PyTorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=flat&logo=PyTorch&logoColor=white)
![Poetry](https://img.shields.io/badge/Poetry-%2300C4CC.svg?style=flat&logo=Poetry&logoColor=white)
![Black](https://img.shields.io/badge/code%20style-black-000000.svg)

## Description

**Multi-agent pathfinding in partially observable environments. Search-based vs. RL-based algorithms.**

The main goal of this repository is to provide an implementation of the DHC [1] model, along with benchmarks and charts.
We also aim to compare the performance of the DHC model with the basic M* algorithm.

## Requirements
To run `models.dhc.train` successfully, you need a machine with one GPU and several CPU cores.
Consider configuring `num_cpus - 2` actors through `dhc.train.num_actors` in `config.yaml`.

**Attention: We do not guarantee the desired performance on a non-GPU machine.**

While we aim to support macOS, Linux and Windows, successful training is not guaranteed on a Windows machine.
The benchmarking script should work there, though. Please report it [here](https://github.com/acforvs/po-mapf-thesis/issues) if it doesn't.

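
The `num_cpus - 2` guideline above can be computed with a small helper. This is only a sketch: the function name `recommended_num_actors` and its parameters are hypothetical and not part of this repository, and the result still has to be written to `dhc.train.num_actors` in `config.yaml` by hand.

```python
import os
from typing import Optional


def recommended_num_actors(total_cores: Optional[int] = None, reserved: int = 2) -> int:
    """Return total_cores - reserved, but never less than 1.

    Hypothetical helper: `reserved` models the cores left free for the
    non-actor processes (2 in the guideline above).
    """
    if total_cores is None:
        # os.cpu_count() can return None when the count cannot be determined.
        total_cores = os.cpu_count() or 1
    return max(1, total_cores - reserved)


if __name__ == "__main__":
    print(f"dhc.train.num_actors: {recommended_num_actors()}")
```

On a 20-core machine this yields 18 actors.
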
## Setting up
1. Install [Poetry](https://python-poetry.org)
2. Run [poetry install](https://python-poetry.org/docs/cli/#install) to install the dependencies

If you see the error ``Failed to create the collection: Prompt dismissed..`` when running `poetry install`, [consider](https://github.com/python-poetry/poetry/issues/1917#issuecomment-1251667047) executing this line first:
```shell
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
```

## Repository description & Usage
1. The `models` dir contains the weights of the trained models
2. `config.yaml` contains the training and model parameters, environment settings, etc.
3. `pathfinding/models` contains the implementations of the different models

## Models
### DHC

**D**istributed, **H**euristic and **C**ommunication [1]

> To guide RL algorithm on long-horizon goal-oriented tasks, we embed the potential choices of shortest paths from single source as heuristic guidance instead of using a specific path as in most existing works. Our method treats each agent independently and trains the model from a single agent’s perspective. The final trained policy is applied to each agent for decentralized execution. The whole system is distributed during training and is trained under a curriculum learning strategy.

![visAfter](./static/DHC_architecture.png)
![visAfter](./static/DHC_training.png)

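
To make the "potential choices of shortest paths" idea from the quote above more concrete, here is a minimal sketch, not the repository's implementation. It assumes a 4-connected grid where `0` is a free cell and `1` is an obstacle; the names `MOVES`, `distances_to_goal` and `heuristic_channels` are hypothetical. For each of the four moves it marks the cells from which that move brings an agent strictly closer to its goal, so several moves can be marked at once when more than one shortest path exists.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Grid = List[List[int]]  # 0 = free cell, 1 = obstacle (assumed encoding)
MOVES: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1),
}


def distances_to_goal(grid: Grid, goal: Tuple[int, int]) -> List[List[Optional[int]]]:
    """Breadth-first search from the goal; unreachable cells stay None."""
    rows, cols = len(grid), len(grid[0])
    dist: List[List[Optional[int]]] = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES.values():
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def heuristic_channels(grid: Grid, goal: Tuple[int, int]) -> Dict[str, List[List[bool]]]:
    """One boolean map per move: True where the move reduces the distance to the goal."""
    dist = distances_to_goal(grid, goal)
    rows, cols = len(grid), len(grid[0])
    channels: Dict[str, List[List[bool]]] = {}
    for name, (dr, dc) in MOVES.items():
        channel = [[False] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                nr, nc = r + dr, c + dc
                if dist[r][c] is None or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if dist[nr][nc] is not None and dist[nr][nc] < dist[r][c]:
                    channel[r][c] = True
        channels[name] = channel
    return channels
```

On a 3x3 grid with an obstacle in the centre and the goal in the top-left corner, both `up` and `left` are marked for the bottom-right cell, because either move lies on a shortest path.
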
<details>
<summary>DHC</summary>

#### Benchmarking

**To run the generated test suite, run**
```shell
poetry run python3 pathfinding/models/dhc/evaluate.py test_model TESTS_DESCR MODEL_ID
```
where
* TESTS_DESCR is a string of the format `'[(map_length, num_agents, density), ...]'` (you may want to copy this string from the generation command)
* MODEL_ID is the name of the file from the `models` dir (without the `.pth` extension, as in the example below)

For example, by running

```shell
poetry run python3 pathfinding/models/dhc/evaluate.py test_model '[(40, 16, 0.3), (80, 4, 0.1)]' 60000
```
you will benchmark `models/60000.pth` on the provided test cases.

**Attention: the test cases must be generated first!**

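
The `TESTS_DESCR` format described above is a Python literal, so it can be parsed with the standard library. The sketch below is an illustration only: `parse_tests_descr` and `suite_file_name` are hypothetical names, and the file-name pattern is an assumption inferred from the example `80length_32agents_0.3density.pkl` given in this README, not a confirmed detail of `evaluate.py`.

```python
import ast
from typing import List, Tuple

TestCase = Tuple[int, int, float]  # (map_length, num_agents, density)


def parse_tests_descr(descr: str) -> List[TestCase]:
    """Parse a string such as '[(40, 16, 0.3), (80, 4, 0.1)]' into test cases."""
    # ast.literal_eval only evaluates literals, so arbitrary code is rejected.
    raw_cases = ast.literal_eval(descr)
    cases: List[TestCase] = []
    for map_length, num_agents, density in raw_cases:
        if not 0.0 <= density <= 1.0:
            raise ValueError(f"density must be in [0, 1], got {density}")
        cases.append((int(map_length), int(num_agents), float(density)))
    return cases


def suite_file_name(map_length: int, num_agents: int, density: float) -> str:
    """Assumed naming scheme, mirroring `80length_32agents_0.3density.pkl`."""
    return f"{map_length}length_{num_agents}agents_{density}density.pkl"


if __name__ == "__main__":
    for case in parse_tests_descr("[(40, 16, 0.3), (80, 4, 0.1)]"):
        print(case, suite_file_name(*case))
```

Parsing the description up front makes it easy to check that a matching test file has been generated for every case before a benchmark is started.
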
#### Training
1. Set the desired number of actors via `dhc.train.num_actors` in `config.yaml`

We recommend using the number of CPU cores on your machine minus 2.

2. To start training, run
```shell
poetry run python3 pathfinding/models/dhc/train.py
```

The `models` dir will then be created, and the weights of the intermediate models will be saved there.

#### Visualizing

1. To visualize the results, run
```shell
poetry run python3 pathfinding/models/dhc/visualize.py MODEL_ID TEST_NAME TEST_ID
```
where
* MODEL_ID is the name of the file from the `models` dir
* TEST_NAME is the name of the file with tests, for example `80length_32agents_0.3density.pkl`
* TEST_ID (optional) is the id of a test from the provided test suite

</details>

## The setup
The DHC network was trained on a single [NVIDIA Tesla T4 GPU](https://www.nvidia.com/en-us/data-center/tesla-t4/) for 7 hours.

We used 20 CPU cores: 18 for the actors and 2 shared by the Learner and the GlobalBuffer.


## DHC Results

**Our trained model outperforms the PRIMAL [2] benchmarks**

![visAfter](./static/chart_40x40.png)
![visAfter](./static/chart_80x80.png)

![visAfter](./static/DHC_10x10_4_good.gif)
![visAfter](./static/DHC_40x40_4_good.gif)
![visAfter](./static/DHC_40x40_16_good.gif)
![visAfter](./static/DHC_40x40_16_dense.gif)


## Contributing
<details>
<summary>See the detailed contribution guide</summary>

1. Install [black](https://github.com/psf/black); you can likely run
```shell
pip3 install black
```

2. Use [black](https://github.com/psf/black) to keep the code style consistent
```shell
poetry run black dir
```
3. Make sure the tests pass
```shell
poetry run pytest
```
4. Create a PR with new features
</details>

## References

<a id="1">[1]</a>
Ma, Z., Luo, Y. and Ma, H., 2021. Distributed Heuristic Multi-Agent Path Finding with Communication.

<a id="2">[2]</a>
Sartoretti, G., Kerr, J., Shi, Y., Wagner, G., Kumar, T.S., Koenig, S. and Choset, H., 2019. Primal: Pathfinding via reinforcement and imitation multi-agent learning. IEEE Robotics and Automation Letters, 4(3), pp.2378-2385.

## License

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/acforvs/po-mapf-thesis/blob/main/LICENSE)

config.yaml

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
dhc:
  cnn_channels: 128
  fov: !!python/tuple [9, 9]
  observation_radius: 4 # since the FOV is 9x9
  observation_shape: !!python/tuple [6, 9, 9]
  hidden_dim: 256
  max_comm_agents: 3 # includes the agent itself
  batch_size: 192
  max_num_agents: 2
  latent_dim: 784 # 16 * 7 * 7, do not forget to change if the observation_shape is changed
  max_episode_length: 256

  communication:
    num_comm_layers: 2
    num_comm_heads: 2

  buffer:
    action_dim: 5
    forward_steps: 2

  worker:
    episode_capacity: 2048
    init_env_settings: !!python/tuple [ 1, 10 ]
    max_comm_agents: 3
    prioritized_replay_alpha: 0.6
    prioritized_replay_beta: 0.4
    forward_steps: 2
    seq_len: 16
    max_map_length: 40
    pass_rate: 0.9
    learning_starts: 50000
    training_times: 600000
    target_network_update_freq: 2000
    save_interval: 2000
    actor_update_steps: 400

  train:
    num_actors: 6
    log_interval: 10


environment:
  map_length: 50
  num_agents: 2
  observation_radius: 4
  reward_fn:
    move: -0.075
    stay_on_goal: 0
    stay_off_goal: -0.075
    collision: -0.5
    finish: 3

  init_env_settings: !!python/tuple [1, 10]
  observation_shape: !!python/tuple [6, 9, 9]
  action_dim: 5
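
As an illustration of how the `reward_fn` values above add up over an episode, here is a minimal sketch. The reward values are copied from `config.yaml`; the function `episode_return` and the idea of representing each step as one event name are assumptions made for this example, not the environment's actual interface.

```python
from typing import Dict, Iterable

# Values copied from the `reward_fn` block of config.yaml.
REWARD_FN: Dict[str, float] = {
    "move": -0.075,
    "stay_on_goal": 0.0,
    "stay_off_goal": -0.075,
    "collision": -0.5,
    "finish": 3.0,
}


def episode_return(events: Iterable[str], reward_fn: Dict[str, float] = REWARD_FN) -> float:
    """Sum the per-step rewards of one agent (undiscounted).

    Hypothetical helper: each element of `events` is assumed to be one of
    the keys of `reward_fn`, one event per time step.
    """
    total = 0.0
    for event in events:
        if event not in reward_fn:
            raise KeyError(f"unknown event: {event!r}")
        total += reward_fn[event]
    return total


if __name__ == "__main__":
    # Ten moves followed by reaching the goal: roughly 10 * (-0.075) + 3.
    print(episode_return(["move"] * 10 + ["finish"]))
```

With these values, moving and waiting off the goal cost the same per step, so an agent gains nothing by idling away from its goal, while a single collision costs as much as several moves.
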
