In this project we introduce MIMIC-D, a CTDE (Centralized Training, Decentralized Execution) framework that learns decentralized diffusion policies from multi-agent expert demonstrations to recover diverse, coordinated behaviors without explicit inter-agent communication.
Many real-world multi-agent tasks admit multiple valid coordination modes (e.g., pass-left vs. pass-right) and cannot rely on a centralized planner or explicit communication at execution time. MIMIC-D trains policies jointly with full information, then executes each agent's policy on only its local observations, enabling implicit coordination while preserving multi-modality in the learned behaviors. We validate MIMIC-D in multiple simulation environments and on a bimanual hardware setup with heterogeneous arms (Kinova3 + xArm7).
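To make the decentralized-execution side concrete, here is a minimal, self-contained sketch of how a single agent could sample an action from a diffusion policy conditioned only on its local observation. All names (`sample_action`, `toy_denoiser`) and the noise schedule are illustrative assumptions, not the actual MIMIC-D API; the loop is a standard DDPM-style reverse process.

```python
import numpy as np

def sample_action(denoiser, local_obs, act_dim, n_steps=10, rng=None):
    """Reverse-diffusion sampling: start from Gaussian noise and iteratively
    denoise toward an action, conditioning only on this agent's observation."""
    if rng is None:
        rng = np.random.default_rng(0)
    betas = np.linspace(1e-4, 0.1, n_steps)       # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    a = rng.standard_normal(act_dim)              # a_T ~ N(0, I)
    for t in reversed(range(n_steps)):
        eps_hat = denoiser(a, local_obs, t)       # predicted noise at step t
        a = (a - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                                  # no noise on the final step
            a += np.sqrt(betas[t]) * rng.standard_normal(act_dim)
    return a

# Toy stand-in for a learned denoiser: pretends the expert action equals the
# local observation, so predicted noise points from obs toward the sample.
def toy_denoiser(a, obs, t):
    return a - obs

action = sample_action(toy_denoiser, np.ones(2), act_dim=2)
```

In the real system the denoiser is a learned transformer trained centrally on joint demonstrations; the key point of the sketch is that sampling at deployment needs only `local_obs`, with no inter-agent communication.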
dependencies/
— conda environment file (it may be easier to simply install dependencies as you go)
lift/
— simulated two-arm pot-lifting experiment in robosuite
lift_hardware/
— two-arm pot-lifting experiment on Kinova3 and xArm7 hardware
three_agent_road/
— three-agent road-crossing environment
two_agent_swap/
— two-agent swap environment
docs/
— all the elements to build the project website
- TODO: environment setup (conda, CUDA/cuDNN, PyTorch version, robosuite, etc.)
- TODO: data preparation (where to download / how to format expert demos)
- TODO: training (commands & key flags)
- TODO: sampling / evaluation (receding-horizon execution, metrics, plotting)
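Pending the full evaluation instructions above, the receding-horizon execution pattern they refer to can be sketched generically. All names here (`receding_horizon_rollout`, `plan_fn`, `step_fn`) are hypothetical placeholders: each cycle samples a short action plan, executes only its first few steps, then replans from the new observation.

```python
import numpy as np

def receding_horizon_rollout(plan_fn, step_fn, obs,
                             horizon=8, n_execute=2, total_steps=16):
    """Sample an `horizon`-step plan, execute its first `n_execute` actions,
    observe, and replan until `total_steps` actions have been taken."""
    trajectory = []
    while len(trajectory) < total_steps:
        plan = plan_fn(obs, horizon)          # e.g., a diffusion-sampled action sequence
        for action in plan[:n_execute]:       # execute only the head of the plan
            obs = step_fn(obs, action)
            trajectory.append(action)
            if len(trajectory) >= total_steps:
                break
    return trajectory

# Toy plan/step functions for illustration only.
plan_fn = lambda obs, h: [obs * 0.5] * h
step_fn = lambda obs, a: obs - a
traj = receding_horizon_rollout(plan_fn, step_fn, np.array([1.0]))
```

Replanning frequently (small `n_execute`) keeps the agents reactive to each other's behavior, which is what makes implicit coordination possible without communication.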
If you use MIMIC-D, please cite:
TBD
Our diffusion transformer architecture is largely based on the AlignDiff codebase.