The official codebase for running the Spatial Policy experiments.
This repository provides scripts and checkpoints for running Meta-World, iTHOR and Real-World benchmarks using the Spatial Policy model.
You can find the codebase for training video policies here.
Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning
Yijun Liu, Yuwei Liu, Yuan Meng, JieHeng Zhang, Yuwei Zhou, Ye Li, Jiacheng Jiang, Kangye Ji, Shijia Ge, Zhi Wang, Wenwu Zhu
If you find this work useful in your research, please cite:
@article{liu2025spatialpolicy,
author = {Liu, Yijun and Liu, Yuwei and Meng, Yuan and Zhang, JieHeng and Zhou, Yuwei and Li, Ye and Jiang, Jiacheng and Ji, Kangye and Ge, Shijia and Wang, Zhi and Zhu, Wenwu},
title = {Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning},
journal = {arXiv preprint arXiv:2508.15874},
year = {2025},
eprint = {2508.15874},
archivePrefix = {arXiv},
primaryClass = {cs.RO}
}

We recommend creating a new Python environment with PyTorch 2.2.0 + CUDA 11.8:
conda create -n SP_exp python=3.9
conda activate SP_exp
conda install pytorch=2.2.0 torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

Clone this repository and install the requirements:
git clone https://github.com/PlantPotatoOnMoon/SP_exp.git
cd SP_exp
pip install -r requirements.txt

python experiment/benchmark_mw.py \
--env_name door-close-v2-goal-observable \
--ckpt_dir ckpts/mw \
--milestone 4652204 \
--idm dp \
--idm_ckpt ckpts/mw/mw.ckpt \
--video_replan False \
--reality_replan False

Arguments
--env_name: Meta-World environment name.
--ckpt_dir: Path to pretrained video policy checkpoints.
--milestone: Specific milestone of the video policy to load.
--idm: Type of inverse dynamics module.
--idm_ckpt: Path to pretrained action module (IDM) checkpoint.
--video_replan: Whether to replan during video generation.
--reality_replan: Whether to replan during real execution.
Make sure you have the corresponding Meta-World checkpoints downloaded.
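To benchmark several Meta-World environments in a row, the command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_mw_command` and `run_sweep` are hypothetical, the defaults simply mirror the example command, and it assumes the script is launched from the repository root. Only the flags documented above are used.

```python
import subprocess
import sys
from typing import List, Sequence


def build_mw_command(env_name: str,
                     milestone: int = 4652204,
                     ckpt_dir: str = "ckpts/mw",
                     idm: str = "dp",
                     idm_ckpt: str = "ckpts/mw/mw.ckpt",
                     video_replan: bool = False,
                     reality_replan: bool = False) -> List[str]:
    """Assemble the argv for experiment/benchmark_mw.py (hypothetical helper)."""
    return [
        sys.executable, "experiment/benchmark_mw.py",
        "--env_name", env_name,
        "--ckpt_dir", ckpt_dir,
        "--milestone", str(milestone),
        "--idm", idm,
        "--idm_ckpt", idm_ckpt,
        # Booleans are passed as the literal strings "True"/"False",
        # matching the example command.
        "--video_replan", str(video_replan),
        "--reality_replan", str(reality_replan),
    ]


def run_sweep(env_names: Sequence[str], dry_run: bool = True) -> List[List[str]]:
    """Print one command per environment; execute them only if dry_run is False."""
    commands = [build_mw_command(name) for name in env_names]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Only the environment from the example above is listed; add more names as needed.
    run_sweep(["door-close-v2-goal-observable"])
```

With `dry_run=True` (the default) the sketch only prints the commands, which is a cheap way to inspect them before starting long runs.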
python experiment/benchmark_thor.py \
--scene FloorPlan1 \
--target Bread \
--ckpt_dir ckpts/thor \
--milestone 3003003 \
--ckpt ckpts/thor/thor.ckpt

Arguments
--scene: iTHOR scene name.
--target: Target object to interact with.
--ckpt_dir: Path to pretrained video policy checkpoints for iTHOR.
--milestone: Specific milestone of the video policy to load.
--idm_ckpt: Path to pretrained action module (IDM) checkpoint.
Make sure you have the corresponding iTHOR checkpoints downloaded.
We provide the pretrained model checkpoints used in our experiments, hosted on HuggingFace. You can directly download them via the following links:
Coming soon.
Download the .pt and .ckpt files and place them in the ckpts/[environment] folder. The resulting directory structure should be ckpts/{mw, thor, real}/model-[x].pt and ckpts/{mw, thor, real}/{mw, thor, real}.ckpt, for example ckpts/mw/model-4652204.pt.
bash download.sh metaworld
# bash download.sh thor
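Before launching a benchmark, it can help to confirm that the downloaded files are where the scripts expect them. The sketch below is not part of the repository: `missing_checkpoints` is a hypothetical helper, and it assumes the `ckpts/[environment]` layout used by the benchmark commands, with `model-[x].pt` for the video policy and `[environment].ckpt` for the action module. Pass a different `root` if your files live elsewhere.

```python
from pathlib import Path
from typing import List


def missing_checkpoints(env: str, milestone: int, root: str = "ckpts") -> List[Path]:
    """Return the expected checkpoint files that are not present on disk.

    Assumed layout (hypothetical helper, layout taken from the text above):
        <root>/<env>/model-<milestone>.pt   video policy checkpoint
        <root>/<env>/<env>.ckpt             action module (IDM) checkpoint
    """
    env_dir = Path(root) / env
    expected = [
        env_dir / f"model-{milestone}.pt",
        env_dir / f"{env}.ckpt",
    ]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Milestone taken from the Meta-World example command.
    missing = missing_checkpoints("mw", 4652204)
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("all Meta-World checkpoints found")
```

An empty return value means both files were found; otherwise the list names exactly which paths still need to be downloaded.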
This codebase builds upon and is inspired by:
For questions or discussions, please open an issue or contact the authors via the project website.