
Spatial Policy

The official codebase for training video policies in Spatial Policy.

This repository contains the code for training the video policies presented in our work
Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning.

NEWS: We have released another repository for running our Meta-World and iTHOR experiments (https://github.com/PlantPotatoOnMoon/SP_exp)!

website | paper | arXiv | experiment repo

Framework

Getting started

We recommend creating a new conda environment with PyTorch installed:

conda create -n spatialpolicy python=3.9
conda activate spatialpolicy
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

Next, clone the repository and install the requirements:

git clone https://github.com/PlantPotatoOnMoon/SpatialPolicy
cd SpatialPolicy
pip install -r requirements.txt

Dataset structure

The PyTorch dataset classes are defined in flowdiffusion/datasets.py.
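To sanity-check a dataset before training, a minimal sketch like the one below can help. Note that the class name SequentialDataset and its constructor arguments are hypothetical placeholders, not necessarily the repository's actual API; substitute the class you find in flowdiffusion/datasets.py.

# Hypothetical sketch for inspecting a dataset from flowdiffusion/datasets.py.
# SequentialDataset and its arguments are placeholders; use the real class names.
from torch.utils.data import DataLoader
from datasets import SequentialDataset  # assumed importable when run from flowdiffusion/

dataset = SequentialDataset(path="../datasets/metaworld")  # hypothetical arguments
loader = DataLoader(dataset, batch_size=8, shuffle=True)

batch = next(iter(loader))  # e.g. (video frames, task text) pairs
print(len(dataset))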

Training models

For Meta-World experiments, run

cd flowdiffusion
python train_mw.py --mode train
# or python train_mw.py -m train

or launch with Accelerate:

accelerate launch train_mw.py
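If you have not set up Accelerate before, configure it once first. The multi-GPU launch below is only an example; adjust the process count to your hardware:

accelerate config
accelerate launch --num_processes 4 train_mw.py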

For iTHOR experiments, run train_thor.py instead of train_mw.py.
For real-robot experiments, run train_real.py instead of train_mw.py (this script will be uploaded soon).
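For example, assuming train_thor.py accepts the same flags as train_mw.py (an assumption based on the Meta-World command above):

python train_thor.py --mode train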

The trained models are saved in the ../results folder.

To resume training, use the -c / --checkpoint_num argument.

# This resumes training from the 1st checkpoint (which should be named model-1.pt)
python train_mw_feedback.py --mode train -c 1

Inference

Use the following arguments for inference:

-p --inference_path: specify the input video path
-t --text: specify the text description of the task
-n sample_steps: optional, the number of steps used in test-time sampling. If the specified value is less than 100, DDIM sampling will be used.
-g guidance_weight: optional, the weight used for classifier-free guidance. Set it to a positive value to turn on classifier-free guidance.

For example:

python train_mw.py --mode inference -c 4652204 -p ../examples/assembly.gif -t assembly -g 2 -n 20

Pretrained models

We also provide checkpoints for the models described in our experiments, as follows.

Meta-World

SpatialPolicy/Meta-World

iTHOR

SpatialPolicy/iThor

Real

Will be uploaded soon.
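If the names above are Hugging Face Hub repository IDs (an assumption; adjust to wherever the checkpoints are actually hosted), a checkpoint can be fetched with huggingface_hub:

# Assumption: checkpoints live on the Hugging Face Hub under the names listed above.
from huggingface_hub import snapshot_download
snapshot_download(repo_id="SpatialPolicy/Meta-World", local_dir="../results")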

Acknowledgements

This codebase is modified from the following repositories:
AVDC, VideoAgent
