# WL Learning

This repository implements reinforcement learning algorithms for training wheel-legged robots in MuJoCo simulation environments. It supports multiple RL algorithms including PPO (via RSL-RL), DDPG, and TD3, with distributed training capabilities for both local machines and HPC clusters. The project includes custom MuJoCo environments, training scripts, visualization tools, and data analysis utilities for studying locomotion behaviors.

## Environment Setup

It is recommended to use mamba to create and manage the environment. See the Miniforge repository (https://github.com/conda-forge/miniforge) for how to set up Miniforge3, which ships with mamba. Note: when the installer prompts you to initialize bash, choose yes.

```bash
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
source ~/.bashrc
```
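
As a quick sanity check (optional, not part of the installer's own steps), confirm that mamba is now on your PATH:

```bash
mamba --version
```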

After installing and configuring mamba, run the following commands:

```bash
# create the virtual environment
mamba create -n wl_learning python=3.12
mamba activate wl_learning

# install the simulator stack
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install warp-lang
pip install jax[cuda12]
pip install playground --extra-index-url=https://py.mujoco.org --extra-index-url=https://pypi.nvidia.com/warp-lang/
pip install imageio imageio-ffmpeg

# install algorithm and logging libraries
pip install rsl-rl-lib==2.3.3  # pinned to 2.3.x for compatibility
pip install stable-baselines3[extra]
pip install wandb tensorboard
pip install seaborn scikit-learn
```
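
If you installed the CUDA builds above, a quick optional check (not part of the original instructions) confirms that both PyTorch and JAX can see the GPU:

```bash
python -c "import torch; print('torch CUDA available:', torch.cuda.is_available())"
python -c "import jax; print('jax devices:', jax.devices())"
```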

Note: If you are testing locally on a machine without an NVIDIA GPU, run the following commands instead of their counterparts above:

```bash
pip install torch torchvision
pip install jax
pip install playground
```

With all the dependencies installed, install the wl_learning package in editable mode:

```bash
pip install -e .
```
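
To verify the editable install, try importing the package (this assumes it is importable as wl_learning, which the project layout suggests):

```bash
python -c "import wl_learning; print(wl_learning.__file__)"
```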

## Training
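
The examples below log to Weights & Biases via --use_wandb, so authenticate once beforehand (this assumes you already have a WandB account and API key):

```bash
wandb login
```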

For local training with PPO, use the train_rsl_rl.py script under the scripts folder. An example is provided below.

```bash
python scripts/train_rsl_rl.py \
    --env_name WheelLegFlat \
    --num_envs 4096 \
    --use_wandb \
    --wandb_entity RoboRambler \
    --suffix PACE_Test \
    --camera track
```
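
Metrics are also written for TensorBoard (installed above). Assuming your run directory lands under a logs/ folder in the repo (a hypothetical path; adjust to wherever your runs are actually saved), you can monitor training locally with:

```bash
tensorboard --logdir logs
```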

For local training with DDPG, use the train_wl_ddpg.py script under the scripts folder. An example is provided below.

```bash
python scripts/train_wl_ddpg.py \
    --env_name WheelLegFlat \
    --num_envs 200 \
    --use_wandb \
    --wandb_entity RoboRambler \
    --suffix PACE_Test
```

For cluster training, set up the required directories on PACE. Then use the push_job.sh script under the scripts folder. An example is provided below (replace <algo> with PPO or DDPG).

```bash
./scripts/push_job.sh <algo> \
    --env_name WheelLegFlat \
    --num_envs 8192 \
    --use_wandb \
    --wandb_entity RoboRambler \
    --suffix PACE_Test \
    --camera track \
    --video --video_interval 100
```
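
Once the job is submitted, you can check its status with Slurm's queue command (assuming your PACE allocation uses the standard Slurm scheduler):

```bash
squeue -u $USER
```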

## MuJoCo Visualization

To view a MuJoCo visualization of a model checkpoint from WandB for DDPG, use the play_wandb_ddpg.py script under the scripts folder. An example is provided below.

```bash
python scripts/play_wandb_ddpg.py \
    --env_name WheelLegFlat \
    --wandb_entity RoboRambler \
    --wandb_project mjxrl_ddpg \
    --wandb_runid y53v2yyc
```

You can find the run ID in the Overview section of a run's dashboard in WandB.

Note: If you are on macOS, use mjpython instead of python.
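
For example, on macOS the playback command above becomes:

```bash
mjpython scripts/play_wandb_ddpg.py \
    --env_name WheelLegFlat \
    --wandb_entity RoboRambler \
    --wandb_project mjxrl_ddpg \
    --wandb_runid y53v2yyc
```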
