ContactPrompt is a training-free, zero-shot approach to dense hand contact estimation with multi-modal large language models (MLLMs). It bridges the semantic reasoning capability of MLLMs and fine-grained 3D hand geometry reasoning.
- We recommend using an Anaconda virtual environment with Python >= 3.10.0 and PyTorch >= 2.8.0. The latest ContactPrompt model is tested with Python 3.10.19, PyTorch 2.8.0, and CUDA 12.8.
- Set up the environment.
# Initialize conda environment
conda create -n contactprompt python=3.10 -y
conda activate contactprompt
# Install PyTorch
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
# Install all remaining packages
pip install -r requirements.txt
# Separately install networkx (ignore error message)
pip install networkx==2.8.8
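After installation, a quick sanity check can confirm that the interpreter and PyTorch meet the minimum versions listed above. The snippet below is a minimal sketch and not part of the repository: `parse_version`, `meets_minimum`, and `report` are hypothetical helper names, and the script only reports what it finds without enforcing anything.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a version string such as '2.8.0+cu128' into a tuple like (2, 8, 0)."""
    core = str(text).split("+")[0]  # drop local build suffixes such as '+cu128'
    numbers = [int(n) for n in re.findall(r"\d+", core)[:3]]
    numbers += [0] * (3 - len(numbers))  # pad '2.8' to (2, 8, 0)
    return tuple(numbers)


def meets_minimum(found, required):
    """Return True when the found version is at least the required one."""
    return parse_version(found) >= parse_version(required)


def report():
    """Print the detected Python and PyTorch versions next to the README minimums."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    print(f"Python {python_version}: ok={meets_minimum(python_version, '3.10.0')}")
    if importlib.util.find_spec("torch") is None:
        print("PyTorch is not installed in this environment.")
        return
    import torch

    torch_version = str(torch.__version__)
    print(f"PyTorch {torch_version}: ok={meets_minimum(torch_version, '2.8.0')}")
    print(f"CUDA build: {torch.version.cuda}, GPU available: {torch.cuda.is_available()}")


report()
```

Run it inside the activated `contactprompt` environment. An `ok=False` line indicates a version below the stated minimum.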
Your data must follow our directory structure.
- For evaluation: see docs/data_eval.md.
To evaluate ContactPrompt, please run:
python test.py --test_name {DATASET_NAME} --agent_model {AGENT_MODEL}
For example,
# GPT-5.5
python test.py --test_name MOW --agent_model gpt-5.5
# GPT-5.4
python test.py --test_name MOW --agent_model gpt-5.4
# Claude Opus 4.7
python test.py --test_name MOW --agent_model claude-opus-4.7
# Claude Sonnet 4.6
python test.py --test_name MOW --agent_model claude-sonnet-4.6
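To evaluate several agent models in one pass, the commands above can be generated by a short driver script. This is a sketch rather than part of the repository: `build_command` and `run_all` are hypothetical helpers, the model list simply repeats the examples above, and the script defaults to a dry run that prints each command without executing it.

```python
import shlex
import subprocess
import sys

# Agent models taken from the examples in this README.
AGENT_MODELS = ["gpt-5.5", "gpt-5.4", "claude-opus-4.7", "claude-sonnet-4.6"]


def build_command(test_name, agent_model, python=sys.executable):
    """Assemble the test.py invocation documented in this README."""
    return [python, "test.py", "--test_name", test_name, "--agent_model", agent_model]


def run_all(test_name, agent_models, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    commands = [build_command(test_name, model) for model in agent_models]
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops at the first evaluation that exits with an error.
            subprocess.run(command, check=True)
    return commands


commands = run_all("MOW", AGENT_MODELS)
```

To launch the evaluations for real, call `run_all("MOW", AGENT_MODELS, dry_run=False)` from the repository root so that `test.py` resolves correctly.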
For the model outputs reported in our paper, please refer to our Hugging Face dataset.
ImportError: cannot import name 'bool' from 'numpy': Please comment out the line that reads: from numpy import bool, int, float, complex, object, unicode, str, nan, inf.
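This error typically arises because newer NumPy releases removed deprecated aliases such as `numpy.bool`, so the import fails before any other code runs. Instead of editing the file by hand, the offending line can be commented out with a few lines of Python. The helper below is a sketch, not something shipped with ContactPrompt: `comment_out_numpy_aliases` is a hypothetical name, and the path you pass should be the file reported in your own traceback.

```python
from pathlib import Path

# The exact import statement quoted in the error note above.
BAD_IMPORT = "from numpy import bool, int, float, complex, object, unicode, str, nan, inf"


def comment_out_numpy_aliases(path):
    """Prefix the offending import with '# ' and return how many lines changed."""
    path = Path(path)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    changed = 0
    for i, line in enumerate(lines):
        if line.strip() == BAD_IMPORT:
            indent = line[: len(line) - len(line.lstrip())]
            lines[i] = f"{indent}# {line.lstrip()}"  # keeps the original line ending
            changed += 1
    if changed:
        path.write_text("".join(lines), encoding="utf-8")
    return changed


# Example call (placeholder path; use the file named in your traceback):
# comment_out_numpy_aliases("/path/from/your/traceback/__init__.py")
```

The function is safe to run more than once: a line that is already commented out no longer matches and is left untouched.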
We thank:
- 3DAxisPrompt for inspiring fine-grained reasoning with MLLMs.
- HACO for the dense hand contact estimation framework.
@article{jung2026contactprompt,
title = {Training-Free Dense Hand Contact Estimation with Multi-Modal Large Language Models},
author = {Jung, Daniel Sungho and Lee, Kyoung Mu},
journal = {arXiv preprint arXiv:2605.05886},
year = {2026}
}

