This repository contains the data generation pipeline and evaluation scripts for ReMindView-Bench, the benchmark introduced in the paper *Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning*.
- Paper: Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning (arXiv:2512.02340).
- Dataset: ReMindView-Bench hosted on Hugging Face at Xue0823/ReMindView-Bench; scripts here reproduce the synthetic data.
- Figures in the repository mirror the organization of the generation pipeline (scene generation, rendering, QA construction).
- Blender: 3.6 LTS (bundled Python 3.10) or newer, with `bpy==3.6.0` (included in the infinigen environment).
- Python: 3.10 (matches Blender's bundled interpreter).
- GPU: NVIDIA GPU with at least 12 GB VRAM recommended for rendering.
- OS: Linux or macOS; commands below assume Bash.
- Python dependencies: see `requirements.txt` (mirrors `environment.yml`) with tested versions for reproducibility.
- Clone the repo and switch to the root directory.
- Point to Blender’s bundled Python and install dependencies:
  ```bash
  export BLENDER_ROOT=/path/to/blender-3.6
  export BLENDER_BIN="$BLENDER_ROOT/blender"
  export BLENDER_PYTHON="$BLENDER_ROOT/3.6/python/bin/python3.10"
  "$BLENDER_PYTHON" -m pip install --upgrade pip
  "$BLENDER_PYTHON" -m pip install -r requirements.txt
  ```
- Ensure imports resolve when running through Blender by adding the repo to `PYTHONPATH`:

  ```bash
  export PYTHONPATH="$PWD:$PYTHONPATH"
  ```
- Example: run a script with Blender's interpreter so `bpy` is available:

  ```bash
  "$BLENDER_BIN" --background --python view_centric_view_frame_generation.py -- \
    --config view_centric_view_frame_generation_config.json \
    --output_dir view_centric_view_frame_outputs/Bedroom/Bedroom_0 \
    --room_type bedroom \
    --scene_path outputs/indoors/Bedroom_0/scene.blend
  ```

  The paths above are the defaults produced by the generation script; command-line arguments stay exposed so you can override locations if you customize the pipeline.
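Before generating data, it can help to confirm that the interpreter and `bpy` resolve as expected. The snippet below is a minimal sketch, not a script shipped with the repo; save it as, say, a hypothetical `check_env.py` and run it with `"$BLENDER_PYTHON" check_env.py` or `"$BLENDER_BIN" --background --python check_env.py`.

```python
# check_env.py: hypothetical sanity check, not part of the repository.
import sys

print("Python:", sys.version.split()[0])  # expected 3.10.x to match Blender 3.6 LTS

try:
    import bpy  # available inside Blender, or via the bpy==3.6.0 wheel
    print("bpy:", bpy.app.version_string)  # expected 3.6.x
except ImportError:
    print("bpy not importable; run this through Blender or install the bpy wheel")
```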
- Generate scenes and renders (object-centric and view-centric) from the repo root:

  ```bash
  bash scene_generation.sh
  ```

  Scenes land in `outputs/indoors/<ROOM>_<SEED>`, object-centric frames in `object_centric_view_frame_outputs/<ROOM>/<ROOM>_<SEED>`, and view-centric frames in `view_centric_view_frame_outputs/<ROOM>/<ROOM>_<SEED>`.
- Clean empty or invalid views (already invoked inside `scene_generation.sh`; rerun if you tweak outputs):

  ```bash
  python clean_visual_data.py --dir_path object_centric_view_frame_outputs
  python clean_visual_data.py --dir_path view_centric_view_frame_outputs
  ```
- Build QA CSVs (choose one of `view_view`, `view_object`, `object_object`):

  ```bash
  python ground_truth_generation.py \
    --image_folder object_centric_view_frame_outputs \
    --qa_type object_object
  ```

  Output is stored next to the image folder (e.g., `object_centric_view_frame_outputs/object_object_qa.csv`).
- (Optional) Adjust outputs without editing scripts by passing flags (e.g., `--output_dir`, `--scene_path`). Defaults match the repo layout to minimize user error.
| Stage | Script(s) | Inputs | Outputs |
|---|---|---|---|
| Scene generation | `scene_generation.sh` (+ `infinigen_examples.generate_indoors`) | Seeds, room types, gin configs | Blender scenes under `outputs/indoors/<ROOM>_<SEED>` |
| View rendering | `object_centric_view_frame_generation.py`, `view_centric_view_frame_generation.py` | `scene.blend`, camera configs (`*_generation_config.json`) | Rendered frames and metadata under `object_centric_view_frame_outputs/` and `view_centric_view_frame_outputs/` |
| Visibility pruning | `clean_visual_data.py` | Rendered frame dirs | Pruned frame dirs without empty renders |
| Metadata extraction | `generate_object_list.py` | Rendered frame dirs | Object visibility summaries used for QA |
| VQA construction | `ground_truth_generation.py`, `ground_truth_generation.sh` | Visibility metadata, rename maps in `object_rename/`, `query_template.json` | Multiple-choice QA CSVs for view-view, view-object, and object-object tasks |
```
ReMindView-Bench/
├─ scene_generation.sh                               # End-to-end loop over seeds/room types
├─ object_centric_view_frame_generation.py
├─ view_centric_view_frame_generation.py
├─ object_centric_view_frame_generation_config.json
├─ view_centric_view_frame_generation_config.json
├─ clean_visual_data.py
├─ generate_object_list.py
├─ ground_truth_generation.py
├─ ground_truth_generation.sh
├─ object_rename/                                    # Rename/orientation metadata per room
├─ evaluation/                                       # Model evaluation entrypoints (see notes)
├─ figures/                                          # Paper figures used above
├─ environment.yml
└─ requirements.txt
```
The organization mirrors the paper sections (scene generation → rendering → QA). Files are listed once to avoid duplication between sections.
Evaluation entrypoints live in evaluation/ (eval_models.sh, eval_internvl35.py, eval_llava_onevision.py, eval_qwen_vl.py). Some evaluation code paths are intentionally left minimal; add checkpoints and dataset paths before use. Additional evaluation scripts will be added here as they are released.
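As a loose sketch of how predictions could be scored against the generated QA CSVs, the snippet below assumes a hypothetical `predictions.csv` aligned row-for-row with the QA file and columns named `answer` and `prediction`; neither file layout nor column names are guaranteed by the actual scripts, so adapt it to the formats the evaluation entrypoints emit.

```python
# Hypothetical scoring sketch: predictions.csv, row alignment, and the column
# names "answer" and "prediction" are assumptions, not the repository's format.
import pandas as pd

qa = pd.read_csv("object_centric_view_frame_outputs/object_object_qa.csv")
preds = pd.read_csv("predictions.csv")

correct = (
    qa["answer"].astype(str).str.strip()
    == preds["prediction"].astype(str).str.strip()
)
print(f"accuracy: {correct.mean():.3f} over {len(correct)} questions")
```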
Download the dataset from Hugging Face or produce it locally via the pipeline above. A simple example of loading the QA pairs for a VQA task:
```python
import pandas as pd

qa = pd.read_csv("object_centric_view_frame_outputs/object_object_qa.csv")
print(qa.head())
```

Each row includes the frame folder, query text, ground truth, and choices, so you can pair images from the same folder for VQA training or evaluation.
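As a rough illustration of that pairing, the sketch below lists the rendered views stored next to each QA row. The assumption that the frame folder sits in the first CSV column (and that frames are `.png`/`.jpg` files) is ours; check the header written by `ground_truth_generation.py`.

```python
import os
import pandas as pd

root = "object_centric_view_frame_outputs"
qa = pd.read_csv(os.path.join(root, "object_object_qa.csv"))

# Assumption: the first column holds the frame folder; verify against the CSV header.
frame_col = qa.columns[0]

for _, row in qa.head(3).iterrows():
    frame_dir = os.path.join(root, str(row[frame_col]))
    if not os.path.isdir(frame_dir):
        print(row[frame_col], "-> folder not found (column layout may differ)")
        continue
    views = sorted(
        f for f in os.listdir(frame_dir)
        if f.lower().endswith((".png", ".jpg"))
    )
    print(row[frame_col], "->", len(views), "rendered views")
```

If you prefer the released data over regenerating it locally, `huggingface_hub.snapshot_download` can fetch the dataset repository; note that the layout on the Hub may differ from the local folder structure described above.

```python
from huggingface_hub import snapshot_download

# Pulls the released dataset; inspect the downloaded layout before pointing scripts at it.
local_dir = snapshot_download(repo_id="Xue0823/ReMindView-Bench", repo_type="dataset")
print("dataset downloaded to:", local_dir)
```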
If you use ReMindView-Bench, please cite the accompanying paper:
```bibtex
@article{xue2025reasoning,
  title={Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning: A Cognitive Science Perspective},
  author={Xue, Qiyao and Liu, Weichen and Wang, Shiqi and Wang, Haoming and Wu, Yuyang and Gao, Wei},
  journal={arXiv preprint arXiv:2512.02340},
  year={2025}
}
```

See LICENSE for usage terms. A permissive research-friendly license will be finalized; until then, reach out to the authors with questions about reuse.

