Perspective Transformer Nets

Introduction

This is the TensorFlow implementation of the NIPS 2016 paper "Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision".

Re-implemented by Xinchen Yan, Arkanath Pathak, Jasmine Hsu, Honglak Lee

Reference: Original implementation in Torch

How to run this code

This implementation can be run locally or distributed across multiple machines/tasks. When running in a distributed fashion, set the task number flag for each task. Please refer to the original paper for parameter explanations and training details.

Installation

  • TensorFlow
    • This code requires the latest open-source TensorFlow, which you will need to build from source; the TensorFlow documentation describes the required steps.
  • Bazel
  • matplotlib
    • Follow the instructions here.
    • Alternatively, install it with pip (see the example after this list).
  • scikit-image
    • Follow the instructions here.
    • Alternatively, install it with pip (see the example after this list).
  • PIL
    • Install from here.
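
As a sketch, the Python dependencies can be installed with pip (the package names below are the standard PyPI ones; the PIL interface is provided by the Pillow package):

$ pip install matplotlib scikit-image Pillow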

Dataset

This code requires the dataset to be in tfrecords format with the following features (a serialization sketch follows the list):

  • image
    • Flattened list of images (float representations), one per viewpoint.
  • mask
    • Flattened list of image masks (float representations), one per viewpoint.
  • vox
    • Flattened list of voxels (float representations) for the object.
    • This is needed for the voxel loss and for comparing predictions against ground truth.
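
The following is a minimal sketch (not part of this repo) of serializing one object into this format with the TF 1.x API. The feature names image, mask, and vox follow the list above; the view count, image size, voxel resolution, and output path are illustrative placeholders.

import numpy as np
import tensorflow as tf

def _float_feature(values):
  # Wrap a flat list of floats as a tf.train.Feature.
  return tf.train.Feature(float_list=tf.train.FloatList(value=values))

num_views, height, width = 24, 64, 64  # illustrative values
images = np.random.rand(num_views, height, width, 3).astype(np.float32)
masks = np.random.rand(num_views, height, width, 1).astype(np.float32)
vox = np.random.rand(32, 32, 32).astype(np.float32)  # illustrative voxel grid

# One tf.train.Example per object, with a flattened float list per feature.
example = tf.train.Example(features=tf.train.Features(feature={
    'image': _float_feature(images.ravel().tolist()),
    'mask': _float_feature(masks.ravel().tolist()),
    'vox': _float_feature(vox.ravel().tolist()),
}))

with tf.python_io.TFRecordWriter('/tmp/object.tfrecords') as writer:
  writer.write(example.SerializeToString())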

You can download the ShapeNet Dataset in tfrecords format from here*.

* Disclaimer: This data is hosted personally by Arkanath Pathak for non-commercial research purposes. Please cite the ShapeNet paper when using the dataset in your work.

Pretraining: pretrain_rotator.py for each RNN step

$ bazel run -c opt :pretrain_rotator -- --step_size={} --init_model={}

Pass init_model as the checkpoint path of the model trained in the previous step. You will also need to set the inp_dir flag to the directory where your data resides.
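
For example, a pretraining sequence might look like the following, with each step initialized from the previous step's checkpoint (the step values and paths here are illustrative):

$ bazel run -c opt :pretrain_rotator -- --step_size=1 --inp_dir=/path/to/tfrecords
$ bazel run -c opt :pretrain_rotator -- --step_size=2 --inp_dir=/path/to/tfrecords --init_model=/path/to/step1_checkpoint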

Training: train_ptn.py with the last pretrained model

$ bazel run -c opt :train_ptn -- --init_model={}

Example TensorBoard Visualizations

To compare visualizations across runs, set a different model_name flag for each parameter setting.
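
For example (the model names and checkpoint path are illustrative):

$ bazel run -c opt :train_ptn -- --init_model=/path/to/pretrained_checkpoint --model_name=ptn_proj
$ bazel run -c opt :train_ptn -- --init_model=/path/to/pretrained_checkpoint --model_name=ptn_proj_vox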

This code adds summaries for each loss. For instance, these are the losses we observed during distributed pretraining on the ShapeNet chair dataset with 10 workers and 16 parameter servers:

[Figure: ShapeNet Chair Pretraining loss curves]

After fine-tuning, you can expect images like the following as "grid_vis" under the Image summaries in TensorBoard:

[Figure: ShapeNet Chair experiments with projection weight of 1]

Here the third and fifth columns are the predicted masks and voxels, respectively, alongside their ground-truth values.

A similar image from a model trained on all ShapeNet categories (voxel visualizations might be skewed):

[Figure: ShapeNet All Categories experiments]