OpenLRM: Open-Source Large Reconstruction Models


News

  • [2024.03.13] Update training code and release OpenLRM v1.1.1.
  • [2024.03.08] We have released the core blender script used to render Objaverse images.
  • [2024.03.05] The Hugging Face demo now uses the openlrm-mix-base-1.1 model by default. Please refer to the model card for details on the updated model architecture and training settings.
  • [2024.03.04] Version v1.1 released, with model weights trained on both Objaverse and MVImgNet. The codebase has been majorly refactored for better usability and extensibility. Please refer to v1.1.0 for details.
  • [2024.01.09] Updated all v1.0 models trained on Objaverse. Please download them from HF Models and overwrite previous model weights.
  • [2023.12.21] The Hugging Face demo is online. Give it a try!
  • [2023.12.20] Release weights of the base and large models trained on Objaverse.
  • [2023.12.20] We release this project OpenLRM, which is an open-source implementation of the paper LRM.

Setup

Installation

    git clone https://github.com/3DTopia/OpenLRM.git
    cd OpenLRM

Environment
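
The Python dependencies are listed in the repository's requirements.txt; a typical setup (assuming pip targets the environment you intend to run OpenLRM in) is:

    # Install Python dependencies from the repository root
    pip install -r requirements.txt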

Quick Start

Pretrained Models

  • Model weights are released on Hugging Face.
  • Weights will be downloaded automatically when you run the inference script for the first time (a manual pre-fetch sketch follows the table below).
  • Please be aware of the license before using the weights.
| Model | Training Data | Layers | Feature Dim. | Triplane Dim. | Input Res. | Link |
| --- | --- | --- | --- | --- | --- | --- |
| openlrm-obj-small-1.1 | Objaverse | 12 | 512 | 32 | 224 | HF |
| openlrm-obj-base-1.1 | Objaverse | 12 | 768 | 48 | 336 | HF |
| openlrm-obj-large-1.1 | Objaverse | 16 | 1024 | 80 | 448 | HF |
| openlrm-mix-small-1.1 | Objaverse + MVImgNet | 12 | 512 | 32 | 224 | HF |
| openlrm-mix-base-1.1 | Objaverse + MVImgNet | 12 | 768 | 48 | 336 | HF |
| openlrm-mix-large-1.1 | Objaverse + MVImgNet | 16 | 1024 | 80 | 448 | HF |

Model cards with additional details can be found in model_card.md.
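
Weights are fetched automatically on first use, but you can also pre-download them; a sketch using the huggingface-cli tool that ships with the huggingface_hub package:

    # Optionally pre-fetch weights into the local Hugging Face cache
    huggingface-cli download zxhezexin/openlrm-mix-base-1.1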

Prepare Images

  • Sample inputs are provided under assets/sample_input for a quick try.
  • Prepare RGBA images, or RGB images with a white background; background removal tools such as Rembg or Clipdrop can produce these (see the sketch below).
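
A minimal background-removal sketch using the Rembg CLI (one option among several; the input and output paths are placeholders):

    # Install Rembg with its CLI and strip the background from a photo
    pip install "rembg[cli]"
    rembg i ./my_photo.jpg ./my_photo_rgba.png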

Inference

  • Run the inference script to get 3D assets.

  • You may specify which form of output to generate by setting the flags EXPORT_VIDEO=true and EXPORT_MESH=true.

  • Set INFER_CONFIG according to the model you want to use, e.g., infer-b.yaml for base models and infer-s.yaml for small models.

  • An example usage is as follows:

    # Example usage
    EXPORT_VIDEO=true
    EXPORT_MESH=true
    INFER_CONFIG="./configs/infer-b.yaml"
    MODEL_NAME="zxhezexin/openlrm-mix-base-1.1"
    IMAGE_INPUT="./assets/sample_input/owl.png"
    
    python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH
    

Tips

  • The recommended PyTorch version is >=2.1; the code is developed and tested with PyTorch 2.1.2.
  • If you encounter CUDA OOM issues, please try to reduce the frame_size in the inference configs.
  • If xFormers is installed and working, you should see the message UserWarning: xFormers is available at startup (see the quick check below).
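
A quick diagnostic (not part of the OpenLRM tooling) to confirm that xFormers is importable in your environment:

    # Prints the xFormers version if it is installed correctly
    python -c "import xformers; print(xformers.__version__)"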

Training

Configuration

  • We provide a sample accelerate config file under configs/accelerate-train.yaml, which defaults to 8 GPUs with bf16 mixed precision.
  • You may modify the configuration file to fit your own environment (see the example below).
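
For example, accelerate launch flags take precedence over the config file, so a machine with fewer GPUs can be accommodated without editing it; a sketch assuming 4 GPUs:

    # Override the GPU count from the command line
    accelerate launch --config_file ./configs/accelerate-train.yaml --num_processes 4 -m openlrm.launch train.lrm --config ./configs/train-sample.yaml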

Data Preparation

Run Training

  • A sample training config file is provided under configs/train-sample.yaml.

  • Please replace the data-related paths in the config file with your own and customize the training settings as needed.

  • An example training usage is as follows:

    # Example usage
    ACC_CONFIG="./configs/accelerate-train.yaml"
    TRAIN_CONFIG="./configs/train-sample.yaml"
    
    accelerate launch --config_file $ACC_CONFIG -m openlrm.launch train.lrm --config $TRAIN_CONFIG
    

Inference on Trained Models

  • The inference pipeline is compatible with Hugging Face utilities for convenience.

  • Convert the training checkpoint to an inference model by running the following script:

    python scripts/convert_hf.py --config <YOUR_EXACT_TRAINING_CONFIG> convert.global_step=null
    
  • The converted model will be saved under exps/releases by default and can then be used for inference as described in the Inference section (see the example below).
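
A sketch of running inference on a converted model; the directory name under exps/releases is a placeholder for whatever the conversion script actually produces:

    # Point MODEL_NAME at the converted model instead of a Hugging Face repo
    MODEL_NAME="./exps/releases/<converted_model_dir>"
    python -m openlrm.launch infer.lrm --infer "./configs/infer-b.yaml" model_name=$MODEL_NAME image_input="./assets/sample_input/owl.png" export_video=true export_mesh=true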

Acknowledgement

  • We thank the authors of the original paper for their great work! Special thanks to Kai Zhang and Yicong Hong for assistance during the reproduction.
  • This project is supported by Shanghai AI Lab, which provided the computing resources.
  • This project is advised by Ziwei Liu and Jiaya Jia.

Citation

If you find this work useful for your research, please consider citing:

@article{hong2023lrm,
  title={Lrm: Large reconstruction model for single image to 3d},
  author={Hong, Yicong and Zhang, Kai and Gu, Jiuxiang and Bi, Sai and Zhou, Yang and Liu, Difan and Liu, Feng and Sunkavalli, Kalyan and Bui, Trung and Tan, Hao},
  journal={arXiv preprint arXiv:2311.04400},
  year={2023}
}
@misc{openlrm,
  title = {OpenLRM: Open-Source Large Reconstruction Models},
  author = {Zexin He and Tengfei Wang},
  year = {2023},
  howpublished = {\url{https://github.com/3DTopia/OpenLRM}},
}

License