Chemeleon2

A reinforcement learning framework for fine-tuning latent diffusion models for crystal structure generation, using group relative policy optimization (GRPO).


Overview

Chemeleon2 implements a three-stage pipeline for crystal structure generation:

  1. VAE Module: Encodes crystal structures into latent space representations
  2. LDM Module: Samples crystal structures in latent space using a diffusion Transformer
  3. RL Module: Fine-tunes the LDM using reward functions
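
The README does not spell out how the RL stage scores samples. As a rough illustration of the general idea behind group relative policy optimization, and not Chemeleon2's actual implementation, the sketch below normalizes the rewards of one group of samples against that group's own mean and standard deviation. The function name, the eps constant, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps).

    Hypothetical helper for illustration only; it is not part of Chemeleon2.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    # eps avoids division by zero when every sample gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for three structures sampled in the same group
advantages = group_relative_advantages([1.0, 2.0, 3.0])
print(advantages)
```

Samples that score above the group mean get a positive advantage and those below it get a negative one, so no separately learned value baseline is needed.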

[Figure: Chemeleon2 pipeline overview]

Installation

# Clone the repository
git clone https://github.com/hspark1212/chemeleon2
cd chemeleon2

# Install dependencies with uv
uv sync

Tip: uv sync installs dependencies based on the uv.lock file, ensuring reproducible environments. If you encounter issues with uv.lock (e.g., lock file conflicts or compatibility problems), you can use uv pip install -e . as an alternative to install the package in editable mode directly from pyproject.toml.

(Optional) Installation with extra dependencies

# (Optional) Install development dependencies (pytest, ruff, pyright, etc.)
uv sync --extra dev

# (Optional) Install metrics dependencies for evaluation (mace-torch, smact)
uv sync --extra metrics

(Optional) PyTorch Installation with CUDA

After running uv sync, install a PyTorch build that matches your CUDA environment. For version-specific installation commands, see the official PyTorch website.

# (Optional) Example command for PyTorch 2.7.0 with CUDA 12.8
uv pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128
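
After installing, it can help to confirm that the PyTorch build in the environment actually sees the GPU. The following is a minimal sketch of such a check; the torch_cuda_report helper is a hypothetical name introduced here and is not part of Chemeleon2. It uses only torch.__version__, torch.version.cuda, and torch.cuda.is_available().

```python
import importlib.util


def torch_cuda_report():
    """Summarize the installed PyTorch build (hypothetical helper)."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment
        return {"torch_installed": False}
    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_cuda_report())
```

If cuda_available is False on a machine with a GPU, the installed wheel most likely does not match the local CUDA driver, and reinstalling with a different index URL is the usual fix.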

Quick Start

For a simple walkthrough of sampling and evaluation, see tutorial.ipynb.

Training

Chemeleon2 uses a three-stage training pipeline: VAE → LDM → RL.

For detailed instructions, see:

Benchmarks

To benchmark de novo generation (DNG), 10,000 sampled structures are available in the benchmarks/dng/ directory:

The compressed JSON files can be loaded using loadfn from monty (from monty.serialization import loadfn).
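
A minimal sketch of loading the benchmark files with loadfn is shown below. The file names are not listed here, so the *.json.gz pattern and both helper names are assumptions made for illustration; adjust the pattern to match the files actually present in benchmarks/dng/.

```python
from pathlib import Path


def find_benchmark_files(root="benchmarks/dng", pattern="*.json.gz"):
    """Return benchmark files under root, sorted by name.

    The '*.json.gz' pattern is an assumption about how the files are named.
    """
    return sorted(Path(root).glob(pattern))


def load_benchmark(path):
    """Load one compressed JSON file with monty (hypothetical helper)."""
    # Imported lazily so file discovery works without monty installed
    from monty.serialization import loadfn

    return loadfn(str(path))


# Usage, run from the repository root:
#   for path in find_benchmark_files():
#       structures = load_benchmark(path)
#       print(path.name, type(structures))
```

loadfn infers the format from the file name and decompresses the file transparently, so no manual gzip handling is needed.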

Contributing

We welcome contributions! Please see CONTRIBUTING.md for detailed setup instructions, development workflow, and guidelines.

BibTeX

@article{Park2025chemeleon2,
  title={Guiding Generative Models to Uncover Diverse and Novel Crystals via Reinforcement Learning},
  author={Hyunsoo Park and Aron Walsh},
  year={2025},
  url={https://arxiv.org/abs/2511.07158}
}

References

This work is inspired by the following projects:

  1. https://github.com/facebookresearch/DiT
  2. https://github.com/facebookresearch/all-atom-diffusion-transformer
  3. https://github.com/kvablack/ddpo-pytorch
  4. https://github.com/open-thought/tiny-grpo

License

Chemeleon2 is licensed under the MIT License. See LICENSE for more information.
