WIP feat: SMACT Property Prediction Module #448

@ryannduma

Description

This issue tracks the development of a new property_prediction module within the SMACT codebase, utilising trained ROOST model checkpoints. The goal is to enable property prediction directly from chemical compositions, without requiring crystal structure information.

This work builds on efforts from the 2025 SMACTDOWN session and, once integrated, will allow users to compute properties such as band gap and bulk modulus via the SMACT interface.

The module includes a flexible scaffold designed to support additional prediction models in the future, ensuring long-term extensibility.

Module Structure

smact/property_prediction/
├── __init__.py              # Main module interface
├── base_predictor.py        # Abstract base class
├── wrapper.py               # Convenience functions, e.g. predict_bandgap(model, composition); defaults to ROOST
├── roost/
│   ├── __init__.py         # ROOST module
│   └── predictor.py        # ROOST implementation
└── example_usage.py        # Example usage demonstration

Proposed Implementation

The proposed usage pattern would be:

from smact.property_prediction import RoostPropertyPredictor

# Check available properties
print(RoostPropertyPredictor.available_properties)  # ['band_gap', 'bulk_modulus']

# Create model with just property name and device
model = RoostPropertyPredictor(property_name="band_gap", device="cpu")

# Predict single or multiple compositions
model.predict("NaCl")                    # Single composition
model.predict(["NaCl", "TiO2", "GaN"])   # Multiple compositions

Key Features

  1. Class-level properties: RoostPropertyPredictor.available_properties
  2. SMACT validation: All compositions validated using smact_validity()
  3. Extensible design: Easy to add new property prediction model types
  4. Error handling: Clear error messages for invalid compositions/properties
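To make features 2 and 4 concrete, here is a minimal sketch of how input normalisation and error reporting could look. The helper names (`check_property`, `validate_compositions`, `_looks_like_formula`) are hypothetical and not part of the module. The validity check is passed in as a callable so the sketch runs without SMACT installed; in the module itself the check is expected to be `smact_validity()`.

```python
from __future__ import annotations

from collections.abc import Callable

# Mirrors the list shown for RoostPropertyPredictor.available_properties
AVAILABLE_PROPERTIES = ["band_gap", "bulk_modulus"]


def _looks_like_formula(formula: str) -> bool:
    """Hypothetical stand-in check: alphanumeric and starts with a capital."""
    return bool(formula) and formula.isalnum() and formula[0].isupper()


def check_property(property_name: str) -> str:
    """Return the property name, or raise a clear error if unsupported."""
    if property_name not in AVAILABLE_PROPERTIES:
        raise ValueError(
            f"Unknown property {property_name!r}; "
            f"available properties: {AVAILABLE_PROPERTIES}"
        )
    return property_name


def validate_compositions(
    compositions: str | list[str],
    is_valid: Callable[[str], bool] = _looks_like_formula,
) -> list[str]:
    """Normalise input to a list and reject entries that fail is_valid."""
    if isinstance(compositions, str):
        compositions = [compositions]
    invalid = [c for c in compositions if not is_valid(c)]
    if invalid:
        raise ValueError(f"Invalid composition(s): {invalid}")
    return list(compositions)
```

Reporting all invalid entries in one error, rather than failing on the first, should make batch inputs easier to debug.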

Extensibility for Multiple Models

The architecture is designed for multiple models:

# Future implementations will work the same way:
from smact.property_prediction import OtherPropertyPredictor

other_model = OtherPropertyPredictor(property_name="bulk_modulus")

Abstract Base Class Architecture

In smact/property_prediction/base_predictor.py:

from abc import ABC, abstractmethod

import numpy as np

class BasePropertyPredictor(ABC):  # Abstract base class
    """Abstract base class for property predictors."""
    
    @property
    @abstractmethod
    def supported_properties(self) -> list[str]:  # Must be implemented
        """List of properties supported by this predictor."""
        pass
    
    @abstractmethod
    def predict(self, compositions: str | list[str]) -> np.ndarray:  # Must be implemented
        """Predict property values for given compositions."""
        pass
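As a quick illustration of what the abstract base class buys us, the sketch below shows that a subclass missing an abstract member cannot be instantiated. `IncompletePredictor` and `ConstantPredictor` are hypothetical classes made up for this demonstration, and the return type is simplified to `list[float]` so the snippet has no NumPy dependency.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class BasePropertyPredictor(ABC):
    """Simplified copy of the base class (list[float] instead of np.ndarray)."""

    @property
    @abstractmethod
    def supported_properties(self) -> list[str]:
        """List of properties supported by this predictor."""

    @abstractmethod
    def predict(self, compositions: str | list[str]) -> list[float]:
        """Predict property values for given compositions."""


class IncompletePredictor(BasePropertyPredictor):
    """Hypothetical subclass that forgets to implement predict()."""

    @property
    def supported_properties(self) -> list[str]:
        return ["band_gap"]


class ConstantPredictor(BasePropertyPredictor):
    """Hypothetical subclass implementing both members with dummy values."""

    @property
    def supported_properties(self) -> list[str]:
        return ["band_gap"]

    def predict(self, compositions: str | list[str]) -> list[float]:
        if isinstance(compositions, str):
            compositions = [compositions]
        return [0.0 for _ in compositions]


try:
    IncompletePredictor()  # predict() is still abstract
except TypeError as exc:
    print(f"Rejected at instantiation: {exc}")

model = ConstantPredictor()
print(model.supported_properties, model.predict(["NaCl", "TiO2"]))
```

The failure happens when the object is created, not when `predict()` is first called, so an incomplete predictor is caught early.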

Concrete Implementation

The RoostPropertyPredictor inherits from and implements the abstract base class:

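class RoostPropertyPredictor(BasePropertyPredictor):  # Concrete implementation
    # Class-level list, so it can be queried without creating an instance
    available_properties = ["band_gap", "bulk_modulus"]
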
    @property
    def supported_properties(self) -> list[str]:  # Implements abstract method
        return self.available_properties
    
    def predict(self, compositions: str | list[str]) -> np.ndarray:  # Implements abstract method
        compositions = self._validate_compositions(compositions)
        # Implementation here...
        return np.zeros(len(compositions))  # Placeholder for now

Extensible Pattern

This allows for multiple concrete implementations:

# Future models will also inherit from BasePropertyPredictor
class OtherPropertyPredictor(BasePropertyPredictor):
    @property  # Matches the abstract property on the base class
    def supported_properties(self) -> list[str]:
        return ["band_gap", "formation_energy"]
    
    def predict(self, compositions: str | list[str]) -> np.ndarray:
        # Other-specific implementation
        pass
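One possible way for wrapper.py to select between several implementations is a small name-to-class registry. This is only a sketch under assumptions: `PREDICTORS`, `predict_property`, and `DummyRoostPredictor` are hypothetical names, the dummy class stands in for the real predictor and returns zeros, and the argument order of `predict_bandgap` here is a guess rather than the agreed signature.

```python
from __future__ import annotations


class DummyRoostPredictor:
    """Hypothetical stand-in for RoostPropertyPredictor; returns zeros."""

    available_properties = ["band_gap", "bulk_modulus"]

    def __init__(self, property_name: str, device: str = "cpu") -> None:
        if property_name not in self.available_properties:
            raise ValueError(f"Unsupported property: {property_name!r}")
        self.property_name = property_name
        self.device = device

    def predict(self, compositions: str | list[str]) -> list[float]:
        if isinstance(compositions, str):
            compositions = [compositions]
        return [0.0] * len(compositions)  # placeholder values only


# Hypothetical registry mapping a model name to its predictor class.
# A new model type would be added here once its class exists.
PREDICTORS = {"roost": DummyRoostPredictor}


def predict_property(
    property_name: str,
    compositions: str | list[str],
    model: str = "roost",
    device: str = "cpu",
) -> list[float]:
    """Look up the predictor class by name, build it, and run predict()."""
    try:
        predictor_cls = PREDICTORS[model.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown model {model!r}; choose from {sorted(PREDICTORS)}"
        ) from None
    predictor = predictor_cls(property_name=property_name, device=device)
    return predictor.predict(compositions)


def predict_bandgap(
    compositions: str | list[str], model: str = "roost"
) -> list[float]:
    """Convenience wrapper that fixes the property to band_gap."""
    return predict_property("band_gap", compositions, model=model)
```

With this layout, supporting another model would only require registering its class; callers of `predict_bandgap` would not need to change.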

Current Status

The scaffold is in place: predict() currently returns placeholder values, and the module is ready to be connected to actual ROOST model checkpoints once training is complete.

Contributors

Hyunsoo Park
Jamie Swaine
Matthew Walker
Yuan Li
Zibo Zhou
Ry Nduma

Acknowledgments

With thanks to the broader SMACT community, and especially to participants of SMACTDOWN 2025 for their feedback and contributions.

Citations:

ROOST: Predicting materials properties without crystal structure: Deep representation learning from stoichiometry.

@article{goodall_2020_predicting,
  title={Predicting materials properties without crystal structure: Deep representation learning from stoichiometry},
  author={Goodall, Rhys EA and Lee, Alpha A},
  journal={Nature Communications},
  volume={11},
  number={1},
  pages={1--9},
  year={2020},
  publisher={Nature Publishing Group}
}
