Releases · JeffreyXiang/FlexGEMM

🎉 Initial Stable Release

FlexGEMM is a cross-platform backend for high-performance sparse convolutions, providing optimized Triton kernels with flexible autotuning for sparse 3D workloads.

This is the first stable public release of FlexGEMM.

Core Features

- Efficient sparse submanifold convolution (forward & backward)
- Grid sample with nearest or trilinear interpolation for sparse 3D tensors
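
For readers new to the term, the following is a minimal pure-Python sketch of what a submanifold convolution computes: outputs are produced only at voxels that are already active, and inactive neighbors are skipped. It is a naive reference under stated assumptions, not FlexGEMM's API or its Triton kernels; the function name `submanifold_conv3d_reference`, the argument layout, and the offset sign convention are all hypothetical.

```python
from itertools import product


def submanifold_conv3d_reference(coords, feats, weight, kernel_size=3):
    """Naive submanifold convolution over active voxels (illustrative only).

    coords: list of (x, y, z) integer tuples, one per active voxel
    feats:  list of per-voxel feature lists, each of length C_in
    weight: dict mapping a kernel offset (dx, dy, dz) to a C_out x C_in
            matrix stored as a list of lists (hypothetical layout)
    """
    # Hash map from coordinate to row index, so neighbor lookup is O(1).
    index_of = {c: i for i, c in enumerate(coords)}
    r = kernel_size // 2
    offsets = list(product(range(-r, r + 1), repeat=3))
    c_out = len(next(iter(weight.values())))

    # One output row per active input voxel: the active set never grows.
    out = [[0.0] * c_out for _ in coords]
    for i, (x, y, z) in enumerate(coords):
        for off in offsets:
            dx, dy, dz = off
            j = index_of.get((x + dx, y + dy, z + dz))
            if j is None:
                continue  # inactive neighbor contributes nothing
            w = weight[off]
            for o in range(c_out):
                out[i][o] += sum(w[o][k] * feats[j][k] for k in range(len(feats[j])))
    return out
```

The output has exactly one row per input voxel, which is the property that distinguishes a submanifold convolution from a regular sparse convolution whose active set can dilate.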

Platform & Environment Support

OS
- Linux
- Windows
Python
- Python ≥ 3.8
Dependencies
- PyTorch ≥ 2.4.0
- Triton ≥ 3.2.0 (Linux)
- triton-windows ≥ 3.2.0 (Windows)
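
The requirements above can be checked with a short script before installing. This is a rough sketch rather than part of FlexGEMM: the helper names are hypothetical, the version parser only reads the leading numeric release (so pre-release tags are not ordered correctly), and it assumes PyTorch is installed under the distribution name `torch`.

```python
import platform
import sys

# Minimum versions taken from the release notes above.
MIN_PYTHON = (3, 8)
MIN_VERSIONS = {
    "torch": (2, 4, 0),
    "triton": (3, 2, 0),
    "triton-windows": (3, 2, 0),
}


def required_distributions(system=None):
    """Return the distributions to check for the given OS name."""
    system = system or platform.system()
    triton_dist = "triton-windows" if system == "Windows" else "triton"
    return ["torch", triton_dist]


def parse_version(text):
    """Parse the leading numeric release, e.g. '2.4.0+cu121' -> (2, 4, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    parts += [0] * (3 - len(parts))  # pad so '3.2' compares like '3.2.0'
    return tuple(parts)


def check_environment(system=None):
    """Return a list of human-readable problems; empty means all checks passed."""
    if sys.version_info[:2] < MIN_PYTHON:
        return ["Python %d.%d or newer is required" % MIN_PYTHON]
    from importlib import metadata  # available from Python 3.8

    problems = []
    for dist in required_distributions(system):
        try:
            installed = parse_version(metadata.version(dist))
        except metadata.PackageNotFoundError:
            problems.append(dist + " is not installed")
            continue
        if installed < MIN_VERSIONS[dist]:
            wanted = ".".join(str(n) for n in MIN_VERSIONS[dist])
            problems.append(dist + " is older than " + wanted)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```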

Tests & Benchmarks

Comprehensive test coverage for:
- Sparse convolution (forward / backward)
- Grid sample ops
- Hash map & neighbor cache
Benchmark scripts are included for training and inference scenarios.
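
As an illustration of what a grid sample test can compare against, here is a small pure-Python reference for nearest and trilinear sampling on a sparse voxel set. It is a sketch under stated assumptions and not the repository's test code: the function names are hypothetical, voxel values sit at integer coordinates, and missing (inactive) voxels are treated as zero, which may differ from FlexGEMM's behavior.

```python
import math
from itertools import product


def nearest_sample_sparse(voxels, point):
    """Return the value of the voxel closest to `point` (0.0 if inactive).

    voxels: dict mapping (x, y, z) integer tuples to float values
    point:  (px, py, pz) query position as floats
    """
    key = tuple(math.floor(p + 0.5) for p in point)
    return voxels.get(key, 0.0)


def trilinear_sample_sparse(voxels, point):
    """Trilinearly interpolate the 8 voxels surrounding `point`."""
    base = [math.floor(p) for p in point]
    frac = [p - b for p, b in zip(point, base)]
    value = 0.0
    for corner in product((0, 1), repeat=3):
        # Per-axis weight is frac toward the upper corner, 1 - frac toward the lower.
        weight = 1.0
        for f, c in zip(frac, corner):
            weight *= f if c else (1.0 - f)
        key = tuple(b + c for b, c in zip(base, corner))
        value += weight * voxels.get(key, 0.0)  # assumption: inactive voxel = 0
    return value
```

A kernel under test would then be compared against these values within a floating-point tolerance instead of requiring exact equality.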

Known Issues & Notes

- This is the first stable release; while core APIs and functionality are stable, performance tuning and edge-case fixes will continue in upcoming 1.0.x releases.
- Autotune cache behavior may evolve as more workloads are covered.
- Users are encouraged to report platform-specific issues, especially on Windows.

Acknowledgements

We thank @PozzettiAndrea, @geekuillaume, and @DQSSSSS for their contributions to cross-platform support, autotuning infrastructure, and uint64 hashing in FlexGEMM.

FlexGEMM v1.0.0