v8.1.10: Lazy loading for CuPy kernels and additional CuPy and MPS improvements
✨ New features and improvements
- Implement `pad` as a CUDA kernel (#860).
- Avoid h2d - d2h roundtrip when using `unflatten` (#861).
- Improve exception when CuPy/PyTorch MPS is not installed (#863).
- Lazily load custom `cupy` kernels (#870).
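
The general idea behind lazy loading can be sketched in plain Python. This is only an illustration of the pattern, not the code from #870: `_compile_kernel`, `get_kernel`, `double`, and `compile_calls` are hypothetical names, and a list comprehension stands in for a real CUDA kernel so that the snippet runs without a GPU.

```python
import functools

# Records each compilation so the laziness is observable (illustrative only).
compile_calls = []


def _compile_kernel(name):
    # Hypothetical stand-in for an expensive compilation step; a real
    # implementation would compile CUDA source through CuPy here.
    compile_calls.append(name)
    return lambda xs: [x * 2 for x in xs]  # placeholder "kernel"


@functools.lru_cache(maxsize=None)
def get_kernel(name):
    """Compile the named kernel on first use and reuse it afterwards."""
    return _compile_kernel(name)


def double(xs):
    # Nothing is compiled at import time; compilation happens on the
    # first call and the cached kernel is reused on later calls.
    return get_kernel("double")(xs)
```

With this pattern, importing the module costs nothing up front, and the compilation cost is paid once, on the first call that needs the kernel.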
🔴 Bug fixes
- Initially load TorchScript models on CPU for MPS devices (#864).
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @shadeMe, @svlandeg