Version 0.12.0
We're pleased to announce a new skorch release, bringing new features that might interest you.
The main changes relate to better integration with the Hugging Face ecosystem:
- Benefit from faster training and inference times thanks to easy integration with accelerate via skorch's AccelerateMixin.
- Better integration of tokenizers via skorch's HuggingfaceTokenizer and HuggingfacePretrainedTokenizer; you can even put Hugging Face tokenizers into an sklearn Pipeline and perform a grid search to find the best tokenizer hyperparameters.
- Automatically upload model checkpoints to Hugging Face Hub via skorch's HfHubStorage.
- Check out this notebook to see how to use skorch and Hugging Face together.
But this is not all. We have added the possibility to load the best model parameters at the end of training when using the EarlyStopping callback. We also added the possibility to remove unneeded attributes from the net after training when it is intended to be only used for prediction by calling the trim_for_prediction method. Moreover, we now show how to use skorch with PyTorch Geometric in this notebook.
As always, this release was made possible by outside contributors. Many thanks to:
- Alan deLevie (@adelevie)
- Cédric Rommel (@cedricrommel)
- Florian Pinault (@floriankrb)
- @terminator-ger
- Timo Kaufmann (@timokau)
- @TrellixVulnTeam
Find below the list of all changes:
Added
- Added load_best attribute to EarlyStopping callback to automatically load module weights of the best result at the end of training
- Added a method, trim_for_prediction, on the net classes, which trims the net from everything not required for using it for prediction; call this after fitting to reduce the size of the net
- Added experimental support for huggingface accelerate; use the provided mixin class to add advanced training capabilities provided by the accelerate library to skorch
- Add integration for Huggingface tokenizers; use skorch.hf.HuggingfaceTokenizer to train a Huggingface tokenizer on your custom data; use skorch.hf.HuggingfacePretrainedTokenizer to load a pre-trained Huggingface tokenizer
- Added support for creating model checkpoints on Hugging Face Hub using HfHubStorage
- Added a notebook that shows how to use skorch with PyTorch Geometric (#863)
Changed
- The minimum required scikit-learn version has been bumped to 0.22.0
- Initialize data loaders for training and validation dataset once per fit call instead of once per epoch (migration guide)
- It is now possible to call np.asarray with SliceDatasets (#858)
Fixed
- Fix a bug in SliceDataset that prevented it from being used with to_numpy (#858)
- Fix a bug that occurred when loading a net that has device set to None (#876)
- Fix a bug that in some cases could prevent a net trained with CUDA from being loaded on a machine without CUDA
- Enable skorch to work on M1/M2 Apple MacBooks (#884)