v3.7.4: New textcat layers and fo/nn language extensions
✨ New features and improvements
- Improve NumPy 2.0 compatibility (#13103).
- Add language extensions for Faroese and Norwegian Nynorsk (#13116).
- Add new `TextCatReduce.v1` layer for text classification (#13181).
- Add new `TextCatParametricAttention.v1` layer for text classification (#13201).
- Use `build` module for creating model packages by default (#13109).
- Add support for code loading to the `benchmark speed` command (#13247).
- Extend lexical attributes for English with more numerals (#13106).
- Warn about reloading dependencies after downloading models (#13081).
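
A minimal sketch of how some of the additions above might be tried out, assuming spaCy v3.7.4 or later is installed. The language codes `fo` and `nn` come from the release title; the registry names with the `spacy.` prefix are an assumption based on how spaCy registers its other built-in architectures, and all variable names are illustrative.

```python
import spacy

# Blank pipelines for the two newly added language extensions.
nlp_fo = spacy.blank("fo")  # Faroese
nlp_nn = spacy.blank("nn")  # Norwegian Nynorsk

# Look up the new text classification architectures in the registry.
# Assumption: they are registered under the "spacy." prefix, like
# spacy.TextCatBOW.v3 mentioned in the bug fixes.
reduce_arch = spacy.registry.architectures.get("spacy.TextCatReduce.v1")
attention_arch = spacy.registry.architectures.get(
    "spacy.TextCatParametricAttention.v1"
)

print(nlp_fo.lang, nlp_nn.lang)
print(callable(reduce_arch), callable(attention_arch))
```

The registry lookup only confirms that the architectures are available; wiring them into a `textcat` component is done through the training config as usual.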
🔴 Bug fixes
- #13259, #13304, #13321: Correctness fixes for multiprocessing support in `Language.pipe`.
- #13187: Typing and documentation fixes for `Doc`.
- #13086: Update `Tokenizer.explain` for special cases with whitespace.
- #13068: Fix displaCy span stacking.
- #13149: Add spacy.TextCatBOW.v3 to use the fixed `SparseLinear` layer.
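
For readers unfamiliar with `Tokenizer.explain`, the following sketch shows what the method reports, assuming a blank English pipeline. The input string is an arbitrary example and is not the whitespace special case addressed in #13086.

```python
import spacy

nlp = spacy.blank("en")
text = "(don't)"

# Each entry pairs the name of the tokenizer rule that matched
# with the token text it produced.
explained = nlp.tokenizer.explain(text)
for rule, token_text in explained:
    print(f"{rule}\t{token_text}")

# For input without extra whitespace, the explained tokens should
# line up with the tokens of the processed Doc.
explained_tokens = [token_text for _, token_text in explained]
actual_tokens = [token.text for token in nlp(text)]
```

Comparing `explained_tokens` with `actual_tokens` is a quick way to check that the debugging output agrees with the real tokenization.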
📖 Documentation and examples
- Many improvements and updates to the LLM documentation.
- Update `trf_data` examples and the transformer pipeline design section.
👥 Contributors
@adrianeboyd, @danieldk, @evornov, @honnibal, @ines, @lise-brinck, @ridge-kimani, @rmitsch, @shadeMe, @svlandeg