v0.6.6: Patch release
What's Changed
- [docs] add 530b paper by @jeffra in #1979
- small fix for the HF Bert models by @RezaYazdaniAminabadi in #1984
- Add unit test for various model families and inference tasks by @mrwyattii in #1981
- Fix for lightning tests by @mrwyattii in #1988
- fix typo when getting kernel dim in conv calculation by @cli99 in #1989
- Add torch-latest and torch-nightly CI workflows by @mrwyattii in #1990
- [bug] Add user-defined launcher args for MPI launcher by @mrwyattii in #1933
- Propagate max errorcode to deepspeed when using PDSH launcher by @jerrymannil in #1994
- [docs] add new build badges to landing page by @jeffra in #1998
- DeepSpeed Comm. Backend v1 by @awan-10 in #1985
- Relax DeepSpeed MoE ZeRO-1 Assertion by @Quentin-Anthony in #2007
- update CODEOWNERS by @conglongli in #2017
- [CI] force upgrade HF dependencies & output py env by @jeffra in #2015
- [inference] test suite for ds-kernels (bert, roberta, gpt2, gpt-neo, gpt-j) by @jeffra in #1992
- DeepSpeed examples refresh by @jeffra in #2021
- Fix transformer API for training-evaluation pipeline by @RezaYazdaniAminabadi in #2018
- DataLoader Length Fix by @Sanger2000 in #1718
- DeepSpeed Monitor Module (Master) by @Quentin-Anthony in #2013
- Use partition numel by @tjruwase in #2011
- fix import errors by @KMFODA in #2026
- Fix inference unit test import error catching by @mrwyattii in #2024
- Retain available params until last use by @tjruwase in #2016
- Split parameter offload from z3 by @tjruwase in #2009
- Fix flops profiler print statements by @mrwyattii in #2038
- Add compression papers by @conglongli in #2042
- Fix the half-precision version of CPU-Adam by @RezaYazdaniAminabadi in #2032
- Fix for AMD unit tests by @mrwyattii in #2047
- Fix wrong partition_id while copying fp32_params -> fp16 params in Z2 for MoE by @siddharth9820 in #2058
- Fix missing import in replace_module.py by @aphedges in #2050
- Comms Benchmarks by @Quentin-Anthony in #2040
- add ds inference paper by @jeffra in #2072
- Comments for better understanding of zero stage1_2 by @kisseternity in #2027
- [docs] fix broken read-the-docs build by @jeffra in #2075
- Fix building package without a GPU by @aphedges in #2049
- Fix partition id in the fp32->fp16 param copying step for z2+cpu-offload by @siddharth9820 in #2059
- Codeowner addendum and fix to small model debugging script by @samadejacobs in #2076
- remove require grad in params count by @cli99 in #2065
- Add missing newline for ZeroOneAdam parameter table by @manuelciosici in #2088
- Fix "None type has no len()" error by @xiazeyu in #2091
- Improving memory utilization of Z2+MoE by @siddharth9820 in #2079
New Contributors
- @jerrymannil made their first contribution in #1994
- @Sanger2000 made their first contribution in #1718
- @KMFODA made their first contribution in #2026
- @siddharth9820 made their first contribution in #2058
- @samadejacobs made their first contribution in #2076
- @xiazeyu made their first contribution in #2091
Full Changelog: v0.6.5...v0.6.6