This PR removes the AdamW optimizer from fairseq2. The main reason for having our own AdamW optimizer in fairseq2 (which essentially used the same `adamw` functional API as the PyTorch version) was to parity check fairseq's memory-efficient fp16 optimizer. Since AMP and FSDP mixed precision training have become ubiquitous, almost always with better model accuracy and minimal impact on memory, there is no need to maintain our own version. (Beyond that, fp16 itself is becoming increasingly obsolete compared to bf16 and lower-precision data types.)
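
For reference, a minimal sketch (not from this PR) of the setup it points to instead of a custom optimizer: the stock `torch.optim.AdamW` combined with bf16 autocast. The model, shapes, and hyperparameters below are illustrative only.

```python
import torch

# Illustrative model and optimizer; any nn.Module works the same way.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)

for _ in range(10):
    inputs = torch.randn(8, 16)
    optimizer.zero_grad(set_to_none=True)
    # Mixed precision via autocast; use device_type="cuda" on GPU.
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        loss = model(inputs).pow(2).mean()
    loss.backward()
    optimizer.step()
```

With bf16 autocast there is no need for loss scaling, which is part of why the fp16-specific optimizer path no longer pays for its maintenance cost.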