Releases · modelscope/ms-swift
v3.2.2
中文版
新特性
- Megatron-SWIFT发布。支持TP、PP、SP、CP等并行技术,对Qwen系、Llama系、Deepseek-R1蒸馏系等100+模型进行预训练和微调。支持streaming数据集和序列packing功能,以支持超大数据集并提升训练效率。更多内容参考Megatron-SWIFT训练文档。
- 支持多轮GRPO训练以适配例如Deep Search等多轮agent工具调用场景,示例代码参考这里。
- 支持mini-batch,降低训练时的显存消耗。参考GRPO训练文档。
- 支持iic/gme-Qwen2-VL-2B-Instruct等多模态模型的Embedding训练。具体参考embedding模型训练文档。
- 支持大模型和多模态大模型的多标签分类和回归任务的训练到部署。示例脚本参考这里。
- 支持在训练过程中使用EvalScope对模型进行评测,及时了解模型的训练效果。示例脚本参考评测文档。
- 书写外置plugin,以支持多模态模型LoRA训练LLM的同时,全参数训练ViT,并采用不同的学习率。避免ViT部分merge-lora造成的精度误差。示例脚本参考这里。
新模型
- iic/gme-Qwen2-VL-2B-Instruct系列
- Qwen/Qwen2.5-VL-32B-Instruct
- LLM-Research/gemma-3-4b-it系列
- deepseek-ai/DeepSeek-V3-0324
- mistralai/Mistral-Small-3.1-24B-Instruct-2503系列
English Version
New Features
- Release of Megatron-SWIFT: Megatron-SWIFT has been released, supporting various parallel technologies such as TP (Tensor Parallelism), PP (Pipeline Parallelism), SP (Sequence Parallelism), and CP (Context Parallelism) for pre-training and fine-tuning over 100 models, including the Qwen series, Llama series, and Deepseek-R1 distillation series. It also supports streaming datasets and sequence packing, enabling the handling of ultra-large datasets while improving training efficiency. For more details, refer to the Megatron-SWIFT Training Documentation.
- Support for Multi-turn GRPO Training: Supports multi-turn GRPO training to adapt to scenarios such as multi-turn agent tool calls in Deep Search. Example code can be found here.
- Mini-batch Training: Supports mini-batch training to reduce GPU memory consumption during training. Refer to the GRPO Training Documentation (a hedged command-line sketch follows this feature list).
- Embedding Training for Multimodal Models: Supports embedding training for multimodal models such as iic/gme-Qwen2-VL-2B-Instruct. For more information, refer to the Embedding Model Training Documentation.
- Multi-label Classification and Regression Tasks for Large Models and Multimodal Large Models: Supports end-to-end training and deployment for multi-label classification and regression tasks for large models and multimodal large models. Example scripts can be found here.
- Model Evaluation with EvalScope During Training: Supports model evaluation using EvalScope during training to monitor training performance in real time. Example scripts can be found in the Evaluation Documentation.
- Custom External Plugin for LoRA + ViT Training: Provides an external plugin to support LoRA training for LLMs (Large Language Models) while performing full-parameter training for ViTs (Vision Transformers) with different learning rates. This avoids precision errors caused by merging LoRA into the ViT portion. Example code can be found here.
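Most of the features above are driven through the swift command line. The sketch below is a hedged illustration of a GRPO LoRA run rather than a definitive recipe: the flag names are assumed to mirror the GRPO training documentation linked above, the model and dataset ids are placeholders, and the new mini-batch and multi-turn options are configured through additional arguments described in those docs.

```bash
# Hedged sketch of a GRPO LoRA run with vLLM-based rollouts.
# Flag names are assumed to mirror the GRPO training documentation; verify
# them against your installed ms-swift version. Mini-batch splitting and
# multi-turn rollouts are enabled via extra arguments described in the docs.
CUDA_VISIBLE_DEVICES=0,1 \
swift rlhf \
    --rlhf_type grpo \
    --model Qwen/Qwen2.5-7B-Instruct \
    --dataset AI-MO/NuminaMath-TIR \
    --train_type lora \
    --reward_funcs accuracy format \
    --num_generations 8 \
    --use_vllm true \
    --output_dir output
```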
New Models
- iic/gme-Qwen2-VL-2B-Instruct series
- Qwen/Qwen2.5-VL-32B-Instruct
- LLM-Research/gemma-3-4b-it series
- deepseek-ai/DeepSeek-V3-0324
- mistralai/Mistral-Small-3.1-24B-Instruct-2503 series
What's Changed
- update code doc by @hjh0119 in #3498
- fix readme by @Jintao-Huang in #3499
- feat: swanlab config add ms-swift by @Zeyi-Lin in #3500
- Support GME models by @tastelikefeet in #3513
- fix docs by @tastelikefeet in #3514
- Fix docs links by @tastelikefeet in #3516
- fix vllm memory leak by @hjh0119 in #3515
- [Docs] Easy .[all] install from git by @xihuai18 in #3518
- Fix bugs by @tastelikefeet in #3520
- support megatron by @Jintao-Huang in #2885
- fix megatron by @Jintao-Huang in #3527
- support gemma3 by @hjh0119 in #3492
- fix megatron pipeline parallel by @Jintao-Huang in #3529
- fix megatron tie_weight by @Jintao-Huang in #3530
- support megatron llama by @Jintao-Huang in #3532
- Support megatron llama3.1 3.2 by @Jintao-Huang in #3537
- Update LlavaHfTemplate to match the changed image-token handling for LLaVA and LLaVA-Next models when transformers > 4.47 by @zsxm1998 in #3521
- refactor llava-hf by @Jintao-Huang in #3538
- fix docs by @Jintao-Huang in #3539
- refactor get_megatron_model_meta by @Jintao-Huang in #3542
- Gather infonce loss and support hard negative samples by @tastelikefeet in #3548
- fix docs by @tastelikefeet in #3553
- fix unsloth by @tastelikefeet in #3554
- fix grpo mllm split modules by @hjh0119 in #3552
- grpo embedding layer lora by @hjh0119 in #3531
- update arguments by @Jintao-Huang in #3556
- update doc by @hjh0119 in #3557
- Support all models' embedding and mask fake negative by @tastelikefeet in #3563
- skip grpo first wake up by @hjh0119 in #3562
- move grpovllmengine import by @hjh0119 in #3568
- fix bugs & support dataset_name by @Jintao-Huang in #3565
- fix wrap by @tastelikefeet in #3572
- Feature: add train-eval loop by @Yunnglin in #3569
- compat vllm>=0.8 by @Jintao-Huang in #3574
- [grpo] Fix Incorrect Placement of Data in eval_queue During async_generate by @hjh0119 in #3573
- Fix lmdeploy 0.7.3 by @tastelikefeet in #3584
- support vit full llm lora by @Jintao-Huang in #3575
- support Mistral3.1-2503 by @hjh0119 in #3588
- Support megatron packing by @Jintao-Huang in #3595
- [megatron] support streaming by @Jintao-Huang in #3609
- fix rft by @lxline in #3602
- [template] refactor replace media tokens by @Jintao-Huang in #3614
- fix top_logprobs by @Jintao-Huang in #3616
- Fix bugs by @Jintao-Huang in #3619
- Support multi turn grpo by @tastelikefeet in #3615
- fix grpo npu context by @hjh0119 in #3597
- support regression multi-label by @Jintao-Huang in #3621
- Support peft 0.15 by @tastelikefeet in #3623
- update grpo warning by @hjh0119 in #3598
- fix grpo rm zero3 by @hjh0119 in #3626
- GRPO mini batch by @hjh0119 in #3205
- fix grpo warning with pt backend by @hjh0119 in #3629
- compat transformers 4.50 by @Jintao-Huang in #3625
- support train_sampler_random by @Jintao-Huang in #3631
- fix grpo multi turn by @tastelikefeet in #3632
- update docs by @Jintao-Huang in #3633
- Support deepseek v3 0324 by @Jintao-Huang in #3637
- fix grpo cosine reward by @hjh0119 in #3638
- fix grpo lora split module by @hjh0119 in #3635
- fix reward model by @Jintao-Huang in #3641
- support qwen2_5_vl_32b by @Jintao-Huang in #3642
- fix grpo warning by @hjh0119 in #3630
- grpo reset prefix cache by @...
v3.2.1
中文版
新特性
- GRPO支持vLLM的tensor parallel模式。例子参考这里。
- GRPO支持co-locate模式以及optimizer和model的offload,支持分批次导入权重和合并LoRA,节约显存资源,使72B模型的训练可以在四张A100上运行。例子参考这里。
- GRPO支持code ORM。最佳实践参考这里。
新模型
- Qwen/QwQ-32B系列
- inclusionAI/Ling-lite系列
New Features
- GRPO supports the tensor parallel mode of vLLM. Examples can be found here.
- GRPO supports co-locate mode with offloading of both the optimizer and the model, plus batched weight loading and LoRA merging. This saves GPU memory and enables training a 72B model on four A100 GPUs. Examples can be found here (a hedged command sketch follows this list).
- GRPO supports code ORM. Best practices can be found here.
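As a hedged illustration of the co-locate setup above: the tensor-parallel and offload flag names below are assumptions for readability, and the linked example scripts contain the authoritative arguments for the 72B-on-four-A100 configuration.

```bash
# Illustrative sketch only: colocated GRPO rollouts with vLLM tensor
# parallelism plus model/optimizer offloading between phases.
# The TP and offload flag names are assumptions; see the linked scripts.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift rlhf \
    --rlhf_type grpo \
    --model Qwen/Qwen2.5-72B-Instruct \
    --train_type lora \
    --use_vllm true \
    --vllm_tensor_parallel_size 4 \
    --offload_model true \
    --offload_optimizer true
```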
New Models
- Qwen/QwQ-32B series
- inclusionAI/Ling-lite series
What's Changed
- Support vllm LLMEngine by @Jintao-Huang in #3370
- update publish workflows by @Jintao-Huang in #3374
- support ling by @Jintao-Huang in #3379
- Support mp mode and hybrid mode of GRPO by @tastelikefeet in #3381
- fix name by @tastelikefeet in #3382
- fix web-ui infer by @Jintao-Huang in #3384
- fix bugs by @tastelikefeet in #3385
- fix bugs by @Jintao-Huang in #3386
- support Qwen/QwQ-32B by @Jintao-Huang in #3388
- support qwq-awq by @Jintao-Huang in #3391
- support lmdeploy qwen2_5_vl by @Jintao-Huang in #3394
- update infer_save by @Jintao-Huang in #3400
- update requirements by @Jintao-Huang in #3403
- fix ollama export by @Jintao-Huang in #3406
- Fix grpo engine by @tastelikefeet in #3412
- fix infer_stream by @Jintao-Huang in #3413
- FIx some comments, add dlc script by @tastelikefeet in #3419
- add comments and docs by @tastelikefeet in #3424
- fix issue 1663 by @Jintao-Huang in #3417
- Support GRPO model and optimizer offload, and split loading model by @tastelikefeet in #3427
- update wechat by @tastelikefeet in #3430
- Fix vllm random by @tastelikefeet in #3437
- fix seed by @Jintao-Huang in #3438
- fix_base_deploy by @Jintao-Huang in #3442
- fix GRPO device mismatch by @hjh0119 in #3440
- compat vllm==0.5.1 by @Jintao-Huang in #3444
- fix grpo multimodal doc by @mi804 in #3449
- support grpo code orm by @hjh0119 in #3431
- fix GRPO seed by @Jintao-Huang in #3458
- fix grpo multi nodes by @hjh0119 in #3462
- Fix tensor parallel hang by @tastelikefeet in #3464
- fix grpo trainer zero3 always gather parameters by @tcye in #3467
- fix grpo temperature inconsistency by @hjh0119 in #3468
- fix grad_norm nan by @Jintao-Huang in #3465
- fix grad_norm by @Jintao-Huang in #3469
- update minimax by @Jintao-Huang in #3471
- Support 72b script with 4 gpus by @tastelikefeet in #3472
- refactor packing by @Jintao-Huang in #3457
- Fix some docs by @tastelikefeet in #3475
- fix grpo ddp hang by @hjh0119 in #3476
- fix moe quant by @Jintao-Huang in #3478
- Delete duplicate parameters in train_72b_4gpu.sh by @Marquis03 in #3479
- fix image by @tastelikefeet in #3480
- fix infer gptq internvl2 by @Jintao-Huang in #3481
- Resume sample by @BC-A in #3460
- fix qwen2_vl flash_attn deepspeed by @Jintao-Huang in #3484
- Fix seed of tp=1 by @tastelikefeet in #3486
- fix use_cache by @Jintao-Huang in #3487
- Fix qwen2 5 vl grounding by @Jintao-Huang in #3491
- fix ovis2 device_map by @Jintao-Huang in #3496
- fix template.decode by @Jintao-Huang in #3497
New Contributors
- @tcye made their first contribution in #3467
- @Marquis03 made their first contribution in #3479
- @BC-A made their first contribution in #3460
Full Changelog: v3.2.0...v3.2.1
v3.2.0
中文版
新特性
- GRPO支持多vLLM/lmdeploy数据并行采样,支持异步采样,参考这里。多模态GRPO实验记录参考这里。
- swift deploy的infer_backend为pt时支持动态batch;流式推理接口修改(breaking change)。swift infer的infer_backend为vllm/lmdeploy时支持数据并行。参考这里。
- 支持muon优化器,参考这里。
新模型
- moonshotai/Moonlight-16B-A3B-Instruct
- LLM-Research/Phi-4-mini-instruct, LLM-Research/Phi-4-multimodal-instruct
- DeepSeek-V3-awq, deepseek-r1-awq
- Baichuan-M1-14B-Instruct
新数据集
- 多模态GRPO:
- lmms-lab/multimodal-open-r1-8k-verified
- okwinds/clevr_cogen_a_train
New Features
- GRPO supports multi-vLLM/lmdeploy data parallel sampling and asynchronous sampling. For more information, refer to here. Records of multi-modal GRPO experiments can be found here.
- When swift deploy's infer_backend is set to pt, dynamic batching is supported; the streaming inference interface has been modified (breaking change).
- When swift infer's infer_backend is set to vllm/lmdeploy, data parallelism is supported. Refer to here (a minimal sketch follows this list).
- Supports the muon optimizer. For more information, refer to here.
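For orientation, the backend is selected with the infer_backend argument mentioned above; the model id below is only a placeholder.

```bash
# Minimal sketch: the pt (PyTorch) backend now batches requests dynamically,
# while the vllm/lmdeploy backends support data-parallel inference.
swift infer --model Qwen/Qwen2.5-7B-Instruct --infer_backend pt
swift infer --model Qwen/Qwen2.5-7B-Instruct --infer_backend vllm
```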
New Models
- moonshotai/Moonlight-16B-A3B-Instruct
- LLM-Research/Phi-4-mini-instruct, LLM-Research/Phi-4-multimodal-instruct
- DeepSeek-V3-awq, deepseek-r1-awq
- Baichuan-M1-14B-Instruct
New Datasets
- Multi-modal GRPO:
- lmms-lab/multimodal-open-r1-8k-verified
- okwinds/clevr_cogen_a_train
What's Changed
- fix setup.py by @Jintao-Huang in #3198
- support vllm dp by @Jintao-Huang in #3201
- update dataset & fix bugs by @Jintao-Huang in #3203
- Support multiple vllms by @tastelikefeet in #3202
- update distill docs by @tastelikefeet in #3216
- compatible with trl0.16 by @hjh0119 in #3209
- support r1 awq by @Jintao-Huang in #3206
- fix grpo old_per_token_logps by @hjh0119 in #3220
- Support the generation of JanusPro models by @DaozeZhang in #3218
- Update the JanusPro-generation by @DaozeZhang in #3221
- fix load args by @Jintao-Huang in #3226
- update docs by @Jintao-Huang in #3230
- Speed up GRPO by @tastelikefeet in #3229
- fix docs zh by @Jintao-Huang in #3231
- fix deepseek_vl2 by @Jintao-Huang in #3233
- support moonlight by @Jintao-Huang in #3232
- support muon optimizer by @Jintao-Huang in #3234
- update docs by @Jintao-Huang in #3243
- fix grpo npu vllm by @hjh0119 in #3242
- fix grpo single card by @tastelikefeet in #3246
- save val_dataset by @Jintao-Huang in #3248
- fix grpo compat transformers==4.47.* by @Jintao-Huang in #3252
- grpo_countdown & fix format reward by @mi804 in #3269
- Support the base64 format of generated images for JanusPro by @DaozeZhang in #3265
- Fix typos by @co63oc in #3266
- compat lmdeploy 0.7 by @Jintao-Huang in #3256
- fix lmdeploy by @Jintao-Huang in #3274
- GRPO+LMDeploy 0.7 by @tastelikefeet in #3277
- Support max memory by @Jintao-Huang in #3282
- add lmdeploy dp shell by @Jintao-Huang in #3284
- Support Baichuan-M1-14B-Instruct by @DaozeZhang in #3271
- fix grpo top_k by @Jintao-Huang in #3293
- fix lmdeploy mllm in grpo by @tastelikefeet in #3296
- Update FAQ by @slin000111 in #3289
- fix: error when uploading model to huggingface by @xavier-h-10 in #3297
- add multimodal clevr exp by @mi804 in #3301
- update docs by @Jintao-Huang in #3304
- [refactor] patch_vllm by @Jintao-Huang in #3306
- GRPO mllm script by @hjh0119 in #3305
- [refactor & feat] support pt dynamic batch by @Jintao-Huang in #3278
- Support ZeRO++ by @tastelikefeet in #3315
- Revert pt engine batch infer by @Jintao-Huang in #3316
- optimize model_type by @Jintao-Huang in #3318
- Fix bugs & Update docs/datasets by @Jintao-Huang in #3322
- fix grpo zero3 by @hjh0119 in #3324
- fix grpo zero3 by @hjh0119 in #3326
- compat vllm>=0.5.1 lmdeploy>=0.5.0 by @Jintao-Huang in #3332
- update external plugins by @Jintao-Huang in #3334
- fix generation_config by @Jintao-Huang in #3335
- fix check_model error by @Jintao-Huang in #3336
- update get_model_tokenizer_with_flash_attn by @Jintao-Huang in #3337
- add geoqa grpo experiment by @mi804 in #3344
- fix max_memory by @Jintao-Huang in #3347
- support phi4-multimodal by @Jintao-Huang in #3350
- fix:fix bugs in cosine reward of GRPO by @youyc22 in #3358
- Remove entry including invalid ROADMAP link from English & Chinese documentation by @3manifold in #3357
- update docs by @Jintao-Huang in #3349
- Support the
- update docs by @Jintao-Huang in #3365
- add grpo openr1 multimodal experiment by @mi804 in #3368
- fix swift app format by @Jintao-Huang in #3367
New Contributors
- @xavier-h-10 made their first contribution in #3297
- @youyc22 made their first contribution in #3358
- @3manifold made their first contribution in #3357
Full Changelog: v3.1.1...v3.2.0
v3.1.1
中文版
新特性
- 支持大模型、多模态模型、Agent、多节点GRPO训练,参考这里。
- 支持Embedding模型训练,参考这里。
- swift sample支持MCTS、蒸馏方式数据采样,支持多模态模型采样。
- 支持自定义数据集评测,参考这里。
新模型
- AIDC-AI/Ovis2-2B系列
- Qwen/Qwen2.5-VL-72B-Instruct-AWQ系列
- stepfun-ai/GOT-OCR-2.0-hf
- stepfun-ai/Step-Audio-Chat
- mistralai/Mistral-Small-24B-Instruct-2501
新数据集
- GRPO相关
- AI-ModelScope/MATH-lighteval
- LLM-Research/xlam-function-calling-60k
- AI-MO/NuminaMath-TIR
- R1相关
- liucong/Chinese-DeepSeek-R1-Distill-data-110k-SFT
- modelscope/MathR, modelscope/MathR-32B-Distill
New Features
- Support for GRPO training of large models and multimodal models, including Agent and multi-node training. Refer to this documentation.
- Support for Embedding model training. Refer to this script.
- swift sample supports MCTS and distillation data sampling, as well as multimodal model sampling (a hedged command sketch follows this list).
- Support for custom dataset evaluation. Refer to this documentation.
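A hedged sketch of the sampling entry point follows; the --sampler_type value and the model/dataset ids are assumptions for illustration, so check the linked documentation for the exact interface.

```bash
# Illustrative only: draw samples with an MCTS-style sampler over a math
# dataset. Flag names and ids are assumptions; see the documentation above.
swift sample \
    --model Qwen/Qwen2.5-7B-Instruct \
    --dataset AI-ModelScope/MATH-lighteval \
    --sampler_type mcts
```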
New Models
- AIDC-AI/Ovis2-2B series
- Qwen/Qwen2.5-VL-72B-Instruct-AWQ series
- stepfun-ai/GOT-OCR-2.0-hf
- stepfun-ai/Step-Audio-Chat
- mistralai/Mistral-Small-24B-Instruct-2501
New Datasets
- Related to GRPO
- AI-ModelScope/MATH-lighteval
- LLM-Research/xlam-function-calling-60k
- AI-MO/NuminaMath-TIR
- Related to R1
- liucong/Chinese-DeepSeek-R1-Distill-data-110k-SFT
- modelscope/MathR, modelscope/MathR-32B-Distill
What's Changed
- Add evalscope native backend by @Yunnglin in #2981
- support mistralai/Mistral-Small-24B-Instruct-2501 by @Jintao-Huang in #3030
- MCTS Sampler by @lxline in #2967
- fix windows url by @Jintao-Huang in #3041
- Support sample multi modal models by @tastelikefeet in #3048
- Support sft embedding model by @tastelikefeet in #3039
- support GRPO by @hjh0119 in #3022
- fix grpo by @hjh0119 in #3050
- fix grpo by @Jintao-Huang in #3051
- update docs (fine-tuning) by @Jintao-Huang in #3052
- bump version by @Jintao-Huang in #3053
- fix grpo model_type by @Jintao-Huang in #3057
- update rlhf documents by @hjh0119 in #3055
- add grpo multinode scripts by @hjh0119 in #3059
- Fix orm env by @tastelikefeet in #3065
- Support external plugins by @tastelikefeet in #3066
- update docs by @Jintao-Huang in #3070
- fix grpo nan by @Jintao-Huang in #3075
- fix grpo metric_for_best_model by @Jintao-Huang in #3077
- register MathR by @mi804 in #3078
- fix accuracy reward by @hjh0119 in #3080
- fix SwiftModel by @Jintao-Huang in #3071
- Fix grpo vlm (internvl2.5) by @Jintao-Huang in #3081
- Refactor orm prm by @Jintao-Huang in #3085
- fix competition math by @tastelikefeet in #3086
- support cuda operations to npu by @tastelikefeet in #3087
- fix grpo temperature 0.7->0.9 by @Jintao-Huang in #3091
- support grpo vllm lora by @Jintao-Huang in #3095
- Feat: Eval custom dataset by @Yunnglin in #3093
- cosine and repetition reward for GRPO by @hjh0119 in #3079
- fix get_device by @Jintao-Huang in #3097
- Fix/grpo by @MrToy in #3101
- fix unsloth by @tastelikefeet in #3100
- support grpo npu by @Jintao-Huang in #3102
- fix grpo zero3 by @Jintao-Huang in #3104
- support log completions by @Jintao-Huang in #3110
- Fix typos by @co63oc in #3111
- update trl version by @Jintao-Huang in #3117
- fix eval docs by @Jintao-Huang in #3118
- Support llamapro for grpo by @tastelikefeet in #3119
- fix grpo trainer by @Jintao-Huang in #3120
- fix cleanup error by @Jintao-Huang in #3121
- Fix typos by @co63oc in #3123
- refactor patcher by @Jintao-Huang in #3124
- Support lmdeploy in GRPO by @tastelikefeet in #3126
- support stepfun-ai/Step-Audio-Chat by @Jintao-Huang in #3127
- update docs by @Jintao-Huang in #3131
- fix grpo pt infer generation_config by @Jintao-Huang in #3135
- support_local_path by @Jintao-Huang in #3140
- Support swanlab by @tastelikefeet in #3142
- fix grpo sample by @MrToy in #3144
- fix grpo vllm lora by @Jintao-Huang in #3134
- fix create_repo by @tastelikefeet in #3147
- fix grpo zero3 by @Jintao-Huang in #3149
- docs: report_to add swanlab by @Zeyi-Lin in #3158
- Support Ovis2 models by @DaozeZhang in #3163
- support grpo metric_for_best_model by @Jintao-Huang in #3155
- Fix ovis2 by @Jintao-Huang in #3169
- Support Agent GRPO by @tastelikefeet in #3170
- fix max_length error by @Jintao-Huang in #3173
- fix streaming by @Jintao-Huang in #3176
- Fix/agent grpo by @tastelikefeet in #3172
- Fix lmdeploy branch by @tastelikefeet in #3145
- fix internvl-4b by @Jintao-Huang in #3178
- refactor cosine orm by @Jintao-Huang in #3179
- fix sampler reaches max_length by @tastelikefeet in #3180
- Fix prm in sampler by @tastelikefeet in #3184
- Support GOT_OCR2_hf by @DaozeZhang in #3182
- Knowledge Distillation sampling by @mi804 in #3185
- compat vllm==0.7.2 by @Jintao-Huang in #3083
- support r1 dataset by @Jintao-Huang in #3191
- Refactor grpo dataset by @Jintao-Huang in #3192
- Add links to agent grpo by @tastelikefeet in #3193
New Contributors
- @MrToy made their first contribution in #3101
- @co63oc made their first contribution in #3111
- @Zeyi-Lin made their first contribution in #3158
Full Changelog: v3.1.0...v3.1.1
v3.1.0
中文版
新特性
- 支持swift sample命令进行数据采样,参考这里。
- 支持强化微调训练,目前已支持拒绝采样微调,参考这里。
- Grounding任务自定义数据格式重构,参考这里。
- swift infer支持输出推理速度和ACC/ROUGE/BLEU指标。
新模型
- Qwen/Qwen2.5-VL-3B-Instruct系列
- Qwen/Qwen2.5-7B-Instruct-1M系列
- deepseek-ai/Janus-Pro-1B系列
- bytedance-research/UI-TARS-2B-SFT系列
新数据集
- ServiceNow-AI/R1-Distill-SFT
- bespokelabs/Bespoke-Stratos-17k
- open-thoughts/OpenThoughts-114k
English Version
New Features
- Supports the swift sample command for data sampling; refer to here.
- Supports reinforcement fine-tuning, with current support for rejection sampling fine-tuning; refer to here.
- Grounding task custom data format restructuring; refer to here.
- swift infer supports reporting inference speed and ACC/ROUGE/BLEU metrics (a hedged sketch follows this list).
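As a rough sketch of the new inference metrics, batch inference over a validation file might look like the following; the --metric flag name and the file path are assumptions, so consult the referenced documentation.

```bash
# Hedged sketch: batch inference that reports generation speed and an
# accuracy-style metric against reference answers. The --metric flag and
# the dataset path are illustrative assumptions.
swift infer \
    --model Qwen/Qwen2.5-VL-3B-Instruct \
    --val_dataset ./val.jsonl \
    --metric acc
```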
New Models
- Qwen/Qwen2.5-VL-3B-Instruct Series
- Qwen/Qwen2.5-7B-Instruct-1M Series
- deepseek-ai/Janus-Pro-1B Series
- bytedance-research/UI-TARS-2B-SFT Series
New Datasets
- ServiceNow-AI/R1-Distill-SFT
- bespokelabs/Bespoke-Stratos-17k
- open-thoughts/OpenThoughts-114k
What's Changed
- add "enable_prefix_caching" args for vllm engine. by @Leoyzen in #2939
- Fix vllm docs link & fix web-ui by @Jintao-Huang in #2970
- Fix sample by @tastelikefeet in #2971
- support merge-lora & quant by @Jintao-Huang in #2973
- support create_checkpoint_symlink by @Jintao-Huang in #2975
- Sampling and RFT by @tastelikefeet in #2977
- support auto dataset mapping by @Jintao-Huang in #2976
- support qwen2_5 long by @Jintao-Huang in #2982
- sys_prompt from file by @lxline in #2980
- support bytedance-research/UI-TARS-2B-SFT series by @Jintao-Huang in #2987
- support Qwen/Qwen2.5-VL-3B-Instruct series model by @Jintao-Huang in #2996
- fix qwen2_5-vl by @Jintao-Huang in #2998
- support Qwen/Qwen2.5-VL-72B-Instruct by @Jintao-Huang in #2999
- refactor grounding by @Jintao-Huang in #3000
- compatible with trl v0.13 by @hjh0119 in #2992
- update R1 dataset by @Jintao-Huang in #3005
- fix qwen2.5-vl grounding (refactor) by @Jintao-Huang in #2979
- fix deploy by @Jintao-Huang in #3007
- support infer metric: acc/rouge or bleu by @Jintao-Huang in #3008
- support deepseek janus pro by @Jintao-Huang in #3009
- update readme by @Jintao-Huang in #3011
- fix parse_dict by @Jintao-Huang in #3012
- update docs by @Jintao-Huang in #3015
- Fix readme & update docs by @Jintao-Huang in #3018
- fix push to hub by @tastelikefeet in #3024
- Fix bugs by @Jintao-Huang in #3025
- fix bugs by @Jintao-Huang in #3026
- Fix qwen tool template to official format by @Leoyzen in #2988
- fix message merging strategy when multi-turn tool calling. by @Leoyzen in #2986
New Contributors
Full Changelog: v3.0.3...v3.1.0
v3.0.3
中文版
新特性
- 支持多模态大模型SequenceClassification架构用于多模态分类任务,参考这里。
- 支持多模态大模型reward model训练。
新模型
- Shanghai_AI_Laboratory/internlm3-8b-instruct
- OpenBMB/MiniCPM-o-2_6
- deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B系列
- bytedance-research/Valley-Eagle-7B
- LLM-Research/phi-4
- Qwen/Qwen2.5-Math-PRM-7B, Qwen/Qwen2.5-Math-PRM-72B
- MiniMaxAI/MiniMax-Text-01, MiniMaxAI/MiniMax-VL-01
English Version
New Features
- Support for the SequenceClassification architecture on multimodal large models, enabling multimodal classification tasks; see here (a hedged training sketch follows this list).
- Support for training multimodal reward models.
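A hedged sketch of multimodal classification fine-tuning follows; the model id and dataset path are placeholders, and the --task_type/--num_labels flags should be checked against the linked example.

```bash
# Illustrative sketch: LoRA fine-tuning of a multimodal model with a
# SequenceClassification head. Model and dataset names are placeholders.
swift sft \
    --model Qwen/Qwen2-VL-7B-Instruct \
    --task_type seq_cls \
    --num_labels 2 \
    --dataset ./my_mm_classification.jsonl \
    --train_type lora
```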
New Models
- Shanghai_AI_Laboratory/internlm3-8b-instruct
- OpenBMB/MiniCPM-o-2_6
- deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B series
- bytedance-research/Valley-Eagle-7B
- LLM-Research/phi-4
- Qwen/Qwen2.5-Math-PRM-7B, Qwen/Qwen2.5-Math-PRM-72B
- MiniMaxAI/MiniMax-Text-01, MiniMaxAI/MiniMax-VL-01
What's Changed
- update qlora shell by @Jintao-Huang in #2880
- fix docs by @Jintao-Huang in #2882
- support multi round dpo by @tastelikefeet in #2884
- Support infer n parameter by @tastelikefeet in #2893
- Fix qwen vl eval by @Jintao-Huang in #2892
- fix infer engine by @Jintao-Huang in #2898
- Add phi4 by @tastelikefeet in #2895
- fix link & bug by @Jintao-Huang in #2902
- update video infer examples by @Jintao-Huang in #2840
- Sampler by @tastelikefeet in #2905
- Fix a bug when lint code by @tastelikefeet in #2906
- Fix bugs by @Jintao-Huang in #2907
- update plugin doc by @tastelikefeet in #2908
- fix vllm tp stuck by @Jintao-Huang in #2909
- fix replace_video2image by @Jintao-Huang in #2913
- Fix read file mode by @tastelikefeet in #2915
- fix inspect init by @Jintao-Huang in #2916
- Update rm by @tastelikefeet in #2919
- Add internlm3 dense by @HIT-cwh in #2920
- internlm3 lint pass by @Jintao-Huang in #2923
- Fix web ui log by @tastelikefeet in #2924
- Support Valley by @lxline in #2921
- support minicpm-o by @Jintao-Huang in #2918
- fix vllm tp block by @Jintao-Huang in #2927
- update docs by @Jintao-Huang in #2929
- Support first prms by @tastelikefeet in #2926
- fix Valley by @lxline in #2931
- Support mllm seq_cls/rm by @Jintao-Huang in #2934
- fix bugs by @Jintao-Huang in #2938
- support deepseek-ai/DeepSeek-R1 by @Jintao-Huang in #2940
- Fix quant template by @Jintao-Huang in #2942
- Support minimax by @tastelikefeet in #2943
- Fix mllm seq cls by @Jintao-Huang in #2945
- support deepseek_r1_distill by @Jintao-Huang in #2946
- fix demo_hf by @Jintao-Huang in #2951
- fix infer_stream by @Jintao-Huang in #2952
- fix citest by @Jintao-Huang in #2953
- fix bugs by @Jintao-Huang in #2954
- update requirements by @Jintao-Huang in #2957
- update web-ui images by @tastelikefeet in #2958
- update quant_mllm shell by @Jintao-Huang in #2959
- fix max_length error print by @Jintao-Huang in #2960
- fix seq_cls patcher by @Jintao-Huang in #2963
- ppo compat transformers>=4.47.* by @Jintao-Huang in #2964
Full Changelog: v3.0.2...v3.0.3
v3.0.2
中文版
新特性
- 支持使用swift app开启可视化推理创空间,参考这里
- 支持大模型的RM和PPO训练,参考这里
- 支持SequenceClassification模型(含BERT)的BNB/GPTQ量化,参考这里
- 支持reward model的推理、部署和BNB/GPTQ量化
新模型
- ZhipuAI/cogagent-9b-20241220
- Reward Models: Shanghai_AI_Laboratory/internlm2-1_8b-reward系列, Qwen/Qwen2-Math-RM-72B系列, AI-ModelScope/Skywork-Reward-Llama-3.1-8B系列, AI-ModelScope/GRM_Llama3.1_8B_rewardmodel-ft系列
- AIDC-AI/Ovis1.6-Gemma2-27B, AIDC-AI/Ovis1.6-Llama3.2-3B
- PowerInfer/SmallThinker-3B-Preview
新数据集
- PowerInfer/LONGCOT-Refine-500K, PowerInfer/QWQ-LONGCOT-500K
English Version
New Features
- Support for using swift app to launch a visual inference creative space, see here (a hedged command sketch follows this list)
- Support for RM and PPO training of large models, see here
- Support for BNB/GPTQ quantization of SequenceClassification models (including BERT), see here
- Support for inference, deployment, and BNB/GPTQ quantization of reward models
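For orientation, the new entry points look roughly like the sketch below; the model id, checkpoint path, and quantization flags are assumptions, so refer to the linked documentation for the exact arguments.

```bash
# Launch the visual inference space for a model (placeholder model id).
swift app --model Qwen/Qwen2.5-7B-Instruct
# GPTQ-quantize a SequenceClassification / reward-model checkpoint.
# The checkpoint path and flag names are assumptions; see the linked docs
# for BNB quantization and reward-model specifics.
swift export --model ./output/checkpoint-100 --quant_method gptq --quant_bits 4
```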
New Models
- ZhipuAI/cogagent-9b-20241220
- Reward Models: Shanghai_AI_Laboratory/internlm2-1_8b-reward series, Qwen/Qwen2-Math-RM-72B series, AI-ModelScope/Skywork-Reward-Llama-3.1-8B series, AI-ModelScope/GRM_Llama3.1_8B_rewardmodel-ft series
- AIDC-AI/Ovis1.6-Gemma2-27B, AIDC-AI/Ovis1.6-Llama3.2-3B
- PowerInfer/SmallThinker-3B-Preview
New Datasets
- PowerInfer/LONGCOT-Refine-500K, PowerInfer/QWQ-LONGCOT-500K
What's Changed
- Fix app-ui dropdown by @tastelikefeet in #2787
- fix multi-lora by @Jintao-Huang in #2790
- fix stream infer by @Jintao-Huang in #2793
- fix some web-ui bugs by @tastelikefeet in #2794
- support swift app by @Jintao-Huang in #2792
- fix pt batch infer by @Jintao-Huang in #2800
- fix world_size by @Jintao-Huang in #2801
- update base_model deploy example by @Jintao-Huang in #2803
- fix glm4v by @Jintao-Huang in #2806
- fix swift deploy log error (repeat log) by @Jintao-Huang in #2808
- support ZhipuAI/cogagent-9b-20241220 by @Jintao-Huang in #2810
- fix citest by @Jintao-Huang in #2812
- fix enable_cache by @Jintao-Huang in #2813
- update docs (specific model arguments) by @Jintao-Huang in #2822
- add 'right' option for 'truncation_strategy' by @zsxm1998 in #2754
- Fix glm4v suffix by @Jintao-Huang in #2829
- Update padding side by @Jintao-Huang in #2832
- Update base_to_chat shell by @Jintao-Huang in #2833
- Fix bugs by @Jintao-Huang in #2838
- Fix some bugs by @tastelikefeet in #2848
- support reward_model by @Jintao-Huang in #2849
- Move optimizer to create_optimizer by @tastelikefeet in #2851
- fix post_init by @Jintao-Huang in #2855
- fix cache_name_file by @Jintao-Huang in #2856
- fix telechat template by @Jintao-Huang in #2857
- Update more models by @Jintao-Huang in #2852
- Support quant bert reward by @Jintao-Huang in #2859
- fix jsonl writer by @Jintao-Huang in #2860
- support reward model train by @Jintao-Huang in #2862
- fix vllm video by @Jintao-Huang in #2864
- support mps by @Jintao-Huang in #2866
- Update agent demo by @Jintao-Huang in #2867
- fix bugs by @Jintao-Huang in #2869
- Support ppo by @Jintao-Huang in #2783
- update citest by @Jintao-Huang in #2873
- fix dataset cache bugs by @Jintao-Huang in #2876
New Contributors
Full Changelog: v3.0.1...v3.0.2
v3.0.1
中文版
新特性
- 支持SequenceClassification模型的训练、推理和部署。可以查看以下例子:qwen2.5、bert。
- LlamaPro支持多模态模型,例如:qwen2vl、internvl2.5、llama3-vision等。
新模型
- Qwen/QVQ-72B-Preview
- iic/DocOwl2
- OpenGVLab/InternVL2-Pretrain-Models, OpenGVLab/InternVL2_5-4B-AWQ系列, OpenGVLab/InternVL2_5-1B-MPO系列
- deepseek-ai/DeepSeek-V3系列
- answerdotai/ModernBERT-base系列
- AI-ModelScope/paligemma2-3b-pt-224系列, AI-ModelScope/paligemma2-3b-ft-docci-448系列
- AI-ModelScope/Skywork-o1-Open-Llama-3.1-8B
English Version
New Features:
- Support for training, inference, and deployment of SequenceClassification models. You can check the following examples: qwen2.5, bert.
- LlamaPro supports multimodal models, such as qwen2vl, internvl2.5, and llama3-vision (a hedged command sketch follows this list).
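A hedged sketch of applying LlamaPro to a multimodal model follows; --train_type llamapro mirrors the tuner naming used elsewhere in ms-swift, and the model/dataset names are placeholders, so treat the whole invocation as an assumption.

```bash
# Illustrative only: LlamaPro-style block-expansion training on a multimodal
# model. Model and dataset names are placeholders.
swift sft \
    --model Qwen/Qwen2-VL-7B-Instruct \
    --train_type llamapro \
    --dataset ./my_mm_dataset.jsonl
```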
New Models:
- Qwen/QVQ-72B-Preview
- iic/DocOwl2
- OpenGVLab/InternVL2-Pretrain-Models, OpenGVLab/InternVL2_5-4B-AWQ series, OpenGVLab/InternVL2_5-1B-MPO series
- deepseek-ai/DeepSeek-V3 series
- answerdotai/ModernBERT-base series
- AI-ModelScope/paligemma2-3b-pt-224 series, AI-ModelScope/paligemma2-3b-ft-docci-448 series
- AI-ModelScope/Skywork-o1-Open-Llama-3.1-8B
What's Changed
- Fix mplug owl2, molmo by @Jintao-Huang in #2724
- fix batch_infer pad_token & florence by @Jintao-Huang in #2725
- Support qwen agent format by @tastelikefeet in #2722
- Support more internvl2.5 awq/mpo & internvl2 pretrain model by @Jintao-Huang in #2726
- support iic/DocOwl2 by @Jintao-Huang in #2728
- update examples by @Jintao-Huang in #2730
- remove files by @Jintao-Huang in #2732
- support paligemma2 by @Jintao-Huang in #2735
- fix windows by @Jintao-Huang in #2733
- support multi-modal llamapro by @tastelikefeet in #2738
- support AI-ModelScope/Skywork-o1-Open-Llama-3.1-8B by @Jintao-Huang in #2739
- Fix windows encoding gbk by @Jintao-Huang in #2741
- fix docs multimodal by @Jintao-Huang in #2742
- support SequenceClassification & update QVQ-72B-Preview by @Jintao-Huang in #2747
- fix web-ui by @Jintao-Huang in #2758
- fix bugs by @Jintao-Huang in #2761
- fix shell by @Jintao-Huang in #2764
- fix app-ui by @tastelikefeet in #2765
- support modern_bert & support bert deploy by @Jintao-Huang in #2767
- fix alpaca by @Jintao-Huang in #2771
- support txt by @Jintao-Huang in #2772
- fix telechat2 template by @Jintao-Huang in #2775
- Fix deepspeed by @Jintao-Huang in #2778
- fix qwen2vl by @Jintao-Huang in #2779
- Fix app ui by @tastelikefeet in #2780
- support deepseek-v3 by @Jintao-Huang in #2781
- Fix app-ui by @tastelikefeet in #2784
Full Changelog: v3.0.0...v3.0.1
v3.0.0
中文版
架构修改与新特性:
具体可以查看这里: https://swift.readthedocs.io/zh-cn/latest/Instruction/ReleaseNote3.0.html
新模型:
- OpenGVLab/InternVL2_5-1B等系列模型
- LLM-Research/Llama-3.3-70B-Instruct
- BAAI/Emu3-Gen
- deepseek-ai/DeepSeek-V2.5-1210, deepseek-ai/deepseek-vl2等系列模型
- Shanghai_AI_Laboratory/internlm-xcomposer2d5-ol-7b
- InfiniAI/Megrez-3b-Instruct, InfiniAI/Megrez-3B-Omni
- TeleAI/TeleChat2-3B等系列模型
English Version
Architecture Modifications and New Features:
For more details, please visit: https://swift.readthedocs.io/en/latest/Instruction/ReleaseNote3.0.html
New Models:
- OpenGVLab/InternVL2_5-1B series models
- LLM-Research/Llama-3.3-70B-Instruct
- BAAI/Emu3-Gen
- deepseek-ai/DeepSeek-V2.5-1210, deepseek-ai/deepseek-vl2 series models
- Shanghai_AI_Laboratory/internlm-xcomposer2d5-ol-7b
- InfiniAI/Megrez-3b-Instruct, InfiniAI/Megrez-3B-Omni
- TeleAI/TeleChat2-3B series models
What's Changed
- Refactor All Codes and bump version to 3.0 by @tastelikefeet in #2030
- fix doc by @tastelikefeet in #2545
- fix manifest by @tastelikefeet in #2546
- add doc 2.x by @tastelikefeet in #2548
- fix ui by @tastelikefeet in #2549
- fix infer by @tastelikefeet in #2550
- Refactor mllm by @Jintao-Huang in #2543
- fix ui by @tastelikefeet in #2552
- Fix ui by @tastelikefeet in #2556
- Update ddp infer doc by @Jintao-Huang in #2557
- fix docs by @Jintao-Huang in #2558
- Fix docs by @Jintao-Huang in #2561
- fix log by @tastelikefeet in #2564
- Fix the command line parameter doc by @Jintao-Huang in #2565
- fix context by @Jintao-Huang in #2568
- Documents Updates by @yrk111222 in #2574
- Revert "Documents Updates" by @Jintao-Huang in #2576
- fix hub param by @tastelikefeet in #2572
- Fix bugs by @Jintao-Huang in #2573
- Support internvl2.5 by @Jintao-Huang in #2575
- update english docs by @Jintao-Huang in #2577
- fix en docs by @Jintao-Huang in #2580
- fix docs & add custom example by @Jintao-Huang in #2581
- fix custom example by @Jintao-Huang in #2582
- support llama3.3 by @Jintao-Huang in #2584
- update acc_strategy & fix citest by @Jintao-Huang in #2583
- Support peft0.14 by @tastelikefeet in #2587
- update infer/deploy examples by @Jintao-Huang in #2588
- add image images mapping by @Jintao-Huang in #2594
- update llm sft notebook by @Jintao-Huang in #2599
- fix notebook by @Jintao-Huang in #2600
- Fix streaming by @Jintao-Huang in #2601
- Emu3 gen train by @mi804 in #2602
- compat mllm notebook by @Jintao-Huang in #2604
- Temporarily remove torchacc. by @Jintao-Huang in #2606
- update docs by @Jintao-Huang in #2607
- train and infer scripts for emu3_gen by @mi804 in #2610
- Uodate Document by @yrk111222 in #2615
- update memory usage of emu3-gen by @mi804 in #2611
- move prepare_model by @Jintao-Huang in #2614
- Update mllm notebook by @Jintao-Huang in #2617
- Support all-embedding / all-norm by @Jintao-Huang in #2619
- fix lmdeploy==0.5.* by @Jintao-Huang in #2621
- Support deepseek-ai/DeepSeek-V2.5-1210 by @Jintao-Huang in #2624
- fix use_reentrant gradient_checkpointing by @Jintao-Huang in #2625
- support reward model by @Jintao-Huang in #2628
- fix add_default_tag by @Jintao-Huang in #2631
- fix dataset by @Jintao-Huang in #2636
- fix bugs & update openbuddy models & update docs by @Jintao-Huang in #2638
- fix app-ui by @tastelikefeet in #2641
- Fix post encode by @Jintao-Huang in #2643
- fix bugs by @Jintao-Huang in #2645
- update truncation_strategy by @Jintao-Huang in #2647
- fix swift/Infinity-Instruct by @Jintao-Huang in #2651
- Support LoRA-GA by @lxline in #2650
- support deepseek_vl2 by @Jintao-Huang in #2654
- fix swift/SlimOrca by @Jintao-Huang in #2656
- fix swift/SlimOrca by @Jintao-Huang in #2657
- support Shanghai_AI_Laboratory/internlm-xcomposer2d5-ol-7b:audio by @Jintao-Huang in #2658
- support Shanghai_AI_Laboratory/internlm-xcomposer2d5-ol-7b:base by @Jintao-Huang in #2660
- fix hub by @tastelikefeet in #2661
- fix liger by @tastelikefeet in #2666
- support megrez by @Jintao-Huang in #2667
- fix unsloth resume training by @tastelikefeet in #2668
- fix dataset by @Jintao-Huang in #2670
- Fix bugs by @tastelikefeet in #2671
- fix deepseek_vl2 by @Jintao-Huang in #2675
- support adapters by @Jintao-Huang in #2633
- Support megrez omni by @Jintao-Huang in #2674
- fix docs by @Jintao-Huang in #2679
- fix megrez_omni by @Jintao-Huang in #2680
- fix infer by @Jintao-Huang in #2681
- Fix bugs by @Jintao-Huang in #2687
- Update readme by @Jintao-Huang in #2579
- update wechat by @Jintao-Huang in #2694
- fix readme by @Jintao-Huang in #2696
- Fix web-ui by @tastelikefeet in #2693
- Fix readme by @Jintao-Huang in #2697
- Update banner by @Jintao-Huang in #2699
- fix use_reentrant by @Jintao-Huang in #2700
- update examples by @Jintao-Huang in #2703
- fix eval strategy by @Jintao-Huang in #2707
- Update FAQ by @slin000111 in #2706
- qwen to Qwen by @Jintao-Huang in #2708
- fix timeout & web-ui by @Jintao-Huang in #2709
- Fix multi lora by @tastelikefeet in #2711
- support Qwen/QVQ-72B-Preview by @Jintao-Huang in #2712
- update examples by @Jintao-Huang in #2714
- fix deploy request_config by @Jintao-Huang in #2718
- fix examples by @Jintao-Huang in #2719
- fix gptq group_size by @Jintao-Huang in #2720
- Better error messages by @Jintao-Huang in #2721
New Contributors
- @yrk111222 made their first contribution in #2574
- @lxline made their first contribution in #2650
Full Changelog: v2.6.1...v3.0.0
v2.6.1
New Models:
New Datasets:
What's Changed
- support part tuner replace_key False by @tastelikefeet in #2438
- bump ms version by @tastelikefeet in #2449
- remove useless code by @tastelikefeet in #2453
- fix qwen2-vl position_ids by @Jintao-Huang in #2461
- fix peft is_multimodal by @Jintao-Huang in #2462
- fix qwen2vl pt infer by @Jintao-Huang in #2463
- [TorchAcc] Update padding strategy when using persistent cache by @eedalong in #2464
- fix kto by @Jintao-Huang in #2478
- Update Common QA by @slin000111 in #2475
- fix awq quant device_map by @Jintao-Huang in #2488
- Fix preprocess num proc by @Jintao-Huang in #2492
- Support marco o1 by @Jintao-Huang in #2496
- fix eval_dataset no by @Jintao-Huang in #2497
- support batch flattening collator by @eedalong in #2499
- fix latex-ocr by @Jintao-Huang in #2510
- support mPLUG-Owl3 241101 by @LukeForeverYoung in #2515
- support qwq by @Jintao-Huang in #2520
- support glm-edge & glm-edge-v by @Jintao-Huang in #2526
New Contributors
Full Changelog: v2.6.0...v2.6.1