[Cherry-Pick][Refactor] Replace --skip-mm-profiling with --deploy-modality text(#7048) by EmmonsCurse · Pull Request #7068 · PaddlePaddle/FastDeploy

EmmonsCurse · 2026-03-30T02:40:44Z

Cherry-pick of #7048 (authored by @kevincheng2) to release/2.5.

Motivation

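The original --skip-mm-profiling flag overlaps semantically with the existing --deploy-modality flag:
when deploying in text-only mode (--deploy-modality text), there is no need to reserve GPU memory for multimodal tokens in the first place.
A separate flag adds configuration complexity; reusing deploy_modality is more intuitive and consistent.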

Modifications

fastdeploy/engine/args_utils.py: removed the EngineArgs.skip_mm_profiling field and the
--skip-mm-profiling launch argument
fastdeploy/config.py: removed self.skip_mm_profiling = False from ModelConfig.__init__;
in FDConfig.get_max_chunk_tokens, changed the condition to
self.deploy_modality != DeployModality.TEXT,
so that when deploy_modality is text the method returns max_num_batched_tokens directly and skips adding the mm tokens
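
The condition change described above can be sketched as follows. This is a minimal, self-contained illustration, not the actual FastDeploy code: `SketchConfig`, the `MIXED` enum member, and the `mm_extra_tokens` field are hypothetical names introduced here, and any other conditions the real `FDConfig.get_max_chunk_tokens` checks are omitted.

```python
from dataclasses import dataclass
from enum import Enum


class DeployModality(Enum):
    """Hypothetical stand-in for FastDeploy's DeployModality; the PR only names TEXT."""

    TEXT = "text"
    MIXED = "mixed"  # assumed name for any non-text deployment mode


@dataclass
class SketchConfig:
    """Hypothetical, simplified stand-in for FDConfig."""

    max_num_batched_tokens: int
    mm_extra_tokens: int  # assumed: tokens reserved for multimodal inputs during profiling
    deploy_modality: DeployModality = DeployModality.MIXED

    def get_max_chunk_tokens(self) -> int:
        num_tokens = self.max_num_batched_tokens
        # Only add the multimodal overhead when the deployment is not text-only.
        if self.deploy_modality != DeployModality.TEXT:
            num_tokens += self.mm_extra_tokens
        return num_tokens


text_cfg = SketchConfig(8192, 1024, DeployModality.TEXT)
mixed_cfg = SketchConfig(8192, 1024)
print(text_cfg.get_max_chunk_tokens(), mixed_cfg.get_max_chunk_tokens())
```

In this sketch, a text-only deployment profiles with `max_num_batched_tokens` alone, which is the behavior the removed `skip_mm_profiling` flag used to provide.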

Usage or Command

# Deploy in text mode, skipping the mm token profiling overhead (replaces the old --skip-mm-profiling)
python -m fastdeploy.entrypoints.openai.api_server \
--deploy-modality text \
--model /path/to/model \
...

Checklist

Add at least a tag in the PR title.
Format your code, run pre-commit before commit.
Add unit tests. This change is a parameter refactor with a logically equivalent replacement; the existing config unit tests cover it.

…addlePaddle#7048)

* [Feature] Support --skip-mm-profiling to skip multimodal token overhead in profiling

## Motivation

When deploying multimodal models (such as Qwen2.5-VL and ERNIE4.5-VL), `get_max_chunk_tokens` adds the mm token count on top of the base token count so that GPU memory is reserved during the profiling stage. In some scenarios (for example, when the image token count is known to be small, or when the user wants to save GPU memory), users want to skip this extra multimodal token overhead and profile with the text token count only.

## Modifications

- `fastdeploy/engine/args_utils.py`: added the `skip_mm_profiling: bool = False` field to `EngineArgs` and the `--skip-mm-profiling` launch argument to the parser
- `fastdeploy/config.py`: added `self.skip_mm_profiling = False` to `ModelConfig.__init__`; added a `not self.model_config.skip_mm_profiling` check to `FDConfig.get_max_chunk_tokens`, so that when the flag is enabled the mm token addition is skipped and the base `num_tokens` is returned directly

## Usage or Command

Add the following flag when starting the service:

```bash
--skip-mm-profiling
```

## Checklist

- [x] Add at least a tag in the PR title.
- [x] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. This feature only passes a configuration parameter through and the logic is simple; the existing config unit tests cover it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [Refactor] Replace skip_mm_profiling with deploy_modality=text to skip mm profiling

## Motivation

The original `--skip-mm-profiling` flag overlaps semantically with the existing `deploy_modality` parameter: when deploying in text-only mode (`deploy_modality=text`), there is no need to reserve GPU memory for multimodal tokens in the first place. A separate flag adds configuration complexity; reusing `deploy_modality` is more intuitive and consistent.

## Modifications

- `fastdeploy/engine/args_utils.py`: removed the `EngineArgs.skip_mm_profiling` field and the `--skip-mm-profiling` launch argument
- `fastdeploy/config.py`: removed `self.skip_mm_profiling = False` from `ModelConfig.__init__`; in `FDConfig.get_max_chunk_tokens`, changed the condition to `self.deploy_modality != DeployModality.TEXT`, so that when deploy_modality is text the method returns `max_num_batched_tokens` directly and skips adding the mm tokens

## Usage or Command

```bash
# Deploy in text mode, skipping the mm token profiling overhead (replaces the old --skip-mm-profiling)
python -m fastdeploy.entrypoints.openai.api_server \
--deploy-modality text \
--model /path/to/model \
...
```

## Checklist

- [x] Add at least a tag in the PR title.
- [x] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. This change is a parameter refactor with a logically equivalent replacement; the existing config unit tests cover it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

paddle-bot · 2026-03-30T02:40:53Z

Thanks for your contribution!

EmmonsCurse mentioned this pull request Mar 30, 2026

[Refactor] Replace --skip-mm-profiling with --deploy-modality text #7048

Merged

3 tasks

EmmonsCurse had a problem deploying to Metax_ci March 30, 2026 02:40 — with GitHub Actions Failure

EmmonsCurse closed this Mar 30, 2026

EmmonsCurse wants to merge 1 commit into PaddlePaddle:release/2.5 from EmmonsCurse:cherry-pick/7048/release/2.5
