
Conversation

Contributor

Copilot AI commented Aug 26, 2025

  • Plan initial approach for adding fine-tuning warnings
  • Implement basic warnings for --use-pretrain-script scenarios
  • Add enhanced warnings with nlayer/repformer support
  • Remove special handling for nlayer parameters - treat all equally
  • Add warnings for descriptor config mismatches without --use-pretrain-script
  • Fix unnecessary warnings for default parameter values during config comparison
  • Refactor and simplify: consolidate duplicated warning functions into shared utilities

Current Status

The PR now includes comprehensive fine-tuning warnings with significant code deduplication:

Consolidated Warning System

  • Shared utilities: Moved duplicate warning functions to deepmd.utils.finetune
  • Removed duplications: Eliminated ~169 lines of duplicated code across PyTorch and Paddle backends
  • Consistent functionality: Both backends now use identical warning logic from shared functions

Warning Functions

  1. warn_descriptor_config_differences() - For --use-pretrain-script scenarios, where the input config is overwritten by the pretrained model
  2. warn_configuration_mismatch_during_finetune() - For scenarios without --use-pretrain-script, where only compatible state-dict parameters are loaded (a sketch of the first function follows below)
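
A minimal sketch of the first of these shared utilities, assuming it lives in deepmd.utils.finetune as described above; the exact signature, logger setup, and message wording are assumptions rather than the PR's actual code:

```python
# Sketch only; the actual implementation lives in deepmd.utils.finetune.
import logging

log = logging.getLogger(__name__)


def warn_descriptor_config_differences(input_descriptor, pretrained_descriptor):
    """Warn that --use-pretrain-script overwrites the input descriptor config."""
    keys = set(input_descriptor) | set(pretrained_descriptor)
    differences = [
        f"  {key}: {input_descriptor.get(key)} -> {pretrained_descriptor.get(key)}"
        for key in sorted(keys)
        if input_descriptor.get(key) != pretrained_descriptor.get(key)
    ]
    if differences:
        log.warning(
            "The following descriptor settings in input.json will be overwritten "
            "by the pretrained model:\n%s",
            "\n".join(differences),
        )
```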

Key Features

  • Smart default handling with normalization to avoid false warnings
  • Supports both PyTorch and Paddle backends seamlessly
  • Handles both single-task and multi-task fine-tuning scenarios
  • All parameters treated equally without special prominence
  • Maintains full backward compatibility

Benefits of Refactoring

  • Reduced maintenance burden: Single source of truth for warning logic
  • Consistency: Identical behavior across all backends
  • Cleaner codebase: Significant reduction in duplicate code
  • Easier testing: Shared functions can be tested centrally

This change provides users with clear visibility into configuration changes during fine-tuning while maintaining a clean, maintainable codebase.



Copilot AI changed the title from "[WIP] [BUG] Changing nlayer lead no error report while fine-tuning" to "feat(finetune): add warnings for descriptor config changes during fine-tuning" Aug 26, 2025
Copilot AI requested a review from njzjz August 26, 2025 18:45
Copilot finished work on behalf of njzjz August 26, 2025 18:45
@njzjz njzjz requested a review from iProzd August 27, 2025 02:05
@codecov

codecov bot commented Aug 27, 2025

Codecov Report

❌ Patch coverage is 96.80000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.20%. Comparing base (6349238) to head (54fa343).
⚠️ Report is 14 commits behind head on devel.

Files with missing lines | Patch % | Lines
deepmd/pd/train/training.py | 94.11% | 2 Missing ⚠️
deepmd/pt/train/training.py | 95.55% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4925      +/-   ##
==========================================
+ Coverage   84.29%   85.20%   +0.90%     
==========================================
  Files         703      705       +2     
  Lines       68728    75926    +7198     
  Branches     3573     3573              
==========================================
+ Hits        57935    64693    +6758     
- Misses       9653    10094     +441     
+ Partials     1140     1139       -1     

☔ View full report in Codecov by Sentry.

Copilot finished work on behalf of njzjz August 27, 2025 09:08
Copilot AI requested a review from njzjz August 27, 2025 09:09
@njzjz njzjz marked this pull request as ready for review August 29, 2025 02:53
Copilot AI review requested due to automatic review settings August 29, 2025 02:53
Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds a warning system to notify users when descriptor configuration parameters in input.json differ from those in the pretrained model during fine-tuning with the --use-pretrain-script flag. Currently, DeePMD-kit silently overwrites user configurations, which can be confusing when users attempt to modify parameters during fine-tuning.

  • Implements comprehensive descriptor config comparison with detailed before/after value warnings
  • Preserves existing fine-tuning behavior while adding transparency about configuration overwrites
  • Handles nested configuration structures and treats all parameters equally

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
deepmd/pt/utils/finetune.py | Adds the warning function and integrates it into the PyTorch fine-tuning workflow
deepmd/pd/utils/finetune.py | Adds an identical warning function and integrates it into the Paddle fine-tuning workflow
Comments suppressed due to low confidence (1)

deepmd/pt/utils/finetune.py:1

  • The difference detection logic is duplicated across both PyTorch and Paddle backends. Consider extracting this to a shared utility module to avoid code duplication and ensure consistent behavior across backends.
# SPDX-License-Identifier: LGPL-3.0-or-later

Comment on lines 46 to 48
differences.append(
    f" {key}: {input_descriptor[key]} -> {pretrained_descriptor[key]}"
)

Copilot AI Aug 29, 2025


For complex nested dictionaries or large configuration objects, string representation in f-strings could be expensive and potentially produce very long log messages. Consider truncating or using a more efficient representation for complex objects.
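
One possible way to bound the message size along these lines (the helper below is illustrative only and not part of the PR):

```python
import reprlib

# Illustrative only: cap the repr of large nested configs before logging.
_short = reprlib.Repr()
_short.maxlevel = 2    # do not descend deeply into nested dicts
_short.maxdict = 6     # show at most 6 dict items per level
_short.maxstring = 80  # truncate long strings


def short_repr(value) -> str:
    """Return a bounded-length representation suitable for log messages."""
    return _short.repr(value)


# e.g. differences.append(f" {key}: {short_repr(input_descriptor[key])} -> "
#                         f"{short_repr(pretrained_descriptor[key])}")
```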

@iProzd
Collaborator

iProzd commented Sep 2, 2025

It works for the --use-pretrain-script option, but it does not cover the case from the relevant issue without the --use-pretrain-script option. In that case, the configurations in input.json will not be overwritten by those in the pretrained model.

The issue pointed out that when the configurations differ and --use-pretrain-script is not given, the state_dict initialization will only pick the relevant keys from the pretrained model (e.g. the first 6 layers of a 16-layer model). What needs to be added is an additional check before this state_dict initialization, in collect_single_finetune_params in deepmd/pt/train/training.py.
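
For illustration, a generic PyTorch sketch of the "pick only the relevant keys" behaviour described above; the checkpoint layout and the helper name are assumptions, not the actual code in collect_single_finetune_params:

```python
import torch


def load_compatible_params(model: torch.nn.Module, checkpoint_path: str) -> list[str]:
    """Copy only parameters whose names and shapes match; return the skipped keys.

    Generic sketch: with a 6-layer fine-tune model and a 16-layer pretrained
    checkpoint, only the matching keys are copied and the rest are silently
    dropped unless a warning is emitted beforehand.
    """
    pretrained_state = torch.load(checkpoint_path, map_location="cpu")["model"]
    model_state = model.state_dict()
    compatible = {
        k: v
        for k, v in pretrained_state.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    skipped = sorted(set(pretrained_state) - set(compatible))
    model_state.update(compatible)
    model.load_state_dict(model_state)
    return skipped
```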

@njzjz
Member

njzjz commented Sep 2, 2025

It works for the --use-pretrain-script option, but it does not cover the case from the relevant issue without the --use-pretrain-script option. In that case, the configurations in input.json will not be overwritten by those in the pretrained model.

The issue pointed out that when the configurations differ and --use-pretrain-script is not given, the state_dict initialization will only pick the relevant keys from the pretrained model (e.g. the first 6 layers of a 16-layer model). What needs to be added is an additional check before this state_dict initialization, in collect_single_finetune_params in deepmd/pt/train/training.py.

@copilot

Contributor Author

Copilot AI commented Sep 2, 2025

It works for --use-pretrain-script option but it's not the case in the relevant issue without --use-pretrain-script option. In that case, configurations in the input.json will not be overwirtten b...

Added warnings for descriptor configuration mismatches when NOT using the --use-pretrain-script option. The check now lives in the collect_single_finetune_params function in both the PyTorch and Paddle training modules. When configurations differ, it warns that state_dict initialization will only pick the relevant keys from the pretrained model. (54fa343)

It works for --use-pretrain-script option but it's not the case in the relevant issue without --use-pretrain-script option. In that case, configurations in the input.json will not be overwirtten...

Implemented the additional check in the collect_single_finetune_params function in deepmd/pt/train/training.py (and deepmd/pd/train/training.py). The warning now appears before state_dict initialization when the descriptor configurations don't match between input.json and the pretrained model, helping users understand that only compatible parameters will be loaded from the pretrained model. (54fa343)
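
A minimal sketch of what such a pre-loading check might look like, assuming the shared utility described in the status summary above; the signature and message wording are assumptions:

```python
import logging

log = logging.getLogger(__name__)


def warn_configuration_mismatch_during_finetune(input_descriptor, pretrained_descriptor):
    """Warn, before state_dict initialization, that only compatible parameters
    will be taken from the pretrained model."""
    mismatched = sorted(
        key
        for key in set(input_descriptor) | set(pretrained_descriptor)
        if input_descriptor.get(key) != pretrained_descriptor.get(key)
    )
    if mismatched:
        log.warning(
            "Descriptor settings in input.json differ from the pretrained model (%s); "
            "without --use-pretrain-script only the compatible parameters will be "
            "loaded from the pretrained state dict.",
            ", ".join(mismatched),
        )
```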

Copilot finished work on behalf of njzjz September 2, 2025 12:41
@iProzd
Collaborator

iProzd commented Sep 3, 2025

Another issue: the model params in self.model_params come from normalize in deepmd/utils/argcheck.py, which adds default values for keys absent from the pretraining input.json. The new input.json here, however, has not been normalized, so some missing keys are not filled with default values. The warnings may be unnecessary for those default values.

@njzjz
Member

njzjz commented Sep 3, 2025

Another issue: the model params in self.model_params come from normalize in deepmd/utils/argcheck.py, which adds default values for keys absent from the pretraining input.json. The new input.json here, however, has not been normalized, so some missing keys are not filled with default values. The warnings may be unnecessary for those default values.

@copilot

Contributor Author

Copilot AI commented Sep 3, 2025

Another issue: model params in self.model_params comes from normalize in deepmd/utils/argcheck.py, which will add default values for absent keys in input.json for pretraining. While the new input....

Fixed the issue with unnecessary warnings for default parameter values. The problem was that self.model_params contains the normalized input (with defaults added), while the comparison was against the raw pretrained model config, causing warnings for parameters users never explicitly set.

The solution normalizes both configurations before comparison using deepmd.utils.argcheck.normalize(), ensuring that only meaningful differences trigger warnings. For example, if input.json doesn't specify activation_function but the pretrained model explicitly stores the same default value, no warning is shown. (4b7d82d)
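
A sketch of the normalized comparison, assuming both sides are full training inputs of the kind deepmd.utils.argcheck.normalize accepts; the helper name and dictionary layout are assumptions:

```python
from deepmd.utils.argcheck import normalize


def descriptors_differ(input_config: dict, pretrained_config: dict) -> bool:
    """Compare descriptor sections only after both configs are normalized,
    so that defaults the user never set are filled identically on both sides."""
    norm_input = normalize(input_config)
    norm_pretrained = normalize(pretrained_config)
    return norm_input["model"]["descriptor"] != norm_pretrained["model"]["descriptor"]
```

With this, an input.json that omits activation_function no longer triggers a warning when the pretrained model stores the same value explicitly as a default.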

Copilot AI changed the title from "feat(finetune): add warnings for descriptor config changes during fine-tuning" to "feat(finetune): add warnings for descriptor config changes during fine-tuning with proper default handling" Sep 3, 2025
Copilot finished work on behalf of njzjz September 3, 2025 08:15
@njzjz njzjz marked this pull request as draft September 3, 2025 08:20
@iProzd
Collaborator

iProzd commented Sep 5, 2025

@copilot can you wrap up and simplify all the modifications in this PR? It seems that there are lots of duplicated lines.

@njzjz
Member

njzjz commented Sep 5, 2025

@copilot can you wrap up and simplify all the modifications in this PR? It seems that there are lots of duplicated lines.

Contributor Author

Copilot AI commented Sep 5, 2025

@copilot can you wrap up and simplify all the modifications in this PR? It seems that there are lots of duplicated lines.

Consolidated and simplified all modifications by eliminating code duplication across backends. The refactoring moved duplicate warning functions to shared utilities in deepmd.utils.finetune, removing ~169 lines of duplicated code while maintaining identical functionality. Both PyTorch and Paddle backends now use the same warning logic from centralized functions. (5f38e2f)

Copilot finished work on behalf of njzjz September 5, 2025 05:23