
Conversation

Contributor

Copilot AI commented Aug 26, 2025

  • Plan initial approach for adding fine-tuning warnings
  • Implement basic warnings for --use-pretrain-script scenarios
  • Add enhanced warnings with nlayer/repformer support
  • Remove special handling for nlayer parameters - treat all equally
  • Add warnings for descriptor config mismatches without --use-pretrain-script
  • Fix unnecessary warnings for default parameter values during config comparison
  • Refactor and simplify: consolidate duplicated warning functions into shared utilities

Current Status

The PR now includes comprehensive fine-tuning warnings with significant code deduplication:

Consolidated Warning System

  • Shared utilities: Moved duplicate warning functions to deepmd.utils.finetune
  • Removed duplications: Eliminated ~169 lines of duplicated code across PyTorch and Paddle backends
  • Consistent functionality: Both backends now use identical warning logic from shared functions

Warning Functions

  1. warn_descriptor_config_differences() - For --use-pretrain-script scenarios, where the input config is overwritten by the pretrained model
  2. warn_configuration_mismatch_during_finetune() - For scenarios without --use-pretrain-script, where only compatible state-dict parameters are loaded (a sketch of the first function follows below)
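
A minimal sketch of the first of these shared utilities, assuming it lives in deepmd.utils.finetune as described above; the exact signature, logger setup, and message wording are assumptions rather than the PR's actual code:

```python
# Sketch only; the actual implementation lives in deepmd.utils.finetune.
import logging

log = logging.getLogger(__name__)


def warn_descriptor_config_differences(input_descriptor, pretrained_descriptor):
    """Warn that --use-pretrain-script overwrites the input descriptor config."""
    keys = set(input_descriptor) | set(pretrained_descriptor)
    differences = [
        f"  {key}: {input_descriptor.get(key)} -> {pretrained_descriptor.get(key)}"
        for key in sorted(keys)
        if input_descriptor.get(key) != pretrained_descriptor.get(key)
    ]
    if differences:
        log.warning(
            "The following descriptor settings in input.json will be overwritten "
            "by the pretrained model:\n%s",
            "\n".join(differences),
        )
```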

Key Features

  • Smart default handling with normalization to avoid false warnings
  • Supports both PyTorch and Paddle backends seamlessly
  • Handles both single-task and multi-task fine-tuning scenarios
  • All parameters treated equally without special prominence
  • Maintains full backward compatibility

Benefits of Refactoring

  • Reduced maintenance burden: Single source of truth for warning logic
  • Consistency: Identical behavior across all backends
  • Cleaner codebase: Significant reduction in duplicate code
  • Easier testing: Shared functions can be tested centrally

This change provides users with clear visibility into configuration changes during fine-tuning while maintaining a clean, maintainable codebase.



Copilot AI changed the title from "[WIP] [BUG] Changing nlayer lead no error report while fine-tuning" to "feat(finetune): add warnings for descriptor config changes during fine-tuning" Aug 26, 2025
Copilot AI requested a review from njzjz August 26, 2025 18:45
Copilot finished work on behalf of njzjz August 26, 2025 18:45
@njzjz njzjz requested a review from iProzd August 27, 2025 02:05
@codecov

codecov bot commented Aug 27, 2025

Codecov Report

❌ Patch coverage is 96.80000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.20%. Comparing base (6349238) to head (54fa343).
⚠️ Report is 14 commits behind head on devel.

Files with missing lines | Patch % | Lines
deepmd/pd/train/training.py | 94.11% | 2 Missing ⚠️
deepmd/pt/train/training.py | 95.55% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4925      +/-   ##
==========================================
+ Coverage   84.29%   85.20%   +0.90%     
==========================================
  Files         703      705       +2     
  Lines       68728    75926    +7198     
  Branches     3573     3573              
==========================================
+ Hits        57935    64693    +6758     
- Misses       9653    10094     +441     
+ Partials     1140     1139       -1     

☔ View full report in Codecov by Sentry.

Copilot finished work on behalf of njzjz August 27, 2025 09:08
Copilot AI requested a review from njzjz August 27, 2025 09:09
@njzjz njzjz marked this pull request as ready for review August 29, 2025 02:53
Copilot AI review requested due to automatic review settings August 29, 2025 02:53
Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds a warning system to notify users when descriptor configuration parameters in input.json differ from those in the pretrained model during fine-tuning with the --use-pretrain-script flag. Currently, DeePMD-kit silently overwrites user configurations, which can be confusing when users attempt to modify parameters during fine-tuning.

  • Implements comprehensive descriptor config comparison with detailed before/after value warnings
  • Preserves existing fine-tuning behavior while adding transparency about configuration overwrites
  • Handles nested configuration structures and treats all parameters equally

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
deepmd/pt/utils/finetune.py | Adds the warning function and integrates it into the PyTorch fine-tuning workflow
deepmd/pd/utils/finetune.py | Adds an identical warning function and integrates it into the Paddle fine-tuning workflow
Comments suppressed due to low confidence (1)

deepmd/pt/utils/finetune.py:1

  • The difference detection logic is duplicated across both PyTorch and Paddle backends. Consider extracting this to a shared utility module to avoid code duplication and ensure consistent behavior across backends.
# SPDX-License-Identifier: LGPL-3.0-or-later

Comment on lines 46 to 48
differences.append(
    f" {key}: {input_descriptor[key]} -> {pretrained_descriptor[key]}"
)

Copilot AI Aug 29, 2025


For complex nested dictionaries or large configuration objects, string representation in f-strings could be expensive and potentially produce very long log messages. Consider truncating or using a more efficient representation for complex objects.
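
One possible way to bound the message size along these lines (the helper below is illustrative only and not part of the PR):

```python
import reprlib

# Illustrative only: cap the repr of large nested configs before logging.
_short = reprlib.Repr()
_short.maxlevel = 2    # do not descend deeply into nested dicts
_short.maxdict = 6     # show at most 6 dict items per level
_short.maxstring = 80  # truncate long strings


def short_repr(value) -> str:
    """Return a bounded-length representation suitable for log messages."""
    return _short.repr(value)


# e.g. differences.append(f" {key}: {short_repr(input_descriptor[key])} -> "
#                         f"{short_repr(pretrained_descriptor[key])}")
```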

@iProzd
Collaborator

iProzd commented Sep 2, 2025

It works for the --use-pretrain-script option, but it does not cover the case from the relevant issue without the --use-pretrain-script option. In that case, the configurations in input.json will not be overwritten by those in the pretrained model.

The issue pointed out that when the configurations differ and --use-pretrain-script is not given, the state_dict initialization will only pick the relevant keys from the pretrained model (e.g. the first 6 layers of a 16-layer model). What needs to be added is an additional check before this state_dict initialization, in collect_single_finetune_params in deepmd/pt/train/training.py.
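
For illustration, a generic PyTorch sketch of the "pick only the relevant keys" behaviour described above; the checkpoint layout and the helper name are assumptions, not the actual code in collect_single_finetune_params:

```python
import torch


def load_compatible_params(model: torch.nn.Module, checkpoint_path: str) -> list[str]:
    """Copy only parameters whose names and shapes match; return the skipped keys.

    Generic sketch: with a 6-layer fine-tune model and a 16-layer pretrained
    checkpoint, only the matching keys are copied and the rest are silently
    dropped unless a warning is emitted beforehand.
    """
    pretrained_state = torch.load(checkpoint_path, map_location="cpu")["model"]
    model_state = model.state_dict()
    compatible = {
        k: v
        for k, v in pretrained_state.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    skipped = sorted(set(pretrained_state) - set(compatible))
    model_state.update(compatible)
    model.load_state_dict(model_state)
    return skipped
```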

@njzjz
Member

njzjz commented Sep 2, 2025

It works for the --use-pretrain-script option, but it does not cover the case from the relevant issue without the --use-pretrain-script option. In that case, the configurations in input.json will not be overwritten by those in the pretrained model.

The issue pointed out that when the configurations differ and --use-pretrain-script is not given, the state_dict initialization will only pick the relevant keys from the pretrained model (e.g. the first 6 layers of a 16-layer model). What needs to be added is an additional check before this state_dict initialization, in collect_single_finetune_params in deepmd/pt/train/training.py.

@copilot

Contributor Author

Copilot AI commented Sep 2, 2025

It works for --use-pretrain-script option but it's not the case in the relevant issue without --use-pretrain-script option. In that case, configurations in the input.json will not be overwirtten b...

Added warnings for descriptor configuration mismatches when NOT using the --use-pretrain-script option. The check now lives in the collect_single_finetune_params function in both the PyTorch and Paddle training modules. When configurations differ, it warns that state_dict initialization will only pick the relevant keys from the pretrained model. (54fa343)

It works for --use-pretrain-script option but it's not the case in the relevant issue without --use-pretrain-script option. In that case, configurations in the input.json will not be overwirtten...

Implemented the additional check in the collect_single_finetune_params function in deepmd/pt/train/training.py (and deepmd/pd/train/training.py). The warning now appears before state_dict initialization when the descriptor configurations don't match between input.json and the pretrained model, helping users understand that only compatible parameters will be loaded from the pretrained model. (54fa343)
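
A minimal sketch of what such a pre-loading check might look like, assuming the shared utility described in the status summary above; the signature and message wording are assumptions:

```python
import logging

log = logging.getLogger(__name__)


def warn_configuration_mismatch_during_finetune(input_descriptor, pretrained_descriptor):
    """Warn, before state_dict initialization, that only compatible parameters
    will be taken from the pretrained model."""
    mismatched = sorted(
        key
        for key in set(input_descriptor) | set(pretrained_descriptor)
        if input_descriptor.get(key) != pretrained_descriptor.get(key)
    )
    if mismatched:
        log.warning(
            "Descriptor settings in input.json differ from the pretrained model (%s); "
            "without --use-pretrain-script only the compatible parameters will be "
            "loaded from the pretrained state dict.",
            ", ".join(mismatched),
        )
```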

Copilot finished work on behalf of njzjz September 2, 2025 12:41
@iProzd
Collaborator

iProzd commented Sep 3, 2025

Another issue: the model params in self.model_params come from normalize in deepmd/utils/argcheck.py, which adds default values for keys absent from the pretraining input.json. The new input.json here, however, has not been normalized, so some missing keys are not filled with default values. The warnings may be unnecessary for those default values.

@njzjz
Member

njzjz commented Sep 3, 2025

Another issue: the model params in self.model_params come from normalize in deepmd/utils/argcheck.py, which adds default values for keys absent from the pretraining input.json. The new input.json here, however, has not been normalized, so some missing keys are not filled with default values. The warnings may be unnecessary for those default values.

@copilot

Contributor Author

Copilot AI commented Sep 3, 2025

Another issue: model params in self.model_params comes from normalize in deepmd/utils/argcheck.py, which will add default values for absent keys in input.json for pretraining. While the new input....

Fixed the issue with unnecessary warnings for default parameter values. The problem was that self.model_params contains the normalized input (with defaults added), while the comparison was against the raw pretrained model config, causing warnings for parameters users never explicitly set.

The solution normalizes both configurations before comparison using deepmd.utils.argcheck.normalize(), ensuring that only meaningful differences trigger warnings. For example, if input.json doesn't specify activation_function but the pretrained model explicitly stores the same default value, no warning is shown. (4b7d82d)
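
A sketch of the normalized comparison, assuming both sides are full training inputs of the kind deepmd.utils.argcheck.normalize accepts; the helper name and dictionary layout are assumptions:

```python
from deepmd.utils.argcheck import normalize


def descriptors_differ(input_config: dict, pretrained_config: dict) -> bool:
    """Compare descriptor sections only after both configs are normalized,
    so that defaults the user never set are filled identically on both sides."""
    norm_input = normalize(input_config)
    norm_pretrained = normalize(pretrained_config)
    return norm_input["model"]["descriptor"] != norm_pretrained["model"]["descriptor"]
```

With this, an input.json that omits activation_function no longer triggers a warning when the pretrained model stores the same value explicitly as a default.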

Copilot AI changed the title from "feat(finetune): add warnings for descriptor config changes during fine-tuning" to "feat(finetune): add warnings for descriptor config changes during fine-tuning with proper default handling" Sep 3, 2025
Copilot finished work on behalf of njzjz September 3, 2025 08:15
@njzjz njzjz marked this pull request as draft September 3, 2025 08:20
@iProzd
Collaborator

iProzd commented Sep 5, 2025

@copilot can you wrap up and simplify all the modifications in this PR? It seems that there are lots of duplicated lines.

@njzjz
Member

njzjz commented Sep 5, 2025

@copilot can you wrap up and simplify all the modifications in this PR? It seems that there are lots of duplicated lines.

Contributor Author

Copilot AI commented Sep 5, 2025

@copilot can you wrap up and simplify all the modifications in this PR? It seems that there are lots of duplicated lines.

Consolidated and simplified all modifications by eliminating code duplication across backends. The refactoring moved duplicate warning functions to shared utilities in deepmd.utils.finetune, removing ~169 lines of duplicated code while maintaining identical functionality. Both PyTorch and Paddle backends now use the same warning logic from centralized functions. (5f38e2f)

Copilot finished work on behalf of njzjz September 5, 2025 05:23