Releases: deepmodeling/deepmd-kit
v3.0.0rc0
DeePMD-kit v3: Multiple-backend Framework, DPA-2 Large Atomic Model, and Plugin Mechanisms
We are excited to present the first release candidate of DeePMD-kit v3, an advanced version that enables deep potential models with TensorFlow, PyTorch, or JAX backends. Additionally, DeePMD-kit v3 introduces support for the DPA-2 model, a novel architecture optimized for large atomic models. This release enhances plugin mechanisms, making integrating and developing new models easier.
Highlights
Multiple-backend framework: TensorFlow, PyTorch, and JAX support
DeePMD-kit v3 adds a versatile, pluggable framework providing a consistent training and inference experience across multiple backends. Version 3.0.0 includes:
- TensorFlow backend: Known for its computational efficiency with a static graph design.
- PyTorch backend: A dynamic graph backend that simplifies model extension and development.
- DP backend: Built with NumPy and Array API, a reference backend for development without heavy deep-learning frameworks.
- JAX backend: Based on the DP backend via Array API, a static graph backend.
Features | TensorFlow | PyTorch | JAX | DP |
---|---|---|---|---|
Descriptor local frame | ✅ | | | |
Descriptor se_e2_a | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e2_r | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e3 | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e3_tebd | | ✅ | ✅ | ✅ |
Descriptor DPA1 | ✅ | ✅ | ✅ | ✅ |
Descriptor DPA2 | | ✅ | ✅ | ✅ |
Descriptor Hybrid | ✅ | ✅ | ✅ | ✅ |
Fitting energy | ✅ | ✅ | ✅ | ✅ |
Fitting dipole | ✅ | ✅ | ✅ | ✅ |
Fitting polar | ✅ | ✅ | ✅ | ✅ |
Fitting DOS | ✅ | ✅ | ✅ | ✅ |
Fitting property | | ✅ | ✅ | ✅ |
ZBL | ✅ | ✅ | ✅ | ✅ |
DPLR | ✅ | | | |
DPRc | ✅ | ✅ | ✅ | ✅ |
Spin | ✅ | ✅ | | ✅ |
Gradient calculation | ✅ | ✅ | ✅ | |
Model training | ✅ | ✅ | | |
Model compression | ✅ | ✅ | | |
Python inference | ✅ | ✅ | ✅ | ✅ |
C++ inference | ✅ | ✅ | ✅ | |
Critical features of the multiple-backend framework include the ability to:
- Train models using different backends with the same training data and input script, allowing backend switching based on your efficiency or convenience needs.
# Training a model using the TensorFlow backend
dp --tf train input.json
dp --tf freeze
dp --tf compress
# Training a model using the PyTorch backend
dp --pt train input.json
dp --pt freeze
dp --pt compress
- Convert models between backends using `dp convert-backend`, with backend-specific file extensions (e.g., `.pb` for TensorFlow and `.pth` for PyTorch).
# Convert from a TensorFlow model to a PyTorch model
dp convert-backend frozen_model.pb frozen_model.pth
# Convert from a PyTorch model to a TensorFlow model
dp convert-backend frozen_model.pth frozen_model.pb
# Convert from a PyTorch model to a JAX model
dp convert-backend frozen_model.pth frozen_model.savedmodel
# Convert from a PyTorch model to the backend-independent DP format
dp convert-backend frozen_model.pth frozen_model.dp
- Run inference across backends via interfaces like `dp test`, Python/C++/C interfaces, or third-party packages (e.g., dpdata, ASE, LAMMPS, AMBER, Gromacs, i-PI, CP2K, OpenMM, ABACUS).
# In a LAMMPS file:
# run LAMMPS with a TensorFlow backend model
pair_style deepmd frozen_model.pb
# run LAMMPS with a PyTorch backend model
pair_style deepmd frozen_model.pth
# run LAMMPS with a JAX backend model
pair_style deepmd frozen_model.savedmodel
# Calculate model deviation using different models
pair_style deepmd frozen_model.pb frozen_model.pth frozen_model.savedmodel out_file md.out out_freq 100
- Add a new backend to DeePMD-kit much more quickly if you want to contribute to DeePMD-kit.
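Because each backend is selected by the model file's extension, the mapping used in the examples above can be sketched in a few lines of Python. This is only an illustration and not DeePMD-kit's actual dispatch code: the names `BACKEND_BY_SUFFIX` and `backend_for` are hypothetical, and the table simply restates the extensions that appear in the commands above.

```python
from pathlib import Path

# Extension-to-backend table restating the examples in these notes.
# The dictionary and function names are hypothetical, for illustration only.
BACKEND_BY_SUFFIX = {
    ".pb": "TensorFlow",
    ".pth": "PyTorch",
    ".savedmodel": "JAX",
    ".dp": "DP",
}


def backend_for(model_file: str) -> str:
    """Return the backend name implied by a model file's extension."""
    suffix = Path(model_file).suffix
    try:
        return BACKEND_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"unrecognized model extension: {suffix!r}") from None


if __name__ == "__main__":
    for name in ("frozen_model.pb", "frozen_model.pth", "frozen_model.savedmodel"):
        print(name, "->", backend_for(name))
```

A check like this can be useful in workflow scripts that mix models from several backends, for instance before building a `pair_style deepmd` line for model deviation.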
DPA-2 model: Towards a universal large atomic model for molecular and material simulation
The DPA-2 model offers a robust architecture for large atomic models (LAM), accurately representing diverse chemical systems for high-quality simulations. In this release, DPA-2 is trainable in the PyTorch backend, with an example configuration available in `examples/water/dpa2`. DPA-2 is available for Python inference in the JAX backend.
The DPA-2 descriptor comprises `repinit` and `repformer`, as shown below.
The PyTorch backend supports training strategies for large atomic models, including:
- Parallel training: Train large atomic models on multiple GPUs for efficiency.
torchrun --nproc_per_node=4 --no-python dp --pt train input.json
- Multi-task training: For large atomic models trained across a broad range of data calculated on different DFT levels with shared descriptors. An example is given in `examples/water_multi_task/pytorch_example/input_torch.json`.
- Finetune: Training a pre-trained large atomic model on a smaller, task-specific dataset. The PyTorch backend supports the `--finetune` argument in the `dp --pt train` command line.
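When training is launched from a job script, the commands above can be assembled programmatically. The sketch below only builds the argument list and does not start any process. The function name `build_train_command` is hypothetical, and passing a model path as the value of `--finetune` is an assumption about how the argument is used.

```python
import shlex
from typing import List, Optional


def build_train_command(
    input_json: str,
    nproc_per_node: int = 1,
    finetune: Optional[str] = None,
) -> List[str]:
    """Build the argv for PyTorch-backend training, optionally under torchrun."""
    cmd = ["dp", "--pt", "train", input_json]
    if finetune is not None:
        # Assumption: --finetune takes the path of a pre-trained model.
        cmd += ["--finetune", finetune]
    if nproc_per_node > 1:
        # Same launcher invocation as in the parallel-training example above.
        cmd = ["torchrun", f"--nproc_per_node={nproc_per_node}", "--no-python", *cmd]
    return cmd


if __name__ == "__main__":
    print(shlex.join(build_train_command("input.json", nproc_per_node=4)))
```

Returning a list rather than a single string keeps the result directly usable with `subprocess.run` without shell quoting concerns.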
Plugin mechanisms for external models
In v3.0.0, plugin capabilities allow you to develop models with TensorFlow, PyTorch, or JAX, leveraging DeePMD-kit's trainer, loss functions, and interfaces. A plugin example is deepmd-gnn, which supports training the MACE and NequIP models in the DeePMD-kit with the familiar commands.
dp --pt train mace.json
dp --pt freeze
dp --pt test -m frozen_model.pth -s ../data/
Other new features
- Descriptor se_e3_tebd. (#4066)
- Fitting the property (#3867).
- New training parameters: `max_ckpt_keep` (#3441), `change_bias_after_training` (#3933), and `stat_file`.
- New command line interface: `dp change-bias` (#3933) and `dp show` (#3796).
- Support generating JSON schema for integration with VSCode (#3849).
- The latest LAMMPS version (stable_29Aug2024_update1) is supported. (#4088, #4179)
Breaking changes
- Support for Python 3.7 and 3.8 is dropped. (#3185, #4185)
- We require all model files to have the correct filename extension for all interfaces so a corresponding backend can load them. TensorFlow model files must end with the `.pb` extension.
- Bias is removed by default from type embedding. (#3958)
- The spin model is refactored, and its usage in the LAMMPS module has been changed. (#3301, #4321)
- Multi-task training support is removed from the TensorFlow backend. (#3763)
- The `set_prefix` key is deprecated. (#3753)
- `dp test` now uses all sets for training and test. In previous versions, only the last set was used as the test set in `dp test`. (#3862)
- The Python module structure is fully refactored. The old `deepmd` module was moved to `deepmd.tf` without other API changes, and `deepmd_utils` was moved to `deepmd` without other API changes. (#3177, #3178)
- Python class `DeepTensor` (including `DeepDipole` and `DeepPolar`) now returns the atomic tensor in the dimension of `natoms` instead of `nsel_atoms`. (#3390)
- C++ 11 support is dropped. (#4068)
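For code written against v2, the module move described above mostly amounts to rewriting import statements. The helper below is a rough sketch of that rewrite, not an official migration tool. The name `migrate_import` is hypothetical, it handles only simple `import x` and `from x import y` lines, and it must be applied once only, since a second pass would rename the new `deepmd` again.

```python
import re

# Match the top-level module of an import line. `deepmd_utils` is listed first
# so it is not mistaken for the shorter name `deepmd`.
_PATTERN = re.compile(r"^(\s*(?:from|import)\s+)(deepmd_utils|deepmd)(?=[.\s]|$)")


def migrate_import(line: str) -> str:
    """Rewrite one v2-style import line to the v3 module layout."""

    def _swap(match: re.Match) -> str:
        # Old `deepmd_utils` becomes `deepmd`; old `deepmd` becomes `deepmd.tf`.
        new = "deepmd" if match.group(2) == "deepmd_utils" else "deepmd.tf"
        return match.group(1) + new

    return _PATTERN.sub(_swap, line)


if __name__ == "__main__":
    for old in ("from deepmd.infer import DeepPot", "import deepmd_utils"):
        print(old, "->", migrate_import(old))
```

Both renames happen in a single pass, so a line that originally imported `deepmd_utils` is never rewritten a second time to `deepmd.tf`.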
For other changes, refer to Full Changelog: v2.2.11...v3.0.0rc0
Contributors
The PyTorch backend was developed in the dptech-corp/deepmd-pytorch repository, and then it was fully merged into the deepmd-kit repository in #3180. Contributors to the deepmd-pytorch repository:
- @20171130
- @CaRoLZhangxy
- @amcadmus
- @guolinke
- @iProzd
- @nahso
- @njzjz
- @qin2xue3jian4
- @shishaochen
- @zjgemi
Contributors to the deepmd-kit repository:
- @CaRoLZhangxy: #3162 #3287 #3337 #3375 #3379 #3434 #3436 #3612 #3613 #3614 #3656 #3657 #3740 #3780 #3917 #3919 #4209 #4237
- @Chengqian-Zhang: #3615 #3796 #3828 #3840 #3867 #3912 #4120 #4145 #4280
- @ChiahsinChu: #4246 #4248
- @Cloudac7: #4031
- @HydrogenSulfate: #4117
- @LiuGroupHNU: #3978
- @Mancn-Xu: #3567
- @Yi-FanLi: #3822 #4013 #4084 #4283
- @anyangml: #3192 #3210 #3212 #3248 #3266 #3281 #3296 #3309 #3314 #3321 #3327 #3338 #3351 #3362 #3376 #3385 #3398 #3410 #3426 #3432 #3435 #3447 #3451 #3452 #3468 #3485 #3486 #3575 #3584 #3654 #3662 #3663 #3706 #3757 #3759 #3812 #3824 #3876 #3946 #3975 #4194 #4205 #4292 #4335 #4339
- @caic99: #3465 #4165
- @chazeon: #3473 #3652 #3653 #3739
- @cherryWangY: #3877 #4227 #4297 #4298 #4299 #4300
- @dependabot: #3231 #3312 #3446 #3487 #3777 #3882 #4045 #4127
- @hztttt: #3762
- @iProzd: #3180 #3203 #3245 #3261 #3301 #3355 #3359 #3367 #3371 #3378 #3380 #3387 #3388 #3409 #3411 #3441 #3442 #3445 #3456 #3480 #3569 #3571 #3573 #3607 #3616 #3619 #3696 #3698 #3712 #3717 #3718 #3725 #3746 #3748 #3758 #3763 #3768 #3773 #3774 #3775 #3781 #3782 #3785 #3803 #3813 #3814 #3815 #3826 #3837 #3841 #3842 #3843 #3873 #3906 #3914 #3916 #3925 #3926 #3927 #3933 #3944 #3945 #3957 #3958 #3967 #3971 #3976 #3992 #3993 #4006 #4007 #4015 #4066 #4089 #4138 #4139 #4148 #4162 #4222 #4223 #4224 #4225 #4243 #4244 #4321 #4323 #4324 #4344 #4353 #4354
- @iid-ccme: #4340
- @nahso: #3726 #3727
- @njzjz: #3164 #3167 #3169 #3170 #3171 #3172 #3173 #3174 #3175 #3176 #3177 #3178 #3179 #3181 #3185 #3186 #3187 #3191 #3193 #3194 #3195 #3196 #3198 #3200 #3201 #3204 #3205 #3206 #3207 #3213 #3217 #3220 #3221 #3222 #3223 #3226 #3228 #3229 #3237 #3238 #3239 #3243 #3244 #3247 #3249 #3250 #325...
v3.0.0b4
What's Changed
Breaking changes
- breaking: drop C++ 11 by @njzjz in #4068
- breaking(pt/dp): tune new sub-structures for DPA2 by @iProzd in #4089
  The default values of new options `g1_out_conv` and `g1_out_mlp` are set to `True`. The behaviors in previous versions are `False`.
New features
- feat pt : Support property fitting by @Chengqian-Zhang in #3867
- feat(pt/dp): support three-body type embedding by @iProzd in #4066
- feat: load customized OP library in the C++ interface by @njzjz in #4073
- feat: make `dp neighbor-stat --type-map` optional by @njzjz in #4049
- feat: directional nlist by @wanghan-iapcm in #4052
- feat(pt): support `eval_typeebd` for `DeepEval` by @njzjz in #4110
- feat: `DeepEval.get_model_def_script` and common `dp show` by @njzjz in #4131
by @njzjz in #4131 - chore: support preset bias of atomic model output by @wanghan-iapcm in #4116
- feat(jax): support neural networks in #4156
Enhancement
- fix: bump LAMMPS to stable_29Aug2024 by @njzjz in #4088
- chore(pt): cleanup deadcode by @wanghan-iapcm in #4142
- chore(pt): make comm_dict for dpa2 noncompulsory when nghost is 0 by @njzjz in #4144
- Set ROCM_ROOT to ROCM_PATH when it exist by @sigbjobo in #4150
- chore(pt): move deepmd.pt.infer.deep_eval.eval_model to tests by @njzjz in #4153
Documentation
- docs: improve docs for environment variables by @njzjz in #4070
- docs: dynamically generate command outputs by @njzjz in #4071
- docs: improve error message for inconsistent type maps by @njzjz in #4074
- docs: add multiple packages to `intersphinx_mapping` by @njzjz in #4075
- docs: document CMake variables using Sphinx styles by @njzjz in #4079
- docs: update ipi installation command by @njzjz in #4081
- docs: fix the default value of `DP_ENABLE_PYTORCH` by @njzjz in #4083
- docs: fix defination of `se_e3` by @njzjz in #4113
- docs: update DeepModeling URLs by @njzjz-bot in #4119
- docs(pt): examples for new dpa2 model by @iProzd in #4138
Bugfix
- fix: fix PT AutoBatchSize OOM bug and merge execute_all into base by @njzjz in #4047
- fix: replace `datetime.datetime.utcnow` which is deprecated by @njzjz in #4067
- fix:fix LAMMPS MPI tests with mpi4py 4.0.0 by @njzjz in #4032
- fix(pt): invalid type_map when multitask training by @Cloudac7 in #4031
- fix: manage testing models in a standard way by @njzjz in #4028
- fix(pt): fix ValueError when array byte order is not native by @njzjz in #4100
- fix(pt): convert `torch.__version__` to `str` when serializing by @njzjz in #4106
- fix(tests): fix `skip_dp` by @njzjz in #4111
- [Fix] Wrap log_path with Path by @HydrogenSulfate in #4117
- fix: bugs in uts for property fit by @Chengqian-Zhang in #4120
- fix: type of the preset out bias by @wanghan-iapcm in #4135
- fix(pt): fix zero inputs for LayerNorm by @njzjz in #4134
- fix(pt/dp): share params of repinit_three_body by @iProzd in #4139
- fix(pt): move entry point from deepmd.pt.model to deepmd.pt by @njzjz in #4146
- fix: fix DPH5Path.glob for new keys by @njzjz in #4152
- fix(pt): make state_dict safe for weights_only by @iProzd in #4148
- fix(pt): fix compute_output_stats_global when atomic_output is None by @njzjz in #4155
- fix(pt ut): make separated uts deterministic by @iProzd in #4162
- fix(pt): finetuning property/dipole/polar/dos fitting with multi-dimensional data causes error by @Chengqian-Zhang in #4145
Dependency updates
- chore(deps): bump scikit-build-core to 0.9.x by @njzjz in #4038
- build(deps): bump pypa/cibuildwheel from 2.19 to 2.20 by @dependabot in #4045
- build(deps): bump pypa/cibuildwheel from 2.20 to 2.21 by @dependabot in #4127
CI/CD
- ci: add `include-hidden-files` to `actions/upload-artifact` by @njzjz in #4095
- ci: test Python 3.12 by @njzjz in #4059
- CI(codecov): do not notify until all reports are ready by @njzjz in #4136
Full Changelog: v3.0.0b3...v3.0.0b4
v3.0.0b3
v3.0.0b2
What's Changed
New features
- feat: add documentation and options for multi-task arguments by @njzjz in #3989
- feat: plain text model format by @njzjz in #4025
- feat: allow model arguments to be registered outside by @njzjz in #3995
- feat: add `get_model` classmethod to `BaseModel` by @njzjz in #4002
Enhancement
Documentation
- docs: document `PYTORCH_ROOT` by @njzjz in #3981
- docs: Disallow improper capitalization by @njzjz in #3982
- docs: pin sphinx-argparse to < 0.5.0 by @njzjz-bot in #3988
Bugfixes
- fix(cmake): fix `set_if_higher` by @njzjz in #3977
- fix(pt): ensure suffix of `--init_model` and `--restart` is `.pt` by @njzjz in #3980
- fix(pt): do not overwrite disp_file when restarting training by @njzjz in #3985
- fix(cc): compile `select_map<int>` when TensorFlow backend is off by @njzjz in #3987
- fix(pt): make 'find_' to be float in get data by @iProzd in #3992
- fix float precision problem of se_atten in line 217 (#3961) by @LiuGroupHNU in #3978
- fix: fix errors for zero atom inputs by @njzjz in #4005
- fix(pt): optimize graph memory usage by @iProzd in #4006
- fix(pt): fix lammps nlist sort with large sel by @iProzd in #3993
- fix(cc): add `atomic` argument to `DeepPotBase::computew` by @njzjz in #3996
- fix(lmp): call model deviation interface without atomic properties when they are not requested by @njzjz in #4012
- fix(c): call C++ interface without atomic properties when they are not requested by @njzjz in #4010
- fix(pt): fix `get_dim` for `DescrptDPA1Compat` by @iProzd in #4007
- fix(cc): fix message passing when nloc is 0 by @njzjz in #4021
- fix(pt): use user seed in `DpLoaderSet` by @iProzd in #4015
Code style
- style: require explicit device and dtype by @njzjz in #4001
- style: enable N804 and N805 by @njzjz in #4024
CI/CD
Full Changelog: v3.0.0b1...v3.0.0b2
v3.0.0b1
What's Changed
Breaking Changes
- breaking(pt/tf/dp): disable bias in type embedding by @iProzd in #3958
  This change may make PyTorch checkpoints generated by v3.0.0b0 unusable in v3.0.0b1.
New features
- feat: add plugin entry point for PT by @njzjz in #3965
- feat(tf): improve the activation setting in tebd by @iProzd in #3971
Bugfix
- fix: remove ref-names from .git_archival.txt by @njzjz-bot in #3953
- fix(dp): fix dp seed in dpa2 descriptor by @iProzd in #3957
- fix(pt): add `finetune_head` to argcheck by @iProzd in #3967
- fix(cmake): fix USE_PT_PYTHON_LIBS by @njzjz in #3972
- fix(cmake): set C++ standard according to the PyTorch version by @njzjz in #3973
- Fix: tf dipole atomic key by @anyangml in #3975
- fix(pt/tf/dp): normalize the econf by @iProzd in #3976
CI/CD
Full Changelog: v3.0.0b0...v3.0.0b1
v3.0.0b0
What's Changed
Compared to v3.0.0a0, v3.0.0b0 contains all changes in v2.2.10 and v2.2.11, as well as:
Breaking changes
- breaking: remove multi-task support in tf by @iProzd in #3763
- breaking: deprecate `set_prefix` by @njzjz in #3753
- breaking: use all sets for training and test by @njzjz in #3862. In previous versions, only the last set was used as the test set in `dp test`.
- PyTorch models trained in v3.0.0a0 cannot be used in v3.0.0b0 due to several changes. As mentioned in the release note of v3.0.0a0, we didn't promise backward compatibility for v3.0.0a0.
- The DPA-2 configurations have been changed by @iProzd in #3768. The old format in v3.0.0a0 is no longer supported.
Major new features
- Latest supported features in the PyTorch and DP backend, which are consistent with the TensorFlow backend if possible:
  - Descriptor: `se_e2_a`, `se_e2_r`, `se_e3`, `se_atten`, `se_atten_v2`, `dpa2`, `hybrid`;
  - Fitting: `energy`, `dipole`, `polar`, `dos`, `fparam`/`aparam` support
  - Model: standard, DPRc, `frozen`, ZBL, Spin
  - Python inference interface
  - PyTorch only: C++ inference interface for energy only
  - PyTorch only: TensorBoard
- Support using the DPA-2 model in LAMMPS by @CaRoLZhangxy in #3657. If you install the Python interface from the source, you must set the environment variable `DP_ENABLE_PYTORCH=1` to build the PyTorch customized OPs.
- New command line options `dp show` by @Chengqian-Zhang in #3796 and `dp change-bias` by @iProzd in #3933.
- New training options `max_ckpt_keep` by @iProzd in #3441 and `change_bias_after_training` by @iProzd in #3933. Several training options now take effect in the PyTorch backend, such as `seed` by @iProzd in #3773, `disp_training` and `time_training` by @iProzd in #3775, and `profiling` by @njzjz in #3897.
- Performance improvement of the PyTorch backend by @njzjz in #3422, #3424, #3425 and by @iProzd in #3826
- Support generating JSON schema for integration with VSCode by @njzjz in #3849
Minor enhancements and code refactoring are listed at v3.0.0a0...v3.0.0b0.
Contributors
- @CaRoLZhangxy: #3434, #3436, #3612, #3613, #3614, #3656, #3657, #3740, #3780, #3917, #3919
- @Chengqian-Zhang: #3615, #3796, #3828, #3840, #3912
- @Mancn-Xu: #3567
- @Yi-FanLi: #3822
- @anyangml: #3398, #3410, #3426, #3432, #3435, #3447, #3451, #3452, #3468, #3485, #3486, #3575, #3584, #3654, #3662, #3663, #3706, #3757, #3759, #3812, #3824, #3876
- @caic99: #3465
- @chazeon: #3473, #3652, #3653, #3739
- @cherryWangY: #3877
- @dependabot: #3446, #3487, #3777, #3882
- @hztttt: #3762
- @iProzd: #3301, #3409, #3411, #3441, #3442, #3445, #3456, #3480, #3569, #3571, #3573, #3607, #3616, #3619, #3696, #3698, #3712, #3717, #3718, #3725, #3746, #3748, #3758, #3763, #3768, #3773, #3774, #3775, #3781, #3782, #3785, #3803, #3813, #3814, #3815, #3826, #3837, #3841, #3842, #3843, #3873, #3906, #3914, #3916, #3925, #3926, #3927, #3933, #3944, #3945
- @nahso: #3726, #3727
- @njzjz: #3393, #3402, #3403, #3404, #3405, #3415, #3418, #3419, #3421, #3422, #3423, #3424, #3425, #3431, #3437, #3438, #3443, #3444, #3449, #3450, #3453, #3461, #3462, #3464, #3484, #3519, #3570, #3572, #3574, #3580, #3581, #3583, #3600, #3601, #3605, #3610, #3617, #3618, #3620, #3621, #3624, #3625, #3631, #3632, #3633, #3636, #3651, #3658, #3671, #3676, #3682, #3685, #3686, #3687, #3688, #3694, #3695, #3701, #3709, #3711, #3714, #3715, #3716, #3721, #3737, #3753, #3767, #3776, #3784, #3787, #3792, #3793, #3794, #3798, #3800, #3801, #3810, #3811, #3816, #3820, #3829, #3832, #3834, #3835, #3836, #3838, #3845, #3846, #3849, #3851, #3855, #3856, #3857, #3861, #3862, #3870, #3872, #3874, #3875, #3878, #3880, #3888, #3889, #3890, #3891, #3893, #3894, #3895, #3896, #3897, #3918, #3921, #3922, #3930
- @njzjz-bot: #3669
- @pre-commit-ci: #3454, #3489, #3599, #3634, #3659, #3675, #3700, #3720, #3754, #3779, #3825, #3850, #3863, #3883, #3900, #3938
- @robinzyb: #3647
- @wanghan-iapcm: #3413, #3458, #3469, #3609, #3611, #3626, #3628, #3639, #3642, #3649, #3650, #3755, #3761
- @wangzyphysics: #3597
New Contributors
- @wangzyphysics made their first contribution in #3597
- @robinzyb made their first contribution in #3647
- @Mancn-Xu made their first contribution in #3567
- @njzjz-bot made their first contribution in #3669
- @cherryWangY made their first contribution in #3877
Full Changelog: v3.0.0a0...v3.0.0b0
For discussion of v3, please go to #3401
v2.2.11
What's Changed
New feature
- feat: apply descriptor exclude_types to env mat stat by @njzjz in #3625
- feat(build): Add Git archives version files by @njzjz-bot in #3669
Enhancement
- style: enable W rules by @njzjz in #3793
- build: unpin tensorflow version on windows by @njzjz in #3721
- Add a reminder for the illegal memory error by @Yi-FanLi in #3822
- lmp: improve error message when compute/fix is not found by @njzjz in #3801
Bugfix
- tf: remove freeze warning for optional nodes by @njzjz in #3381
- fix: set rpath for protobuf by @njzjz in #3636
- fix(tf): apply exclude types to se_atten_v2 switch by @njzjz in #3651
- fix: fix git version detection in docker_package_c.sh by @njzjz in #3658
- fix(tf): fix float32 for exclude_types in se_atten_v2 by @njzjz in #3682
- Fix typo in smooth_type_embdding by @iProzd in #3698
- test: set more lossy precision requirements by @nahso in #3726
- fix: fix ipi package by @njzjz in #3835
- fix(tf): prevent fitting_attr variable scope from becoming fitting_attr_1 by @njzjz in #3930
- fix seeds in se_a and se_atten by @njzjz in #3880
Documentation
- docs: update DPA-1 reference by @njzjz in #3810
- docs: setup uv for readthedocs by @njzjz in #3685
- Clarifiy se_atten_v2 compression doc by @nahso in #3727
- docs: add document equations for se_atten_v2 by @Chengqian-Zhang in #3828
CI/CD
- CI: Accerate GitHub Actions using uv by @njzjz in #3676
- ci: bump ase to 3.23.0 by @njzjz in #3846
- ci(build): use uv for cibuildwheel by @njzjz in #3695
- chore(ci): workaround to retry error decoding response body from uv by @njzjz in #3889
Dependency updates
- build(deps): bump tar from 6.1.14 to 6.2.1 in /source/nodejs by @dependabot in #3714
- build(deps): bump pypa/cibuildwheel from 2.17 to 2.18 by @dependabot in #3777
- build(deps): bump docker/build-push-action from 5 to 6 by @dependabot in #3882
Full Changelog: v2.2.10...v2.2.11
v2.2.10
What's Changed
New features
- Add `max_ckpt_keep` for trainer by @iProzd in #3441
- feat: model devi C/C++ API without nlist by @robinzyb in #3647
Enhancement
- Neighbor stat is 80x accelerated by @njzjz in #3275
- support checkpoint path (instead of directory) in dp freeze by @njzjz in #3254
- add fparam/aparam support for finetune by @njzjz in #3313
- chore(build): move static part of dynamic metadata to pyproject.toml by @njzjz in #3618
- test: add LAMMPS MPI tests by @njzjz in #3572
- support Python 3.12 by @njzjz in #3343
Documentation
- docs: rewrite README; deprecate manually written TOC by @njzjz in #3179
- docs: apply type_one_side=True to `se_a` and `se_r` by @njzjz in #3364
- docs: add deprecation notice for the official conda channel and more conda docs by @njzjz in #3462
- docs: Replace quick_start.ipynb with a new version. by @Mancn-Xu in #3567
- issue template: change TF version to backend version by @njzjz in #3244
- chore: remove incorrect memset TODOs by @njzjz in #3600
Bugfix
- c: change the required shape of electric field to nloc * 3 by @njzjz in #3237
- Fix LAMMPS plugin symlink path on macOS platform by @chazeon in #3473
- fix_dplr.cpp delete redundant setup by @shiruosong in #3344
- fix_dplr.cpp set atom->image when pre_force by @shiruosong in #3345
- fix: fix type hint of sel by @njzjz in #3624
- fix: make `se_atten_v2` masking smooth when davg is not zero by @njzjz in #3632
- fix: do not install tf-keras for cu11 by @njzjz in #3444
CI/CD
- detect version in advance before building deepmd-kit-cu11 by @njzjz in #3172
- fix deepmd-kit-cu11 again by @njzjz in #3403
- ban print by @njzjz in #3415
- ci: add linter for markdown, yaml, CSS by @njzjz in #3574
- fix AlmaLinux GPG key error by @njzjz in #3326
- ci: reduce ASLR entropy by @njzjz in #3461
Dependency update
- bump LAMMPS to stable_2Aug2023_update3 by @njzjz in #3399
- build(deps): bump codecov/codecov-action from 3 to 4 by @dependabot in #3231
- build(deps): bump pypa/cibuildwheel from 2.16 to 2.17 by @dependabot in #3487
- pin nvidia-cudnn-cu{11,12} to <9 by @njzjz in #3610
- pin docker actions to major versions by @njzjz in #3238
- build(deps): bump the npm_and_yarn group across 1 directories with 1 update by @dependabot in #3312
- bump scikit-build-core to 0.8 by @njzjz in #3369
- build(deps): bump softprops/action-gh-release from 1 to 2 by @dependabot in #3446
New Contributors
- @shiruosong made their first contribution in #3344
- @robinzyb made their first contribution in #3647
- @Mancn-Xu made their first contribution in #3567
Full Changelog: v2.2.9...v2.2.10
v3.0.0a0
DeePMD-kit v3: A multiple-backend framework for deep potentials
We are excited to announce the first alpha version of DeePMD-kit v3. DeePMD-kit v3 allows you to train and run deep potential models on top of TensorFlow or PyTorch. DeePMD-kit v3 also supports the DPA-2 model, a novel architecture for large atomic models.
Highlights
Multiple-backend framework
DeePMD-kit v3 adds a pluggable multiple-backend framework to provide consistent training and inference experiences between different backends. You can:
- Use the same training data and the input script to train a deep potential model with different backends. Switch backends based on efficiency, functionality, or convenience:
# Training a model using the TensorFlow backend
dp --tf train input.json
dp --tf freeze
# Training a model using the PyTorch backend
dp --pt train input.json
dp --pt freeze
- Use any model to perform inference via any existing interfaces, including `dp test`, Python/C++/C interface, and third-party packages (dpdata, ASE, LAMMPS, AMBER, Gromacs, i-PI, CP2K, OpenMM, ABACUS, etc). Take LAMMPS as an example:
# run LAMMPS with a TensorFlow backend model
pair_style deepmd frozen_model.pb
# run LAMMPS with a PyTorch backend model
pair_style deepmd frozen_model.pth
# Calculate model deviation using both models
pair_style deepmd frozen_model.pb frozen_model.pth out_file md.out out_freq 100
- Convert models between backends, using `dp convert-backend`, if both backends support a model:
dp convert-backend frozen_model.pb frozen_model.pth
dp convert-backend frozen_model.pth frozen_model.pb
- Add a new backend to DeePMD-kit much more quickly if you want to contribute to DeePMD-kit.
PyTorch backend: a backend designed for large atomic models and new research
We added the PyTorch backend in DeePMD-kit v3 to support the development of new models, especially for large atomic models.
DPA-2 model: Towards a universal large atomic model for molecular and material simulation
The DPA-2 model is a novel architecture for Large Atomic Model (LAM) and can accurately represent a diverse range of chemical systems and materials, enabling high-quality simulations and predictions with significantly reduced effort compared to traditional methods. The DPA-2 model is only implemented in the PyTorch backend. An example configuration is in the `examples/water/dpa2` directory.
The DPA-2 descriptor includes two primary components: `repinit` and `repformer`. The detailed architecture is shown in the following figure.
Training strategies for large atomic models
The PyTorch backend supports multiple training strategies to develop large atomic models.
Parallel training: Large atomic models have a number of hyper-parameters and a complex architecture, so training a model on multiple GPUs is necessary. Benefiting from the PyTorch community ecosystem, parallel training for the PyTorch backend can be driven by `torchrun`, a launcher for distributed data parallel.
torchrun --nproc_per_node=4 --no-python dp --pt train input.json
Multi-task training: Large atomic models are trained against data in a wide scope and at different DFT levels, which requires multi-task training. The PyTorch backend supports multi-task training, sharing the descriptor between different tasks. An example is given in `examples/water_multi_task/pytorch_example/input_torch.json`.
Finetune: Fine-tuning is useful for training a pre-trained large model on a smaller, task-specific dataset. The PyTorch backend supports the `--finetune` argument in the `dp --pt train` command line.
Developing new models using Python and dynamic graphs
Researchers may struggle with the static graph and the custom C++ OPs in the TensorFlow backend, which sacrifice research convenience for computational performance. The PyTorch backend has a well-designed code structure built on the dynamic graph and is currently written entirely in Python, which makes extending and debugging new deep potential models easier than with a static graph.
Supporting traditional deep potential models
People may still want to use the traditional models already supported by the TensorFlow backend in the PyTorch backend and compare the same model among different backends. We rewrote almost all of the traditional models in the PyTorch backend, as listed below:
- Features supported:
  - Descriptor: `se_e2_a`, `se_e2_r`, `se_atten`, `hybrid`;
  - Fitting: energy, dipole, polar, fparam/aparam support
  - Model: `standard`, DPRc
  - Python inference interface
  - C++ inference interface for energy only
  - TensorBoard
- Features not supported yet:
  - Descriptor: `se_e3`, `se_atten_v2`, `se_e2_a_mask`
  - Fitting: `dos`
  - Model: `linear_ener`, DPLR, `pairtab`, `frozen`, `pairwise_dprc`, ZBL, Spin
  - Model compression
  - Python inference interface for DPLR
  - C++ inference interface for tensors and DPLR
  - Parallel training using Horovod
- Features not planned:
  - Descriptor: `loc_frame`, `se_e2_a` + type embedding, `se_a_ebd_v2`
  - NVNMD
Warning
As part of an alpha release, the PyTorch backend's API or user input arguments may change before the first stable version.
DP backend and format: reference backend for other backends
DP is a reference backend for development that uses pure NumPy to implement models without using any heavy deep-learning frameworks. It cannot be used for training but only for Python inference. As a reference backend, it is not aimed at the best performance but only the correct results. The DP backend uses HDF5 to store model serialization data, which is backend-independent.
The DP backend and the serialization data are used in the unit test to ensure different backends have consistent results and can be converted between each other.
In the current version, the DP backend has a similar level of feature support to the PyTorch backend, while DPA-1 and DPA-2 are not supported yet.
Authors
The above highlights were mainly contributed by
- Hangrui Bi (@20171130), in #3180
- Chun Cai (@caic99), in #3180
- Junhan Chang (@TablewareBox), in #3180
- Yiming Du (@nahso), in #3180
- Guolin Ke (@guolinke), in #3180
- Xinzijian Liu (@zjgemi), in #3180
- Anyang Peng (@anyangml), in #3362, #3192, #3212, #3210, #3248, #3266, #3281, #3296, #3309, #3314, #3321, #3327, #3338, #3351, #3376, #3385
- Xuejian Qin (@qin2xue3jian4), in #3180
- Han Wang (@wanghan-iapcm), in #3188, #3190, #3208, #3184, #3199, #3202, #3219, #3225, #3232, #3235, #3234, #3241, #3240, #3246, #3260, #3274, #3268, #3279, #3280, #3282, #3295, #3289, #3340, #3352, #3357, #3389, #3391, #3400
- Jinzhe Zeng (@njzjz), in #3171, #3173, #3174, #3179, #3193, #3200, #3204, #3205, #3333, #3360, #3364, #3365, #3169, #3164, #3175, #3176, #3187, #3186, #3191, #3195, #3194, #3196, #3198, #3201, #3207, #3226, #3222, #3220, #3229, #3226, #3239, #3228, #3244, #3243, #3213, #3249, #3250, #3254, #3247, #3253, #3271, #3263, #3258, #3276, #3285, #3286, #3292, #3294, #3293, #3303, #3304, #3308, #3307, #3306, #3316, #3315, #3318, #3323, #3325, #3332, #3331, #3330, #3339, #3335, #3346, #3349, #3350, #3310, #3356, #3361, #3342, #3348, #3358, #3366, #3374, #3370, #3373, #3377, #3382, #3383, #3384, #3386, #3390, #3395, #3394, #3396, #3397
- Chengqian Zhang (@Chengqian-Zhang), in #3180
- Duo Zhang (@iProzd), in #3180, #3203, #3245, #3261, #3262, #3355, #3367, #3359, #3371, #3387, #3388, #3380, #3378
- Xiangyu Zhang (@CaRoLZhangxy), in #3162, #3287, #3337, #3375, #3379
Breaking changes
- Python 3.7 support is dropped. by @njzjz in #3185
- We require all model files to have the correct filename extension for all interfaces so a corresponding backend can load them. TensorFlow model files must end with the `.pb` extension.
- Python class `DeepTensor` (including `DeepDipole` and `DeepPolar`) now returns the atomic tensor in the dimension of `natoms` instead of `nsel_atoms`. by @njzjz in #3390
- For developers: the Python module structure is fully refactored. The old `deepmd` module was moved to `deepmd.tf` without other API changes, and `deepmd_utils` was moved to `deepmd` without other API changes. by @njzjz in #3177, #3178
Other changes
Enhancement
- Neighbor stat for the TensorFlow backend is 80x accelerated. by @njzjz in #3275
- i-PI: remove normalize_coord by @njzjz in #3257
- LAMMPS: fix_dplr.cpp delete redundant setup and set atom->image when pre_force by @shiruosong in #3344, #3345
- Bump scikit-build-core to 0.8 by @njzjz in #3369
- Bump LAMMPS to stable_2Aug2023_update3 by @njzjz in #3399
- Add fparam/aparam support for fine-tune by @njzjz in #3313
- TF: remove freeze warning for optional nodes by @njzjz in #3381
CI/CD
- Build macos-arm64 wheel on M1 runners by @njzjz in #3206
- Other improvements and fixes to GitHub Actions by @njzjz in #3238, #3283, #3284, #3288, #3290, #3326
- Enable docstring code format by @njzjz in #3267
Bugfix
- Fix TF 2.16 compatibility by @njzjz in #3343
- Detect version in advance before building deepmd-kit-cu11 by @njzjz in #3172
- C API: change the required shape of electric field to nloc * 3 by @njzjz in #3237
New Contributors
- @anyangml made their first contribution in #3192
- @shiruosong made their first contribution in #3344
Full Changelog: https://github.com/deepmodeling/de...