D0708 dpa3 default fparam #5340
Conversation
```json
"numb_fparam": 2,
"default_fparam": [0.0, 1.0],
```
* Add fparam/aparam stat
* Add fparam/aparam stat in share_fitting
* Add fparam default value if default_fparam is not None
* Add model_prob in share_fitting_params
* Add protection when share_params of fitting net
## Summary by CodeRabbit

- New Features
  - Added per-atom weighting for force evaluation: computes and reports weighted MAE/RMSE alongside unweighted metrics, includes weighted metrics in system-average summaries, logs weighted force metrics, and safely handles zero-weight cases. Also propagates the per-atom weight field into reporting.
- Tests
  - Added end-to-end tests validating weighted vs unweighted force MAE/RMSE and verifying evaluator outputs when using per-atom weight masks.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Pull request overview
This PR extends DeePMD-kit’s DPA3/PyTorch training and inference stack with support for a default fparam (so fparam.npy can be omitted), adds force-weighted error reporting using atom_pref, and introduces new learning-rate/descriptor options (e.g., wsd, gated MLP/layernorm knobs).
Changes:
- Add `default_fparam` plumbing across fitting nets / atomic models / arg schema, and make `fparam` optional when a default is provided.
- Add `atom_pref` to `dp test` data requirements and compute weighted force MAE/RMSE.
- Add a new LR schedule type (`wsd`) and new DPA3/repflow options (gated MLP, layernorm, extra outputs for fitting).
Reviewed changes
Copilot reviewed 49 out of 49 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| source/tests/pt/test_dp_test.py | Adds a PT test validating force-weighted metrics via atom_pref. |
| deepmd/utils/path.py | Changes HDF5 open behavior by passing locking=False. |
| deepmd/utils/env_mat_stat.py | Adds scalar multiplication to StatItem for weighted stat aggregation. |
| deepmd/utils/argcheck.py | Adds new config args: default_fparam, learning rate wsd, DPA3 options, etc. |
| deepmd/pt/utils/stat.py | Adjusts statistics input packing when fparam is “missing by design”. |
| deepmd/pt/utils/learning_rate.py | Exposes LearningRateWSD in PT utils. |
| deepmd/pt/train/wrapper.py | Extends multitask sharing to pass model probabilities and stat protection. |
| deepmd/pt/train/training.py | Adds LR type switch, default-fparam requirements, and multitask prob logic changes. |
| deepmd/pt/model/task/property.py | Threads default_fparam and bumps serialization version. |
| deepmd/pt/model/task/polarizability.py | Threads default_fparam and bumps serialization version. |
| deepmd/pt/model/task/invar_fitting.py | Extends forward signature to accept optional extra inputs. |
| deepmd/pt/model/task/fitting.py | Adds default-fparam handling, stat caching to files, and prob-weighted sharing. |
| deepmd/pt/model/task/ener.py | Adds ener_readout fitting type with edge readout path. |
| deepmd/pt/model/task/dos.py | Threads default_fparam and bumps serialization version. |
| deepmd/pt/model/task/dipole.py | Threads default_fparam and bumps serialization version. |
| deepmd/pt/model/network/mlp.py | Introduces GatedMLP and a normalization helper. |
| deepmd/pt/model/model/make_model.py | Exposes has_default_fparam() (and getter) on scripted model wrapper. |
| deepmd/pt/model/model/__init__.py | Wires ener_readout, descriptor dim_case_embd, and normalization factors. |
| deepmd/pt/model/descriptor/repflows.py | Adds extra outputs for fitting and norm factor accessors. |
| deepmd/pt/model/descriptor/repflow_layer.py | Adds gated MLP + optional layernorm path for updates. |
| deepmd/pt/model/descriptor/dpa3.py | Adds charge/spin/case embedding options and forwards fparam/case_embd. |
| deepmd/pt/model/descriptor/dpa1.py | Adds get_norm_fact() / get_additional_output_for_fitting() stubs. |
| deepmd/pt/model/atomic_model/dp_atomic_model.py | Uses default fparam when missing; forwards descriptor extra outputs to fitting. |
| deepmd/pt/model/atomic_model/base_atomic_model.py | Adds base has_default_fparam() hook. |
| deepmd/pt/loss/property.py | Adds softmax batch-averaging logic and enforces batch size 3. |
| deepmd/pt/loss/ener.py | Adds force-weight knobs (use_default_pf, f_use_norm, use_l1_all changes). |
| deepmd/pt/infer/deep_eval.py | Adds has_default_fparam() with backward compatibility handling. |
| deepmd/pd/model/task/invar_fitting.py | Bumps deserialize compatibility version. |
| deepmd/pd/model/task/fitting.py | Adds default_fparam to Paddle fitting serialization (not supported). |
| deepmd/pd/model/task/ener.py | Bumps deserialize compatibility version. |
| deepmd/jax/fitting/fitting.py | Allows setting default_fparam_tensor when converting to JAX arrays. |
| deepmd/infer/deep_property.py | Adds frame softmax-averaging logic in inference path. |
| deepmd/infer/deep_eval.py | Adds has_default_fparam() to inference API and forwards in wrapper. |
| deepmd/entrypoints/test.py | Adds atom_pref support and weighted force MAE/RMSE reporting; makes fparam optional with default. |
| deepmd/dpmodel/utils/learning_rate.py | Implements LearningRateWSD. |
| deepmd/dpmodel/model/make_model.py | Adds has_default_fparam() forwarding on dpmodel wrapper. |
| deepmd/dpmodel/infer/deep_eval.py | Adds has_default_fparam() forwarding for dpmodel inference. |
| deepmd/dpmodel/fitting/property_fitting.py | Threads default_fparam, adds output_def, bumps version. |
| deepmd/dpmodel/fitting/polarizability_fitting.py | Threads default_fparam, bumps version. |
| deepmd/dpmodel/fitting/make_base_fitting.py | Adds need_additional_input() default implementation. |
| deepmd/dpmodel/fitting/invar_fitting.py | Threads default_fparam, bumps version. |
| deepmd/dpmodel/fitting/general_fitting.py | Adds default-fparam support + serialization of it; handles fparam None with default. |
| deepmd/dpmodel/fitting/ener_fitting.py | Threads default_fparam, bumps version. |
| deepmd/dpmodel/fitting/dos_fitting.py | Threads default_fparam, bumps version. |
| deepmd/dpmodel/fitting/dipole_fitting.py | Threads default_fparam, bumps version. |
| deepmd/dpmodel/descriptor/make_base_descriptor.py | Adds required descriptor APIs for norm factors / extra outputs. |
| deepmd/dpmodel/descriptor/dpa3.py | Adds new repflow args to dpmodel DPA3 descriptor config. |
| deepmd/dpmodel/atomic_model/dp_atomic_model.py | Adds has_default_fparam() forwarding. |
| deepmd/dpmodel/atomic_model/base_atomic_model.py | Adds base has_default_fparam() hook. |
```python
# --- softmax-weighted averaging over frames (minimal) ---
print(f"Nframes == {nframes}")
if nframes != 3:
    raise RuntimeError(f"Expected nframes == 3, got {nframes}")
```
This debug print(...) and the hard requirement nframes == 3 introduce unexpected stdout noise and will fail for normal inference where nframes can be any value. If frame-aggregation is desired, it should be optional (e.g., a method parameter) and should not enforce a fixed frame count.
```python
scores = property.mean(axis=1)  # (3,)
# If you want to favor *smaller* values (e.g., energies), use: scores = -scores
w = np.exp(scores - scores.max()); w /= w.sum()  # (3,)
avg = (w[:, None] * property).sum(axis=0, keepdims=True)  # (1, D)
property[:] = np.repeat(avg, nframes, axis=0)  # (3, D)
```
The aggregation block uses the local name property before it is assigned (it’s only defined at line 153). This will raise UnboundLocalError at runtime. If you intended to aggregate the reduced property output, compute property = results[...] first (and base scores/w/avg on that), or base the aggregation on atomic_property explicitly.
```python
def find_normalization(name: str, dim: int | None = None) -> nn.Module | None:
    """Return an normalization function using name."""
    if name is None:
        return None
    return {
```
find_normalization uses the PEP 604 union syntax (int | None, nn.Module | None), which is invalid on Python 3.9. This repository supports Python >=3.9 (pyproject.toml), so this will cause a SyntaxError at import time. Use Optional[int] / Optional[nn.Module] (or Union[...]) instead.
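The fix the reviewer asks for can be sketched as follows. This is a minimal stand-alone version: string placeholders stand in for the `nn` modules so the sketch runs without torch; the real fix only changes the annotations to `Optional[...]`.

```python
from typing import Optional


def find_normalization(name: Optional[str], dim: Optional[int] = None) -> Optional[str]:
    """3.9-compatible signature; values are string stand-ins for nn modules."""
    if name is None:
        return None
    return {
        "batch": f"BatchNorm1d({dim})",
        "layer": f"LayerNorm({dim})",
        "none": None,
    }.get(name.lower(), None)


print(find_normalization("layer", 8))  # LayerNorm(8)
```

Note that `from __future__ import annotations` would also let the `int | None` spelling parse on 3.9, but only inside annotations, so `Optional`/`Union` remains the safer choice here.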
```python
def serialize(self) -> dict:
    raise NotImplementedError

@classmethod
def deserialize(cls, data: dict) -> "EnergyFittingNetReadout":
```
EnergyFittingNetReadout is registered as a fitting type (ener_readout), but serialize() / deserialize() currently raise NotImplementedError. This will break model serialization flows (e.g., DPAtomicModel.serialize() stores fitting_net.serialize()). Either implement serialization for this fitting, or don’t register/expose it until serialization is supported.
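A minimal sketch of the round-trip contract being asked for. The state fields (`dim_out`, `neuron`) and the `@class`/`type`/`@version` metadata keys are illustrative, modeled on the serialization style visible elsewhere in this PR, not the actual attributes of the fitting class.

```python
class EnergyFittingNetReadout:
    def __init__(self, dim_out: int, neuron: list) -> None:
        self.dim_out = dim_out
        self.neuron = list(neuron)

    def serialize(self) -> dict:
        # emit metadata plus every constructor argument
        return {
            "@class": "Fitting",
            "type": "ener_readout",
            "@version": 1,
            "dim_out": self.dim_out,
            "neuron": list(self.neuron),
        }

    @classmethod
    def deserialize(cls, data: dict) -> "EnergyFittingNetReadout":
        data = dict(data)
        for meta in ("@class", "type", "@version"):
            data.pop(meta, None)
        return cls(**data)


restored = EnergyFittingNetReadout.deserialize(
    EnergyFittingNetReadout(1, [128, 128]).serialize()
)
```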
```
fparam
    The frame-level parameters. shape: nf x nfparam
case_embd
    The case (dataset) embedding for multitask training with shared fitting. shape: nf x dim_case_embd
```
The docstring says case_embd has shape nf x dim_case_embd, but the implementation later adds case_embd_out.unsqueeze(0).unsqueeze(0) to node_ebd_ext (nf x nall x tebd_dim). This only broadcasts correctly if case_embd is a single 1D embedding (dim_case_embd,), not per-frame. Please either (1) update the docstring/validation to match the supported shape, or (2) update the broadcasting logic to correctly handle per-frame embeddings.
```diff
-    The case (dataset) embedding for multitask training with shared fitting. shape: nf x dim_case_embd
+    The case (dataset) embedding for multitask training with shared fitting. shape: dim_case_embd
```
```python
    )
else:
    l1_pref_force_loss = (torch.abs(diff_f) * atom_pref_reshape).mean()
    more_loss["mae_f"] = self.display_if_exist(
```
In the use_l1_all + atom_pref branch, this assigns more_loss["mae_f"] to the preference-weighted force error, overwriting the unweighted mae_f reported earlier in the force-loss block. This makes logs/metrics ambiguous. Consider reporting this under a distinct key (e.g. mae_pf/mae_fw) and keep mae_f for the unweighted force MAE.
```diff
-    more_loss["mae_f"] = self.display_if_exist(
+    more_loss["mae_pf"] = self.display_if_exist(
```
```diff
 # this method has cache to avoid duplicated
 # loading from different DPH5Path
 # However the file will be never closed?
-return h5py.File(path, mode)
+return h5py.File(path, mode, locking=False)
```
Passing locking=False to h5py.File may not be supported on all h5py versions that this project allows (dependencies includes an unpinned h5py). This can raise TypeError: __init__() got an unexpected keyword argument 'locking' at runtime. Consider guarding with a try/except fallback to h5py.File(path, mode) or pinning a minimum h5py version that supports the argument.
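The suggested try/except fallback can be sketched like this, with a stub standing in for `h5py.File` so the example runs without h5py installed. Note the caveat in the comments: a blanket `except TypeError` also swallows unrelated `TypeError`s from the opener, so production code should check the message or pin a minimum h5py version instead.

```python
def open_with_optional_kwargs(opener, path, mode, **newer_kwargs):
    """Call `opener` with newer keyword args, retrying without them if rejected."""
    try:
        return opener(path, mode, **newer_kwargs)
    except TypeError:
        # older library version that rejects the keyword
        # (e.g. h5py releases that predate the `locking` argument)
        return opener(path, mode)


def legacy_file(path, mode):
    # stands in for an old h5py.File signature without `locking`
    return ("opened", path, mode)


result = open_with_optional_kwargs(legacy_file, "stats.h5", "r", locking=False)
```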
```python
# ---- Softmax-weighted averaging over the batch added by YL----
# model_pred[var_name]: (nbz, task_dim)
# 1) get a scalar score per sample (mean over task_dim)
# (If you want to favor smaller values, use `score_per_sample = -model_pred[var_name].mean(dim=1)`.)
score_per_sample = model_pred[var_name].mean(dim=1)  # (nbz,)
weights = F.softmax(score_per_sample, dim=0)  # (nbz,)
# 2) weighted average vector (1, task_dim)
avg_vec = (weights.unsqueeze(1) * model_pred[var_name]).sum(dim=0, keepdim=True)
# 3) replace all predictions with the averaged vector (broadcast over batch)
model_pred[var_name] = avg_vec.expand_as(model_pred[var_name])
# ----------------------------------------------------

nbz = model_pred[var_name].shape[0]
#=======Raise error when nbz!=3=======
if nbz != 3:
    raise RuntimeError(
        f"[PropertyLoss] Expected batch size nbz == 3 for softmax-avg, got nbz == {nbz}. "
        "Ensure your DataLoader yields triples (batch_size=3, drop_last=True)."
    )

nbz = model_pred[var_name].shape[0]
```
This new softmax-weighted averaging mutates model_pred[var_name] based on the batch itself, changing the training objective and making the loss depend on other samples in the batch. If this behavior is intended, it should be behind an explicit configuration flag and documented; otherwise it should be removed so the loss evaluates predictions per-sample without cross-sample coupling.
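If the behavior is intended, the flag-gated form could look like this NumPy sketch of the torch logic above (`maybe_softmax_average` is a hypothetical helper name; off by default, so the loss stays per-sample):

```python
import numpy as np


def maybe_softmax_average(pred: np.ndarray, enable: bool = False) -> np.ndarray:
    """pred: (nbz, task_dim). When enabled, every row is replaced by the
    softmax-weighted batch average, coupling samples across the batch."""
    if not enable:
        return pred
    scores = pred.mean(axis=1)                             # (nbz,)
    w = np.exp(scores - scores.max())
    w /= w.sum()                                           # softmax over the batch
    avg = (w[:, None] * pred).sum(axis=0, keepdims=True)   # (1, task_dim)
    return np.broadcast_to(avg, pred.shape).copy()
```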
```diff
-# ---- Softmax-weighted averaging over the batch added by YL----
-# model_pred[var_name]: (nbz, task_dim)
-# 1) get a scalar score per sample (mean over task_dim)
-# (If you want to favor smaller values, use `score_per_sample = -model_pred[var_name].mean(dim=1)`.)
-score_per_sample = model_pred[var_name].mean(dim=1)  # (nbz,)
-weights = F.softmax(score_per_sample, dim=0)  # (nbz,)
-# 2) weighted average vector (1, task_dim)
-avg_vec = (weights.unsqueeze(1) * model_pred[var_name]).sum(dim=0, keepdim=True)
-# 3) replace all predictions with the averaged vector (broadcast over batch)
-model_pred[var_name] = avg_vec.expand_as(model_pred[var_name])
-# ----------------------------------------------------
-nbz = model_pred[var_name].shape[0]
-#=======Raise error when nbz!=3=======
-if nbz != 3:
-    raise RuntimeError(
-        f"[PropertyLoss] Expected batch size nbz == 3 for softmax-avg, got nbz == {nbz}. "
-        "Ensure your DataLoader yields triples (batch_size=3, drop_last=True)."
-    )
 nbz = model_pred[var_name].shape[0]
```
```python
#=======Raise error when nbz!=3=======
if nbz != 3:
    raise RuntimeError(
        f"[PropertyLoss] Expected batch size nbz == 3 for softmax-avg, got nbz == {nbz}. "
        "Ensure your DataLoader yields triples (batch_size=3, drop_last=True)."
    )

nbz = model_pred[var_name].shape[0]
```
Hard-coding nbz == 3 and raising a RuntimeError will break standard training/inference setups where batch size is configurable (and may be 1 for validation). If a special "triplet" batching mode is needed, validate it at DataLoader construction time or make it optional; otherwise remove this constraint.
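Validating the triplet requirement once, at batching time, keeps the loss free of batch-size checks. A hypothetical sketch with plain index batches (the real DataLoader wiring in DeePMD-kit differs):

```python
def triplet_batches(n_samples: int, batch_size: int = 3):
    """Yield index triples with drop_last=True semantics.

    The triplet constraint is enforced here, at construction time,
    instead of inside the loss on every forward pass.
    """
    if batch_size != 3:
        raise ValueError(f"triplet mode needs batch_size == 3, got {batch_size}")
    full = n_samples - n_samples % batch_size
    for start in range(0, full, batch_size):
        yield list(range(start, start + batch_size))
```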
```diff
-#=======Raise error when nbz!=3=======
-if nbz != 3:
-    raise RuntimeError(
-        f"[PropertyLoss] Expected batch size nbz == 3 for softmax-avg, got nbz == {nbz}. "
-        "Ensure your DataLoader yields triples (batch_size=3, drop_last=True)."
-    )
 nbz = model_pred[var_name].shape[0]
```
📝 Walkthrough

This PR adds default frame parameter (fparam) support across the model stack, enhances the DPA3 descriptor with charge/spin and case embeddings, introduces a new piecewise learning-rate scheduler, implements gated MLP layers, adds per-atom force-weighting metrics, and updates fitting/descriptor APIs with new configuration options and serialization formats.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
deepmd/pd/model/task/invar_fitting.py (1)
87-134: ⚠️ Potential issue | 🟠 Major — Missing `default_fparam` parameter in Paddle backend `__init__` signature.

The `__init__` method lacks an explicit `default_fparam` parameter, creating an inconsistency with the PyTorch backend (deepmd/pt/model/task/invar_fitting.py line 110), which includes `default_fparam: Optional[list[float]] = None`. While `**kwargs` at line 108 allows the parent class to receive `default_fparam` if passed, the parameter should be explicit in the signature for consistency, clarity, and proper type checking.

Add `default_fparam` to the signature and forward it to the parent class:

Proposed fix

```diff
 def __init__(
     self,
     var_name: str,
     ntypes: int,
     dim_descrpt: int,
     dim_out: int,
     neuron: list[int] = [128, 128, 128],
     bias_atom_e: Optional[paddle.Tensor] = None,
     resnet_dt: bool = True,
     numb_fparam: int = 0,
     numb_aparam: int = 0,
     dim_case_embd: int = 0,
     activation_function: str = "tanh",
     precision: str = DEFAULT_PRECISION,
     mixed_types: bool = True,
     rcond: Optional[float] = None,
     seed: Optional[Union[int, list[int]]] = None,
     exclude_types: list[int] = [],
     atom_ener: Optional[list[Optional[paddle.Tensor]]] = None,
     type_map: Optional[list[str]] = None,
     use_aparam_as_mask: bool = False,
+    default_fparam: Optional[list[float]] = None,
     **kwargs,
 ):
```

Then forward `default_fparam=default_fparam,` in the `super().__init__()` call.
deepmd/entrypoints/test.py (2)
705-712: ⚠️ Potential issue | 🟡 Minor — Missing `find_fparam` check before using fparam data.

In `test_ener`, the fparam handling was updated to check `test_data["find_fparam"] != 0.0` (line 343) before using fparam. However, `test_dos` still unconditionally accesses `test_data["fparam"]` when `get_dim_fparam() > 0`.

When `has_default_fparam()` is `True` and `fparam.npy` is absent, `test_data["fparam"]` contains zeros, which will be passed to the model instead of allowing the model to use its default frame parameters.

Proposed fix

```diff
-    if dp.get_dim_fparam() > 0:
-        fparam = test_data["fparam"][:numb_test]
-    else:
+    if dp.get_dim_fparam() > 0 and test_data.get("find_fparam", 0) != 0.0:
+        fparam = test_data["fparam"][:numb_test]
+    else:
         fparam = None
```
877-884: ⚠️ Potential issue | 🟡 Minor — Missing `find_fparam` check before using fparam data.

Same issue as in `test_dos`: the `find_fparam` check added to `test_ener` is missing here.

Proposed fix

```diff
-    if dp.get_dim_fparam() > 0:
-        fparam = test_data["fparam"][:numb_test]
-    else:
+    if dp.get_dim_fparam() > 0 and test_data.get("find_fparam", 0) != 0.0:
+        fparam = test_data["fparam"][:numb_test]
+    else:
         fparam = None
```
deepmd/pt/model/descriptor/repflow_layer.py (2)
1223-1263: ⚠️ Potential issue | 🟠 Major — Missing serialization of new parameters `update_use_layernorm`, `use_gated_mlp`, and `gated_mlp_norm`.

The `serialize()` method doesn't include the new constructor parameters:

- `update_use_layernorm`
- `use_gated_mlp`
- `gated_mlp_norm`

This will cause deserialization to fail or use incorrect defaults when loading a saved model that was trained with these options enabled.

🔧 Proposed fix to add serialization

```diff
     "use_dynamic_sel": self.use_dynamic_sel,
     "sel_reduce_factor": self.sel_reduce_factor,
+    "update_use_layernorm": self.update_use_layernorm,
+    "use_gated_mlp": self.use_gated_mlp,
+    "gated_mlp_norm": self.gated_mlp_norm,
     "node_self_mlp": self.node_self_mlp.serialize(),
```

Note: The `@version` should also be bumped to indicate the schema change.
1322-1338: ⚠️ Potential issue | 🔴 Critical — Serialization fails and deserialization type dispatch is missing for `GatedMLP`.

When `use_gated_mlp=True`, `node_edge_linear` and `edge_angle_linear1` are `GatedMLP` instances. However:

- `GatedMLP` lacks `serialize()` and `deserialize()` methods, so serialization (lines 1261, 1267) will raise `AttributeError`.
- Deserialization (lines 1324, 1331) unconditionally calls `MLPLayer.deserialize()`, failing to restore the correct module type.

Implement `serialize()` and `deserialize()` methods on `GatedMLP`, then update deserialization to dispatch based on `use_gated_mlp`.
deepmd/pt/model/descriptor/repflows.py (1)
258-269: ⚠️ Potential issue | 🟡 Minor — Move the `sel_reduce_factor` validation before the division.

`dynamic_e_sel`/`dynamic_a_sel` are computed before the `<= 0` guard. With `sel_reduce_factor=0`, this fails with `ZeroDivisionError` instead of the intended `ValueError`.

Suggested fix

```diff
         self.use_exp_switch = use_exp_switch
         self.use_dynamic_sel = use_dynamic_sel
         self.sel_reduce_factor = sel_reduce_factor
+        if self.sel_reduce_factor <= 0:
+            raise ValueError(
+                f"`sel_reduce_factor` must be > 0, got {self.sel_reduce_factor}"
+            )
         self.dynamic_e_sel = self.nnei / self.sel_reduce_factor
         self.dynamic_a_sel = self.a_sel / self.sel_reduce_factor
         if self.use_dynamic_sel and not self.smooth_edge_update:
             raise NotImplementedError(
                 "smooth_edge_update must be True when use_dynamic_sel is True!"
             )
-        if self.sel_reduce_factor <= 0:
-            raise ValueError(
-                f"`sel_reduce_factor` must be > 0, got {self.sel_reduce_factor}"
-            )
```
🧹 Nitpick comments (14)
deepmd/pt/model/task/fitting.py (1)
746-751: Verify assertion behavior when `default_fparam` is not configured.

The assertion at line 750 will fail at runtime if `numb_fparam > 0`, `fparam` is `None`, and `default_fparam_tensor` is also `None`. This enforces that users must either provide `fparam` data or configure `default_fparam` when `numb_fparam > 0`.

This is likely intentional, but the error message from a bare `assert` may not be helpful to users. Consider a more descriptive error:

♻️ Suggested improvement for clearer error message

```diff
         if self.numb_fparam > 0 and fparam is None:
             # use default fparam
-            assert self.default_fparam_tensor is not None
+            if self.default_fparam_tensor is None:
+                raise ValueError(
+                    f"fparam is required but not provided, and no default_fparam is configured. "
+                    f"Either provide fparam data or set default_fparam in the fitting configuration."
+                )
             fparam = torch.tile(self.default_fparam_tensor.unsqueeze(0), [nf, 1])
```
deepmd/pt/model/network/mlp.py (1)
350-358: Silent fallback for unrecognized normalization names.

The `find_normalization` function silently returns `None` for unrecognized normalization names (via `.get(name.lower(), None)`). This could mask typos in configuration (e.g., `"batchh"` instead of `"batch"`).

Consider raising an error or logging a warning for unknown names:

♻️ Suggested improvement

```diff
 def find_normalization(name: str, dim: int | None = None) -> nn.Module | None:
     """Return an normalization function using name."""
     if name is None:
         return None
-    return {
+    norm_map = {
         "batch": nn.BatchNorm1d(dim),
         "layer": nn.LayerNorm(dim),
         "none": None,
-    }.get(name.lower(), None)
+    }
+    key = name.lower()
+    if key not in norm_map:
+        raise ValueError(f"Unknown normalization type: {name}. Choose from {list(norm_map.keys())}")
+    return norm_map[key]
```
deepmd/pt/model/task/property.py (1)
95-95: Inconsistent type annotation for `default_fparam`.

The type annotation `Optional[list]` is less specific than `Optional[list[float]]` used in `InvarFitting` (line 110 in invar_fitting.py). For consistency and type safety, consider using the more specific type.

♻️ Proposed fix

```diff
-        default_fparam: Optional[list] = None,
+        default_fparam: Optional[list[float]] = None,
```
deepmd/pt/model/task/invar_fitting.py (1)
179-195: Unused parameters `sw` and `edge_index` in `forward()` method.

The parameters `sw` and `edge_index` are added to the method signature but are not used in the method body. Based on the AI summary, this is for signature alignment with fitting nets that need these tensors from the descriptor's additional outputs. If this is intentional for interface consistency, consider adding a brief comment or docstring entry explaining why these parameters exist but are unused in this particular implementation.
deepmd/dpmodel/fitting/dos_fitting.py (1)
49-50: Consider using more precise type hint `Optional[list[float]]`.

The type hint `Optional[list]` is less precise than `Optional[list[float]]` used in other fitting classes (e.g., `InvarFitting`, `PolarFitting`, `DipoleFitting`). Consistent typing improves IDE support and type checking.

Proposed fix

```diff
-        default_fparam: Optional[list] = None,
+        default_fparam: Optional[list[float]] = None,
```
deepmd/pt/model/task/dipole.py (1)
100-101: Consider using more precise type hint `Optional[list[float]]`.

For consistency with the dpmodel fitting classes and improved type checking, use `Optional[list[float]]` instead of `Optional[list]`.

Proposed fix

```diff
-        default_fparam: Optional[list] = None,
+        default_fparam: Optional[list[float]] = None,
```
deepmd/dpmodel/fitting/ener_fitting.py (1)
49-50: Consider using the more precise type hint Optional[list[float]].

For consistency with InvarFitting (which uses Optional[list[float]]), consider updating the type hint.

Proposed fix

- default_fparam: Optional[list] = None,
+ default_fparam: Optional[list[float]] = None,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/dpmodel/fitting/ener_fitting.py` around lines 49 - 50, Update the type annotation for the parameter default_fparam to use Optional[list[float]] instead of Optional[list] for precise typing (match InvarFitting); locate the function or constructor in ener_fitting.py that declares default_fparam and change its signature to default_fparam: Optional[list[float]] = None, and update any related type hints or docstrings in the same module mentioning default_fparam to reflect list[float].

deepmd/dpmodel/utils/learning_rate.py (1)
72-73: Missing blank line before method definition.

PEP8 recommends two blank lines before top-level function/class definitions and one blank line before method definitions inside a class.

🔧 Fix: Add blank line

      self.decay_end_rate = (self.decay_mode[0] + self.decay_mode[1]) / sum(
          self.decay_mode
      )
+
      def value(self, step) -> np.float64:

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/dpmodel/utils/learning_rate.py` around lines 72 - 73, Add a single blank line before the method definition "def value(self, step) -> np.float64:" inside its class to comply with PEP8 method-spacing rules; locate the "value" method in deepmd/dpmodel/utils/learning_rate.py and insert one blank line immediately above that def so the method is separated from the preceding code block.

deepmd/pd/model/task/fitting.py (1)
152-165: Missing validation for default_fparam length.

Unlike the dpmodel implementation (deepmd/dpmodel/fitting/general_fitting.py lines 186-190), this code doesn't validate that len(default_fparam) == numb_fparam when both are provided. This could cause issues during cross-backend serialization or inference.

♻️ Proposed fix to add validation

  self.dim_case_embd = dim_case_embd
  self.default_fparam = default_fparam
+ if self.default_fparam is not None and self.numb_fparam > 0:
+     assert len(self.default_fparam) == self.numb_fparam, (
+         "default_fparam length mismatch!"
+     )
  self.activation_function = activation_function

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pd/model/task/fitting.py` around lines 152 - 165, The constructor in deepmd/pd/model/task/fitting.py sets default_fparam but lacks validation: add a check in __init__ to ensure that when default_fparam is not None and numb_fparam is not None/defined, len(default_fparam) == numb_fparam; if not, raise a ValueError with a clear message referencing the mismatch. Mirror the validation logic used in deepmd/dpmodel/fitting/general_fitting.py (lines around the existing check) so cross-backend serialization/inference won't fail due to mismatched default_fparam length.

deepmd/dpmodel/fitting/general_fitting.py (1)
186-193: Validation allows default_fparam when numb_fparam == 0.

The validation at line 187 only checks length match when numb_fparam > 0. If numb_fparam == 0 and default_fparam is provided (non-empty list), the code creates default_fparam_tensor but it will never be used since _call_common only enters the fparam block when numb_fparam > 0.

This is technically harmless but could silently ignore user misconfiguration. Consider adding a warning or assertion.

💡 Optional: Add validation when numb_fparam is 0

  if self.default_fparam is not None:
-     if self.numb_fparam > 0:
-         assert len(self.default_fparam) == self.numb_fparam, (
-             "default_fparam length mismatch!"
-         )
+     assert self.numb_fparam > 0, (
+         "default_fparam provided but numb_fparam is 0"
+     )
+     assert len(self.default_fparam) == self.numb_fparam, (
+         "default_fparam length mismatch!"
+     )
  self.default_fparam_tensor = np.array(self.default_fparam, dtype=self.prec)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/dpmodel/fitting/general_fitting.py` around lines 186 - 193, The code allows a non-None default_fparam when numb_fparam == 0 which will be ignored later; update validation in the constructor (around default_fparam handling) to detect this mismatch and either raise an assertion or emit a warning: if self.numb_fparam == 0 and default_fparam is not None and len(self.default_fparam) > 0 then raise an AssertionError (or log a warning via warnings.warn) indicating that default_fparam will be ignored, referencing the symbols default_fparam, numb_fparam, default_fparam_tensor and _call_common so reviewers can see the relation.

deepmd/utils/argcheck.py (4)
2522-2531: Missing documentation for decay_mode parameter.

The decay_mode argument lacks documentation, and the default value "85:10:5" is not self-explanatory. Users need to understand what this format represents (presumably percentages of training steps for warmup:stable:decay phases).

📝 Proposed fix to add documentation

  def learning_rate_wsd():
      doc_start_lr = "The learning rate at the start of the training."
      doc_stop_lr = "The desired learning rate at the end of the training. "
+     doc_decay_mode = (
+         "The decay mode specification in the format 'warmup:stable:decay', "
+         "where each value represents the percentage of total training steps "
+         "allocated to that phase. For example, '85:10:5' means 85% warmup, "
+         "10% stable, and 5% decay phases."
+     )
      args = [
          Argument("start_lr", float, optional=True, default=1e-3, doc=doc_start_lr),
          Argument("stop_lr", float, optional=True, default=1e-5, doc=doc_stop_lr),
-         Argument("decay_mode", str, optional=True, default="85:10:5"),
+         Argument("decay_mode", str, optional=True, default="85:10:5", doc=doc_decay_mode),
      ]
      return args

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/utils/argcheck.py` around lines 2522 - 2531, The learning_rate_wsd() function's Argument for "decay_mode" lacks a doc string; update the Argument("decay_mode", ...) in learning_rate_wsd to include a clear doc explaining the format and semantics (e.g., a colon-separated percentage triplet "warmup:stable:decay" that sums to 100 and controls the fraction of training steps in each phase, with the default "85:10:5" meaning 85% warmup, 10% stable, 5% decay), and mention that values are interpreted as percentages of total training steps and should be integers summing to 100.
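The percentage-triplet convention described above can be sketched as follows; `parse_decay_mode` is a hypothetical helper written for illustration, and the actual wsd schedule in deepmd may parse or interpret the string differently:

```python
def parse_decay_mode(decay_mode: str, stop_steps: int) -> tuple[int, int, int]:
    """Parse a 'warmup:stable:decay' percentage triplet into step counts.

    Assumes the "85:10:5" semantics suggested in the review: three integer
    percentages of the total training steps that must sum to 100.
    """
    parts = [int(p) for p in decay_mode.split(":")]
    if len(parts) != 3:
        raise ValueError(f"expected three ':'-separated values, got {decay_mode!r}")
    if sum(parts) != 100:
        raise ValueError(f"percentages must sum to 100, got {sum(parts)}")
    warmup, stable, decay = (stop_steps * p // 100 for p in parts)
    return warmup, stable, decay
```

With the default `"85:10:5"` and 1000 training steps this yields 850 warmup, 100 stable, and 50 decay steps.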
1686-1703: Missing documentation for new repflow arguments.

The new arguments update_use_layernorm, use_gated_mlp, and gated_mlp_norm lack the doc parameter, unlike all other arguments in this function. This inconsistency makes it harder for users to understand how to configure these options.

📝 Proposed fix to add documentation

+ doc_update_use_layernorm = "Whether to use layer normalization in the update step."
+ doc_use_gated_mlp = "Whether to use gated MLP layers."
+ doc_gated_mlp_norm = "The normalization type for gated MLP. Supported options are 'none', 'layer', etc."
+
  return [
      # repflow args
      ...
      Argument(
          "update_use_layernorm",
          bool,
          optional=True,
          default=False,
+         doc=doc_update_use_layernorm,
      ),
      Argument(
          "use_gated_mlp",
          bool,
          optional=True,
          default=False,
+         doc=doc_use_gated_mlp,
      ),
      Argument(
          "gated_mlp_norm",
          str,
          optional=True,
          default="none",
+         doc=doc_gated_mlp_norm,
      ),
  ]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/utils/argcheck.py` around lines 1686 - 1703, Add missing documentation for the three new repflow arguments by supplying a descriptive doc= string for each Argument call: update_use_layernorm, use_gated_mlp, and gated_mlp_norm in the Argument block. Follow the pattern used by other arguments in this function (brief one-line description, expected behavior/defaults and valid values for gated_mlp_norm like "none"|"pre"|"post" if applicable) so users can understand when to enable layernorm, what gated MLP toggles, and accepted values for gated_mlp_norm.
2701-2754: Missing documentation for new loss arguments.

The new arguments use_default_pf, f_use_norm, and use_l1_all lack the doc parameter, unlike all other arguments in the loss_ener() function. This makes it unclear what these options control.

📝 Proposed fix to add documentation

+ doc_use_default_pf = "Whether to use default per-atom force weighting."
+ doc_f_use_norm = "Whether to use normalized force loss."
+ doc_use_l1_all = "Whether to use L1 loss for all terms instead of L2."
+
  ...
  Argument(
      "use_default_pf",
      bool,
      optional=True,
      default=False,
+     doc=doc_use_default_pf,
  ),
  ...
  Argument(
      "f_use_norm",
      bool,
      optional=True,
      default=False,
+     doc=doc_f_use_norm,
  ),
  Argument(
      "use_l1_all",
      bool,
      optional=True,
      default=False,
+     doc=doc_use_l1_all,
  ),

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/utils/argcheck.py` around lines 2701 - 2754, The new loss arguments use_default_pf, f_use_norm, and use_l1_all in the loss_ener() argument list are missing doc strings; add descriptive doc parameters for each following the existing pattern (e.g., define doc_use_default_pf, doc_f_use_norm, doc_use_l1_all near other doc_* variables and pass them into the corresponding Argument(...) calls) so each Argument("use_default_pf", ...), Argument("f_use_norm", ...), and Argument("use_l1_all", ...) includes a doc=... parameter that clearly explains the option's behavior and defaults.
2958-2970: Missing documentation for add_edge_readout and slim_edge_readout arguments.

These new arguments lack the doc parameter, unlike all other arguments in this function. Users won't understand what these options control.

📝 Proposed fix to add documentation

+ doc_add_edge_readout = "Whether to add edge readout in the energy fitting."
+ doc_slim_edge_readout = "Whether to use a slim (lighter) edge readout network."
+
  ...
  Argument(
      "add_edge_readout",
      bool,
      optional=True,
      default=True,
+     doc=doc_add_edge_readout,
  ),
  Argument(
      "slim_edge_readout",
      bool,
      optional=True,
      default=False,
+     doc=doc_slim_edge_readout,
  ),

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/utils/argcheck.py` around lines 2958 - 2970, The two Argument entries for "add_edge_readout" and "slim_edge_readout" are missing their doc strings; update the Argument(...) calls for add_edge_readout and slim_edge_readout to include a doc= parameter that briefly explains what each flag controls (e.g., whether to include an edge-level readout in the model and whether to use a slim/compact edge readout), and mention their default behavior (True for add_edge_readout, False for slim_edge_readout) to match other arguments' style and clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/dpmodel/descriptor/dpa3.py`:
- Around line 180-182: RepFlowArgs now has three new fields
(update_use_layernorm, use_gated_mlp, gated_mlp_norm) but they are not included
in RepFlowArgs.serialize() nor forwarded to DescrptBlockRepflows; update
RepFlowArgs.serialize() to include these three keys so they persist, and then
modify the call site where a RepFlowArgs instance is passed to construct
DescrptBlockRepflows to forward these values (or update
DescrptBlockRepflows.__init__ to accept and store them), ensuring the
constructor/attribute names match (update_use_layernorm, use_gated_mlp,
gated_mlp_norm) so saved models round-trip correctly.
In `@deepmd/dpmodel/utils/learning_rate.py`:
- Around line 80-86: Detect the zero-duration decay case before computing
decay_rate: if self.decay_end_rate == self.decay_start_rate (i.e., decay phase
length is zero) avoid the division and return an instantaneous value—return
self.start_lr when step <= self.decay_start_rate * self.stop_steps else return
self.stop_lr; update the block that computes decay_rate and the return to
short-circuit on this equality to prevent division by zero in decay_rate and
ensure correct boundary behavior for step.
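The suggested guard can be sketched as a stand-alone function. Here `wsd_value` and its parameter names are illustrative, not the actual attributes of the schedule class in `learning_rate.py`; the decay segment is assumed to interpolate exponentially between `start_lr` and `stop_lr`:

```python
import math


def wsd_value(step, start_lr, stop_lr, stop_steps, decay_start, decay_end):
    """Hypothetical wsd-style lr lookup with a zero-length-decay guard.

    decay_start / decay_end are fractions of stop_steps. When they coincide,
    the decay phase has zero duration, so we switch instantaneously at the
    boundary instead of dividing by a zero-length interval.
    """
    if decay_end == decay_start:
        # Zero-duration decay: step function at the boundary.
        return start_lr if step <= decay_start * stop_steps else stop_lr
    begin = decay_start * stop_steps
    end = decay_end * stop_steps
    if step <= begin:
        return start_lr
    if step >= end:
        return stop_lr
    frac = (step - begin) / (end - begin)  # safe: end > begin here
    # Exponential interpolation from start_lr down to stop_lr.
    return start_lr * math.exp(frac * math.log(stop_lr / start_lr))
```

The early return makes the division unreachable whenever the decay window is empty, which is the failure mode flagged above.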
In `@deepmd/infer/deep_property.py`:
- Around line 142-150: The softmax-weighted averaging block in
DeepProperty.eval() wrongly assumes nframes==3, references the local name
property before it's assigned, and uses multiple statements per line; remove or
refactor this block so property is computed exactly once and then weighted:
compute the per-frame values into a new local (e.g., prop_vals) before any
conditional, replace the nframes check with a guard that handles arbitrary frame
counts (or early-return if unsupported), perform the softmax weighting using
separate statements (no nondescript inline semicolons) and assign the final
weighted result back to property once (avoiding overwriting earlier results),
and then run ruff check/format to remove F823/E702 violations; look for
DeepProperty.eval, the variables nframes, scores, w, avg and property in the
shown diff to locate the code to change.
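The softmax weighting over frames can be sketched in plain Python, using lists in place of the tensors handled by `DeepProperty.eval()`; `softmax_weighted_average` is a hypothetical name introduced for illustration:

```python
import math


def softmax_weighted_average(prop_vals, scores):
    """Weight per-frame property values by a softmax over per-frame scores.

    Works for an arbitrary number of frames, which is the generalization the
    review asks for (the original code assumed exactly three).
    """
    if len(prop_vals) != len(scores):
        raise ValueError("one score per frame is required")
    m = max(scores)
    exp_s = [math.exp(s - m) for s in scores]  # subtract max for stability
    total = sum(exp_s)
    weights = [e / total for e in exp_s]
    # Compute the per-frame values once, then apply the weights in a
    # separate, explicit step (no inline semicolons, no reassignment races).
    return sum(w * v for w, v in zip(weights, prop_vals))
```

With equal scores this reduces to the plain mean; a dominant score pulls the result toward that frame's value.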
In `@deepmd/pd/model/task/fitting.py`:
- Around line 98-102: The docstring incorrectly says default_fparam is "not
supported" while the class stores and serializes it; fix by making runtime
behavior consistent: update _forward_common to detect missing/None fparam and
substitute self.default_fparam (or a tensor/array converted from it) before
further processing, ensure the instance attribute default_fparam remains used
for serialization, and remove or update the "not supported" phrase in the
docstring of the class so it reflects actual runtime support; reference symbols:
default_fparam and _forward_common.
In `@deepmd/pt/loss/ener.py`:
- Around line 56-61: The new boolean flags use_default_pf and f_use_norm (and
any other loss-semantics flags like use_huber/huber_delta if relevant) must be
added to the object's persisted state so they survive serialize()/deserialize()
round-trips: update the class constructor parameters (the signature that
includes inference, use_huber, use_default_pf, f_use_norm, huber_delta) and the
serialize() method to include these flags in the returned state, and update
deserialize()/from_state() to read and restore them into the instance; ensure
the same keys are used in both serialize and deserialize so restored instances
preserve loss behavior (also mirror this change where similar state handling
occurs around the other block mentioned at lines ~137-153).
- Around line 306-320: The displayed MAE (more_loss["mae_f"]) is computed from
the L1 loss before you switch to the vector-norm objective (l1_force_loss) when
f_use_norm=True, and the atom-pref branch reuses the same key causing the
weighted value to overwrite the unweighted one; update the code so that when
self.f_use_norm is True you compute the vector-norm MAE first and pass that
value to self.display_if_exist (use the vector-norm result for
more_loss["mae_f"] rather than the original L1), and in the atom-preference
branch (has_pf handling around the same pattern at the 359-370 region) write the
weighted metric to a separate key (e.g. "mae_f_w" or similar) instead of reusing
"mae_f" so the unweighted and weighted values do not clobber each other; locate
the display/update calls around l1_force_loss, force_label/force_pred,
display_if_exist, self.f_use_norm and has_pf to apply these changes.
In `@deepmd/pt/loss/property.py`:
- Around line 95-104: The code in PropertyLoss.forward() currently computes
score_per_sample, weights, avg_vec and then overwrites model_pred[var_name] with
avg_vec.expand_as(...), which couples every sample in the batch and forces the
nbz==3 guard to be required; to fix, stop mutating model_pred[var_name] — either
remove the softmax-weighted replacement entirely so per-sample losses use the
original model_pred[var_name], or move the triplet/aggregated logic into a
dedicated loss path or an explicit flag/mode (e.g., a new aggregated_loss
branch) that computes the aggregated prediction (avg_vec) only for that mode and
computes loss against targets without expanding back into model_pred; also
remove or relocate the nbz==3 special-case (lines referencing the guard) so it
is only used by the dedicated aggregation path, not the generic
PropertyLoss.forward().
In `@deepmd/pt/model/atomic_model/dp_atomic_model.py`:
- Around line 239-261: The descriptor forward signatures are missing
fparam/case_embd and will raise TypeError when called from dp_atomic_model.py;
update each descriptor's forward method (DescrptDPA1.forward,
DescrptDPA2.forward, DescrptHybrid.forward, DescrptSeA.forward,
DescrptSeR.forward, DescrptSeT.forward, DescrptSeTTebd.forward) to accept the
new keyword arguments (either add explicit parameters fparam=None,
case_embd=None to the signature or include **kwargs) and ensure the
implementations ignore or pass them through as needed so the call
descriptor(..., fparam=fparam_input_for_des, case_embd=case_embd_input_for_des)
no longer raises.
In `@deepmd/pt/model/descriptor/dpa3.py`:
- Around line 204-218: The charge/spin embedding tables defined in the
add_chg_spin_ebd branch (chg_embedding and spin_embedding using TypeEmbedNet)
are too small for the documented ranges because indices are shifted (charge
+100, spin +0) so max values (charge=100 -> index 200, spin=100 -> index 100)
are out of range; fix by increasing the embedding sizes by one (use
TypeEmbedNet(201, ...) for chg_embedding and TypeEmbedNet(101, ...) for
spin_embedding) or adjust the indexing/offset logic so the max values map inside
bounds; apply the same fix to the other occurrence of these embeddings later in
the file (the repeated chg/spin embedding initialization).
- Around line 123-125: The new optional descriptor flags and modules
(add_chg_spin_ebd, add_case_embd, dim_case_embd and the trainable modules
chg_embedding, spin_embedding, mix_cs_mlp, case_embd_mlp) are not included in
the file's export/import (round-trip) logic; update the serialization and
deserialization code paths to persist and restore these fields: write
add_chg_spin_ebd, add_case_embd, dim_case_embd into the serialized config/state,
and when serializing module weights include the state_dicts for chg_embedding,
spin_embedding, mix_cs_mlp, and case_embd_mlp only when those flags are true;
conversely, in the loader/constructor detect those flags, instantiate the
corresponding modules (with correct dims) and load their saved state_dicts so
models enabling these options keep both config and trained parameters on
export-reload.
In `@deepmd/pt/model/network/mlp.py`:
- Around line 309-322: mlp_core and mlp_gate are created with the same seed,
causing identical weight initialization; change the MLPLayer calls to use
distinct seeds by deriving child seeds from the provided seed (e.g., call
child_seed(seed) twice to produce seed_core and seed_gate) and pass seed_core to
MLPLayer for mlp_core and seed_gate to MLPLayer for mlp_gate so the two paths
get independent RNG states and different initial weights.
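One way to derive distinct, deterministic child seeds from a single base seed; this `child_seed` is an illustrative stand-in written for the example, not deepmd's actual helper:

```python
import hashlib


def child_seed(seed, index):
    """Derive a deterministic child seed from (base seed, child index).

    Returns None when the base seed is None, so "unseeded" stays unseeded.
    Hashing the pair guarantees siblings get different values.
    """
    if seed is None:
        return None
    digest = hashlib.sha256(f"{seed}:{index}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


# Two sibling layers get independent initialization streams:
seed_core = child_seed(1234, 0)
seed_gate = child_seed(1234, 1)
```

Passing `seed_core` and `seed_gate` to the two `MLPLayer` constructions ensures `mlp_core` and `mlp_gate` start from different weights while remaining reproducible.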
In `@deepmd/pt/model/task/ener.py`:
- Around line 275-277: Remove the public registration for the non-serializable
fitting type so DPAtomicModel.serialize won't call into an unimplemented
serialize path: delete or comment out the `@Fitting.register`("ener_readout")
decorator on the EnergyFittingNetReadout class (and the analogous registration
for the related energy-readout fitting declared later), or replace the decorator
with a guarded registration that only runs once the class implements
serialize/deserialize; ensure references to DPAtomicModel.serialize and
self.fitting_net.serialize will only see registered fitting classes that support
round-trip serialization.
- Around line 328-350: The edge readout builds one FittingNet per type in the
constructor (edge_embed.networks) when mixed_types is False but forward() always
calls self.edge_embed.networks[0](edge_feature), collapsing all types to the
first network; change forward() to select the network per-edge using the type
index (the per-edge type variable used in forward, e.g., t or edge_type) and
call self.edge_embed.networks[type_idx](edge_feature) so each type uses its
corresponding FittingNet; apply the same fix for the other occurrences noted
(around the lines referenced, e.g., the second invocation at 404-406) and ensure
you still handle the mixed_types branch (which should keep using index 0).
- Around line 282-297: Change the mutable list default arguments in
EnergyFittingNetReadout.__init__: make the parameters neuron and norm_fact
default to None (not mutable lists) and inside __init__ set neuron = [128, 128,
128] if neuron is None else neuron and norm_fact = [120.0] if norm_fact is None
else norm_fact so each instance gets its own list; update the signature (neuron:
Optional[list[int]] = None, norm_fact: Optional[list[float]] = None) and any
type hints/usages referencing these parameters accordingly.
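The `None`-sentinel pattern recommended above, in a minimal sketch with the class body reduced to the two parameters in question:

```python
from typing import Optional


class EnergyFittingNetReadoutSketch:
    """Minimal sketch of the constructor fix: None sentinels instead of
    mutable list defaults, so instances never share list state."""

    def __init__(
        self,
        neuron: Optional[list[int]] = None,
        norm_fact: Optional[list[float]] = None,
    ) -> None:
        # Each instance gets its own fresh list.
        self.neuron = [128, 128, 128] if neuron is None else neuron
        self.norm_fact = [120.0] if norm_fact is None else norm_fact


a = EnergyFittingNetReadoutSketch()
b = EnergyFittingNetReadoutSketch()
a.neuron.append(64)  # mutating one instance must not affect the other
```

With a literal `neuron=[128, 128, 128]` default, the appended `64` would leak into `b` as well, because Python evaluates default arguments once at function definition time.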
- Around line 333-348: The seed arithmetic fails when self.seed is None or a
list; fix by normalizing self.seed before calling child_seed: compute a local
base_seed = self.seed (if list, pick base_seed = self.seed[ii]); if
isinstance(base_seed, int) add 100 to it, otherwise leave None unchanged; then
call child_seed(base_seed, ii) when constructing edge_embed (NetworkCollection /
FittingNet) so child_seed only ever receives an int or None, not a list or an
expression that adds to None.
In `@deepmd/pt/train/training.py`:
- Around line 1165-1168: The conditional that checks find_fparam currently
assumes batch_data["find_fparam"] exists when "fparam" is present; update the
guard in the loop over input_keys (the block using item_key, input_keys,
batch_data, input_dict) to avoid KeyError by checking for the key first or using
a safe lookup (e.g., batch_data.get("find_fparam", 0.0)) so the condition
becomes: if item_key != "fparam" or batch_data.get("find_fparam", 0.0) != 0.0,
then assign input_dict[item_key] = batch_data[item_key].
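The safe-lookup guard can be sketched with plain dicts; `select_inputs` is a hypothetical wrapper around the loop described above, not the actual training-loop code:

```python
def select_inputs(batch_data, input_keys):
    """Forward batch entries to the model input dict.

    "fparam" is forwarded only when "find_fparam" is present and nonzero;
    batch_data.get() avoids a KeyError when the flag is absent.
    """
    input_dict = {}
    for item_key in input_keys:
        if item_key != "fparam" or batch_data.get("find_fparam", 0.0) != 0.0:
            input_dict[item_key] = batch_data[item_key]
    return input_dict
```

A batch without `find_fparam` simply drops `fparam` (letting a configured `default_fparam` apply downstream) instead of raising.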
In `@deepmd/pt/train/wrapper.py`:
- Around line 139-149: The computation of frac_prob may divide by zero when
model_key_prob_map[model_key_base] is zero; before computing frac_prob, check
the denominator (model_key_prob_map[model_key_base]) and handle the zero case
defensively (e.g., set frac_prob to 0 or a small epsilon, or log/raise a clear
exception) and then call link_class.share_params(base_class, shared_level_link,
model_prob=frac_prob, protection=data_stat_protect, resume=resume); ensure you
reference the same symbols model_key_prob_map, model_key_link, model_key_base,
frac_prob and link_class.share_params when making the change.
In `@deepmd/utils/argcheck.py`:
- Around line 2877-2907: The function fitting_ener_readout() is missing the
default_fparam argument; add an Argument("default_fparam", type=appropriate,
optional=True, default=None, doc=...) alongside numb_fparam/numb_aparam in
fitting_ener_readout() so its signature matches other fitting_* functions, and
then update the EnergyFittingNetReadout.__init__() to accept default_fparam and
pass it through to the parent InvarFitting (or to super().__init__(...,
default_fparam=default_fparam)) so the readout class forwards the value to the
base class.
---
Outside diff comments:
In `@deepmd/entrypoints/test.py`:
- Around line 705-712: In test_dos, avoid unconditionally reading
test_data["fparam"] when dp.get_dim_fparam() > 0; instead mirror test_ener's
logic and first check test_data["find_fparam"] != 0.0 (and/or
dp.has_default_fparam()) before slicing and passing fparam into the model.
Update the fparam assignment in test_dos so that fparam is set to None when
find_fparam is zero (or when the dp indicates default fparam should be used),
otherwise slice test_data["fparam"][:numb_test]; reference variables/functions:
test_dos, fparam, test_data["find_fparam"], dp.get_dim_fparam(),
dp.has_default_fparam().
- Around line 877-884: The snippet uses test_data["fparam"] without checking
whether fparam exists for this dataset; add the same guard used in other tests
by checking dp.find_fparam() (in addition to dp.get_dim_fparam() > 0) before
assigning fparam so that fparam = test_data["fparam"][:numb_test] only when
dp.find_fparam() is true, otherwise set fparam = None; keep the existing aparam
logic unchanged and update any downstream uses accordingly (refer to
dp.get_dim_fparam, dp.find_fparam, fparam, aparam, and test_data to locate the
code).
In `@deepmd/pd/model/task/invar_fitting.py`:
- Around line 87-134: Add an explicit default_fparam parameter to the Paddle
backend __init__ signature (e.g., default_fparam: Optional[list[float]] = None)
in deepmd/pd/model/task/invar_fitting.py and forward it into the parent
constructor call by adding default_fparam=default_fparam to the
super().__init__(...) invocation; this makes the signature consistent with the
PyTorch backend (the __init__ method and its super().__init__ call are the
unique identifiers to update).
In `@deepmd/pt/model/descriptor/repflow_layer.py`:
- Around line 1223-1263: The serialize() method of RepFlowLayer is missing the
new fields update_use_layernorm, use_gated_mlp, and gated_mlp_norm; update
serialize() to include these keys (e.g. "update_use_layernorm":
self.update_use_layernorm, "use_gated_mlp": self.use_gated_mlp,
"gated_mlp_norm": self.gated_mlp_norm) so deserialization preserves their
values, and bump the "@version" (e.g. increment to 3) to reflect the schema
change; ensure these attribute names match the constructor/instance attributes
used elsewhere in the class.
- Around line 1322-1338: Add proper serialization support and type-dispatch for
GatedMLP: implement serialize() and classmethod deserialize() on the GatedMLP
class so it returns a dict with a unique type tag and its params, and update the
deserialization sites that currently call MLPLayer.deserialize() (specifically
restoring node_edge_linear and edge_angle_linear1/edge_angle_linear2 and any
a_compress_* when use_gated_mlp=True) to inspect the saved type tag (or the flag
use_gated_mlp) and call GatedMLP.deserialize(...) when the tag/flag indicates a
gated MLP, otherwise call MLPLayer.deserialize(...); ensure the produced dict
format from GatedMLP.serialize matches what GatedMLP.deserialize expects so
round-trip works without raising AttributeError.
In `@deepmd/pt/model/descriptor/repflows.py`:
- Around line 258-269: Move the check that validates sel_reduce_factor (>0) to
before any use of it so dynamic_e_sel and dynamic_a_sel aren't computed when
sel_reduce_factor is zero; specifically, in the initializer validate
self.sel_reduce_factor and raise the ValueError if <=0 before computing
self.dynamic_e_sel = self.nnei / self.sel_reduce_factor and self.dynamic_a_sel =
self.a_sel / self.sel_reduce_factor, while keeping the existing check that if
self.use_dynamic_sel and not self.smooth_edge_update raise NotImplementedError.
---
Nitpick comments:
In `@deepmd/dpmodel/fitting/dos_fitting.py`:
- Around line 49-50: Update the type annotation for the parameter default_fparam
from Optional[list] to Optional[list[float]] in the function signature (the
constructor or initializer in dos_fitting.py that declares default_fparam) so it
matches other fitting classes (InvarFitting, PolarFitting, DipoleFitting); no
runtime changes required—just change the type hint to Optional[list[float]] to
improve precision and IDE/type-checker support.
In `@deepmd/dpmodel/fitting/ener_fitting.py`:
- Around line 49-50: Update the type annotation for the parameter default_fparam
to use Optional[list[float]] instead of Optional[list] for precise typing (match
InvarFitting); locate the function or constructor in ener_fitting.py that
declares default_fparam and change its signature to default_fparam:
Optional[list[float]] = None, and update any related type hints or docstrings in
the same module mentioning default_fparam to reflect list[float].
In `@deepmd/dpmodel/fitting/general_fitting.py`:
- Around line 186-193: The code allows a non-None default_fparam when
numb_fparam == 0 which will be ignored later; update validation in the
constructor (around default_fparam handling) to detect this mismatch and either
raise an assertion or emit a warning: if self.numb_fparam == 0 and
default_fparam is not None and len(self.default_fparam) > 0 then raise an
AssertionError (or log a warning via warnings.warn) indicating that
default_fparam will be ignored, referencing the symbols default_fparam,
numb_fparam, default_fparam_tensor and _call_common so reviewers can see the
relation.
In `@deepmd/dpmodel/utils/learning_rate.py`:
- Around line 72-73: Add a single blank line before the method definition "def
value(self, step) -> np.float64:" inside its class to comply with PEP8
method-spacing rules; locate the "value" method in
deepmd/dpmodel/utils/learning_rate.py and insert one blank line immediately
above that def so the method is separated from the preceding code block.
In `@deepmd/pd/model/task/fitting.py`:
- Around line 152-165: The constructor in deepmd/pd/model/task/fitting.py sets
default_fparam but lacks validation: add a check in __init__ to ensure that when
default_fparam is not None and numb_fparam is not None/defined,
len(default_fparam) == numb_fparam; if not, raise a ValueError with a clear
message referencing the mismatch. Mirror the validation logic used in
deepmd/dpmodel/fitting/general_fitting.py (lines around the existing check) so
cross-backend serialization/inference won't fail due to mismatched
default_fparam length.
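A sketch of the length check raising a `ValueError` as suggested; `validate_default_fparam` is an illustrative free function standing in for the `__init__` logic:

```python
def validate_default_fparam(default_fparam, numb_fparam):
    """Reject a default_fparam whose length does not match numb_fparam.

    A None default is always fine; otherwise the lengths must agree, so the
    mismatch is caught at construction time rather than during cross-backend
    serialization or inference.
    """
    if default_fparam is None:
        return
    if numb_fparam is None or len(default_fparam) != numb_fparam:
        raise ValueError(
            f"default_fparam has length {len(default_fparam)}, "
            f"but numb_fparam is {numb_fparam}"
        )
```

For the configuration from this PR ("numb_fparam": 2, "default_fparam": [0.0, 1.0]) the check passes silently.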
In `@deepmd/pt/model/network/mlp.py`:
- Around line 350-358: The function find_normalization currently silently
returns None for unrecognized names; change it to validate the lowercase name
against the allowed keys ("batch", "layer", "none") and raise a descriptive
ValueError if the name is unknown (keeping the existing behavior of returning
None when name is None or "none"); reference the find_normalization function and
the mapping for "batch" -> nn.BatchNorm1d(dim) and "layer" -> nn.LayerNorm(dim)
when implementing this check and error message.
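A sketch of the stricter lookup; the torch modules (`nn.BatchNorm1d`, `nn.LayerNorm`) are replaced with placeholder factories so the example stays self-contained:

```python
def find_normalization(name, dim):
    """Map a normalization name to a constructed layer (placeholder tuples
    here instead of torch modules).

    None and "none" still yield None; any other unrecognized name now raises
    a descriptive ValueError instead of silently returning None.
    """
    if name is None:
        return None
    key = name.lower()
    factories = {
        "batch": lambda d: ("BatchNorm1d", d),  # stand-in for nn.BatchNorm1d(dim)
        "layer": lambda d: ("LayerNorm", d),    # stand-in for nn.LayerNorm(dim)
        "none": lambda d: None,
    }
    if key not in factories:
        raise ValueError(
            f"unknown normalization {name!r}; expected one of {sorted(factories)}"
        )
    return factories[key](dim)
```

A typo such as `"layre"` then fails loudly at model construction instead of silently disabling normalization.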
In `@deepmd/pt/model/task/dipole.py`:
- Around line 100-101: Change the broad type hint Optional[list] for the
default_fparam parameter to the more precise Optional[list[float]] in the dipole
task signature (the parameter named default_fparam in the function/constructor
in deepmd/pt/model/task/dipole.py); update any other occurrences of
default_fparam type annotations in that file to match and ensure typing imports
still provide Optional (no code logic changes needed).
In `@deepmd/pt/model/task/fitting.py`:
- Around line 746-751: The current bare assert in the block handling
self.numb_fparam > 0 (when fparam is None) can produce an unhelpful
AssertionError if self.default_fparam_tensor is also None; replace the assert
with an explicit check that raises a clear ValueError indicating that
default_fparam must be configured or fparam provided (reference symbols:
self.numb_fparam, fparam, self.default_fparam_tensor), then proceed to create
fparam by tiling self.default_fparam_tensor via unsqueeze and torch.tile as
before.
In `@deepmd/pt/model/task/invar_fitting.py`:
- Around line 179-195: The forward method currently accepts sw and edge_index
but never uses them; update the method to either consume or explicitly
acknowledge these parameters for interface compatibility by adding a short
inline comment or docstring note in forward() stating that sw and edge_index are
intentionally unused (kept for signature alignment with other fitting nets), and
ensure the implementation still calls self._forward_common(descriptor, atype,
gr, g2, h2, fparam, aparam) and returns {self.var_name:
out.to(env.GLOBAL_PT_FLOAT_PRECISION)} unchanged; reference symbols: forward(),
sw, edge_index, _forward_common, self.var_name, env.GLOBAL_PT_FLOAT_PRECISION.
In `@deepmd/pt/model/task/property.py`:
- Line 95: The parameter default_fparam in the function/class signature
currently uses Optional[list]; change its type annotation to the more specific
Optional[list[float]] to match InvarFitting's annotation (see invar_fitting.py)
so type checkers and readers know it holds floats; locate the default_fparam
declaration in deepmd/pt/model/task/property.py and update its annotation to
Optional[list[float]] accordingly.
In `@deepmd/utils/argcheck.py`:
- Around line 2522-2531: The learning_rate_wsd() function's Argument for
"decay_mode" lacks a doc string; update the Argument("decay_mode", ...) in
learning_rate_wsd to include a clear doc explaining the format and semantics
(e.g., a colon-separated percentage triplet "warmup:stable:decay" that sums to
100 and controls the fraction of training steps in each phase, with the default
"85:10:5" meaning 85% warmup, 10% stable, 5% decay), and mention that values are
interpreted as percentages of total training steps and should be integers
summing to 100.
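As a sketch of the semantics the doc string should describe, a hypothetical parser for the colon-separated triplet (illustrative only, not the shipped implementation in `learning_rate_wsd()`):

```python
def parse_decay_mode(decay_mode: str = "85:10:5") -> tuple[float, float, float]:
    """Split a "warmup:stable:decay" percentage triplet into fractions.

    Values are integer percentages of the total training steps and must
    sum to 100, per the documented format.
    """
    parts = [int(p) for p in decay_mode.split(":")]
    if len(parts) != 3 or sum(parts) != 100:
        raise ValueError(
            f"decay_mode must be three integers summing to 100, got {decay_mode!r}"
        )
    warmup, stable, decay = (p / 100.0 for p in parts)
    return warmup, stable, decay
```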
- Around line 1686-1703: Add missing documentation for the three new repflow
arguments by supplying a descriptive doc= string for each Argument call:
update_use_layernorm, use_gated_mlp, and gated_mlp_norm in the Argument block.
Follow the pattern used by other arguments in this function (brief one-line
description, expected behavior/defaults and valid values for gated_mlp_norm like
"none"|"pre"|"post" if applicable) so users can understand when to enable
layernorm, what gated MLP toggles, and accepted values for gated_mlp_norm.
- Around line 2701-2754: The new loss arguments use_default_pf, f_use_norm, and
use_l1_all in the loss_ener() argument list are missing doc strings; add
descriptive doc parameters for each following the existing pattern (e.g., define
doc_use_default_pf, doc_f_use_norm, doc_use_l1_all near other doc_* variables
and pass them into the corresponding Argument(...) calls) so each
Argument("use_default_pf", ...), Argument("f_use_norm", ...), and
Argument("use_l1_all", ...) includes a doc=... parameter that clearly explains
the option's behavior and defaults.
- Around line 2958-2970: The two Argument entries for "add_edge_readout" and
"slim_edge_readout" are missing their doc strings; update the Argument(...)
calls for add_edge_readout and slim_edge_readout to include a doc= parameter
that briefly explains what each flag controls (e.g., whether to include an
edge-level readout in the model and whether to use a slim/compact edge readout),
and mention their default behavior (True for add_edge_readout, False for
slim_edge_readout) to match other arguments' style and clarity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e9a98c2e-8bc9-4fb9-ad37-7f0821389d86
📒 Files selected for processing (49)
- deepmd/dpmodel/atomic_model/base_atomic_model.py
- deepmd/dpmodel/atomic_model/dp_atomic_model.py
- deepmd/dpmodel/descriptor/dpa3.py
- deepmd/dpmodel/descriptor/make_base_descriptor.py
- deepmd/dpmodel/fitting/dipole_fitting.py
- deepmd/dpmodel/fitting/dos_fitting.py
- deepmd/dpmodel/fitting/ener_fitting.py
- deepmd/dpmodel/fitting/general_fitting.py
- deepmd/dpmodel/fitting/invar_fitting.py
- deepmd/dpmodel/fitting/make_base_fitting.py
- deepmd/dpmodel/fitting/polarizability_fitting.py
- deepmd/dpmodel/fitting/property_fitting.py
- deepmd/dpmodel/infer/deep_eval.py
- deepmd/dpmodel/model/make_model.py
- deepmd/dpmodel/utils/learning_rate.py
- deepmd/entrypoints/test.py
- deepmd/infer/deep_eval.py
- deepmd/infer/deep_property.py
- deepmd/jax/fitting/fitting.py
- deepmd/pd/model/task/ener.py
- deepmd/pd/model/task/fitting.py
- deepmd/pd/model/task/invar_fitting.py
- deepmd/pt/infer/deep_eval.py
- deepmd/pt/loss/ener.py
- deepmd/pt/loss/property.py
- deepmd/pt/model/atomic_model/base_atomic_model.py
- deepmd/pt/model/atomic_model/dp_atomic_model.py
- deepmd/pt/model/descriptor/dpa1.py
- deepmd/pt/model/descriptor/dpa3.py
- deepmd/pt/model/descriptor/repflow_layer.py
- deepmd/pt/model/descriptor/repflows.py
- deepmd/pt/model/model/__init__.py
- deepmd/pt/model/model/make_model.py
- deepmd/pt/model/network/mlp.py
- deepmd/pt/model/task/dipole.py
- deepmd/pt/model/task/dos.py
- deepmd/pt/model/task/ener.py
- deepmd/pt/model/task/fitting.py
- deepmd/pt/model/task/invar_fitting.py
- deepmd/pt/model/task/polarizability.py
- deepmd/pt/model/task/property.py
- deepmd/pt/train/training.py
- deepmd/pt/train/wrapper.py
- deepmd/pt/utils/learning_rate.py
- deepmd/pt/utils/stat.py
- deepmd/utils/argcheck.py
- deepmd/utils/env_mat_stat.py
- deepmd/utils/path.py
- source/tests/pt/test_dp_test.py
```python
update_use_layernorm: bool = False,
use_gated_mlp: bool = False,
gated_mlp_norm: str = "none",
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Check if DescrptBlockRepflows accepts these parameters
ast-grep --pattern $'class DescrptBlockRepflows {
  $$$
  def __init__($$$) {
    $$$
  }
  $$$
}'
# Alternative: search for the constructor signature
rg -n "def __init__" deepmd/dpmodel/descriptor/repflows.py -A 30 | head -50
```

🏁 Script executed:

```bash
# Get the full RepFlowArgs class definition including __init__, serialize, and DescrptBlockRepflows call
sed -n '170,370p' deepmd/dpmodel/descriptor/dpa3.py
```

🏁 Script executed:

```bash
# Get the full DescrptBlockRepflows.__init__ signature
rg -n "class DescrptBlockRepflows" deepmd/dpmodel/descriptor/repflows.py -A 200 | head -100
```

🏁 Script executed:

```bash
# Get the complete __init__ signature of DescrptBlockRepflows
sed -n '172,250p' deepmd/dpmodel/descriptor/repflows.py
```
Missing serialization and forwarding of three new RepFlowArgs parameters.
The three new parameters (`update_use_layernorm`, `use_gated_mlp`, `gated_mlp_norm`) are stored as instance attributes but:

- Not serialized: `RepFlowArgs.serialize()` omits these parameters, so models will lose them when saved/loaded.
- Not forwarded: these parameters are not passed to `DescrptBlockRepflows`, and `DescrptBlockRepflows` doesn't accept them in its constructor. This indicates an incomplete implementation: either they should be added to `DescrptBlockRepflows`, or they should not be in `RepFlowArgs` yet.
🔧 Proposed fix for serialization

Add the missing parameters to `serialize()`:

```diff
     "use_dynamic_sel": self.use_dynamic_sel,
     "sel_reduce_factor": self.sel_reduce_factor,
+    "update_use_layernorm": self.update_use_layernorm,
+    "use_gated_mlp": self.use_gated_mlp,
+    "gated_mlp_norm": self.gated_mlp_norm,
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/dpmodel/descriptor/dpa3.py` around lines 180 - 182, RepFlowArgs now
has three new fields (update_use_layernorm, use_gated_mlp, gated_mlp_norm) but
they are not included in RepFlowArgs.serialize() nor forwarded to
DescrptBlockRepflows; update RepFlowArgs.serialize() to include these three keys
so they persist, and then modify the call site where a RepFlowArgs instance is
passed to construct DescrptBlockRepflows to forward these values (or update
DescrptBlockRepflows.__init__ to accept and store them), ensuring the
constructor/attribute names match (update_use_layernorm, use_gated_mlp,
gated_mlp_norm) so saved models round-trip correctly.
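The round-trip failure mode is easy to demonstrate with a stand-in class; `ToyRepFlowArgs` below is illustrative only, with the three field names taken from the review:

```python
from dataclasses import dataclass

@dataclass
class ToyRepFlowArgs:
    update_use_layernorm: bool = False
    use_gated_mlp: bool = False
    gated_mlp_norm: str = "none"

    def serialize(self) -> dict:
        # every constructor argument must appear here, otherwise it is
        # silently reset to its default on deserialize()
        return {
            "update_use_layernorm": self.update_use_layernorm,
            "use_gated_mlp": self.use_gated_mlp,
            "gated_mlp_norm": self.gated_mlp_norm,
        }

    @classmethod
    def deserialize(cls, data: dict) -> "ToyRepFlowArgs":
        return cls(**data)
```

Dropping any of the three keys from `serialize()` would make the corresponding setting vanish on a save/load cycle, which is exactly the bug the review flags.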
```python
decay_rate = (self.start_lr - self.stop_lr) / (
    self.decay_end_rate * self.stop_steps
    - self.decay_start_rate * self.stop_steps
)
return self.start_lr - decay_rate * (
    step - self.decay_start_rate * self.stop_steps
)
```
Potential division by zero when decay phase duration is zero.
If the decay fraction of `decay_mode` is zero (e.g., `"90:10:0"`), then `decay_end_rate == decay_start_rate`, causing division by zero at lines 80-83. The boundary checks at lines 74-77 might not catch all steps in this edge case.
🛡️ Proposed fix to handle zero-duration decay

```diff
     def value(self, step) -> np.float64:
         if step < self.decay_start_rate * self.stop_steps:
             return self.start_lr
         elif step >= self.decay_end_rate * self.stop_steps:
             return self.stop_lr
+        elif self.decay_start_rate == self.decay_end_rate:
+            # No decay phase, should not reach here but handle gracefully
+            return self.start_lr
         else:
             # linear decay
             decay_rate = (self.start_lr - self.stop_lr) / (
                 self.decay_end_rate * self.stop_steps
                 - self.decay_start_rate * self.stop_steps
             )
             return self.start_lr - decay_rate * (
                 step - self.decay_start_rate * self.stop_steps
             )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/dpmodel/utils/learning_rate.py` around lines 80 - 86, Detect the
zero-duration decay case before computing decay_rate: if self.decay_end_rate ==
self.decay_start_rate (i.e., decay phase length is zero) avoid the division and
return an instantaneous value—return self.start_lr when step <=
self.decay_start_rate * self.stop_steps else return self.stop_lr; update the
block that computes decay_rate and the return to short-circuit on this equality
to prevent division by zero in decay_rate and ensure correct boundary behavior
for step.
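The guard can be exercised in isolation. This sketch keeps only the decay branch of the schedule (the warmup and stable phases of the real WSD class are omitted); the class itself is illustrative, though the attribute names follow the excerpt above:

```python
class LinearDecay:
    """Linear decay from start_lr to stop_lr between two step fractions."""

    def __init__(self, start_lr, stop_lr, stop_steps,
                 decay_start_rate, decay_end_rate):
        self.start_lr = start_lr
        self.stop_lr = stop_lr
        self.stop_steps = stop_steps
        self.decay_start_rate = decay_start_rate
        self.decay_end_rate = decay_end_rate

    def value(self, step) -> float:
        start = self.decay_start_rate * self.stop_steps
        end = self.decay_end_rate * self.stop_steps
        if step < start:
            return self.start_lr
        if step >= end or end == start:  # guard: zero-length decay phase
            return self.stop_lr
        # linear interpolation between start_lr and stop_lr
        decay_rate = (self.start_lr - self.stop_lr) / (end - start)
        return self.start_lr - decay_rate * (step - start)
```

With `end == start` folded into the boundary check, the division can never see a zero denominator.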
```python
# --- softmax-weighted averaging over frames (minimal) ---
print(f"Nframes == {nframes}")
if nframes != 3:
    raise RuntimeError(f"Expected nframes == 3, got {nframes}")
scores = property.mean(axis=1)  # (3,)
# If you want to favor *smaller* values (e.g., energies), use: scores = -scores
w = np.exp(scores - scores.max()); w /= w.sum()  # (3,)
avg = (w[:, None] * property).sum(axis=0, keepdims=True)  # (1, D)
property[:] = np.repeat(avg, nframes, axis=0)  # (3, D)
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
ruff check deepmd/infer/deep_property.py --select F823,E702 || true
nl -ba deepmd/infer/deep_property.py | sed -n '139,155p'
```

🏁 Script executed:

```bash
cat -n deepmd/infer/deep_property.py | sed -n '139,155p'
```

🏁 Script executed:

```bash
cat -n deepmd/infer/deep_property.py | sed -n '130,165p'
```
This block currently makes DeepProperty.eval() fail for every caller.
For non-3-frame inputs, line 145 always raises; for 3-frame inputs, line 146 attempts to read property before its first assignment at line 153, causing a NameError. Even if that reference were fixed, lines 153-155 would immediately overwrite any weighted result computed here. Remove this entire block or restructure to compute property once before the weighting logic. The code also violates Ruff checks (F823: local variable referenced before assignment at line 146; E702: multiple statements on one line at line 148). Per coding guidelines, always run ruff check . and ruff format . before committing changes.
🛠️ Minimal fix

```diff
-        # --- softmax-weighted averaging over frames (minimal) ---
-        print(f"Nframes == {nframes}")
-        if nframes != 3:
-            raise RuntimeError(f"Expected nframes == 3, got {nframes}")
-        scores = property.mean(axis=1)  # (3,)
-        # If you want to favor *smaller* values (e.g., energies), use: scores = -scores
-        w = np.exp(scores - scores.max()); w /= w.sum()  # (3,)
-        avg = (w[:, None] * property).sum(axis=0, keepdims=True)  # (1, D)
-        property[:] = np.repeat(avg, nframes, axis=0)  # (3, D)
-        # --------------------------------------------------------
-
         property = results[f"{self.get_var_name()}_redu"].reshape(
             nframes, self.get_task_dim()
         )
```
🧰 Tools
🪛 Ruff (0.15.6)

- [error] 146-146: Local variable `property` referenced before assignment (F823)
- [error] 148-148: Multiple statements on one line (semicolon) (E702)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/infer/deep_property.py` around lines 142 - 150, The softmax-weighted
averaging block in DeepProperty.eval() wrongly assumes nframes==3, references
the local name property before it's assigned, and uses multiple statements per
line; remove or refactor this block so property is computed exactly once and
then weighted: compute the per-frame values into a new local (e.g., prop_vals)
before any conditional, replace the nframes check with a guard that handles
arbitrary frame counts (or early-return if unsupported), perform the softmax
weighting using separate statements (no nondescript inline semicolons) and
assign the final weighted result back to property once (avoiding overwriting
earlier results), and then run ruff check/format to remove F823/E702 violations;
look for DeepProperty.eval, the variables nframes, scores, w, avg and property
in the shown diff to locate the code to change.
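For reference, the weighting logic itself can be written for an arbitrary number of frames, without semicolons and without the frame-count assumption. This is an illustrative standalone function, separate from the question of whether `DeepProperty.eval()` should do any weighting at all:

```python
import numpy as np

def softmax_weighted_average(prop: np.ndarray) -> np.ndarray:
    """Softmax-weight the frames of a (nframes, D) array by their mean value.

    Returns a (nframes, D) array where every frame holds the weighted
    average, matching the broadcast shape of the original snippet.
    """
    scores = prop.mean(axis=1)               # (nframes,)
    w = np.exp(scores - scores.max())        # subtract max for numerical stability
    w /= w.sum()
    avg = (w[:, None] * prop).sum(axis=0, keepdims=True)  # (1, D)
    return np.repeat(avg, prop.shape[0], axis=0)
```

Returning a new array instead of mutating `prop` in place also avoids the referenced-before-assignment hazard flagged by Ruff.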
```python
default_fparam: list[float], optional
    The default frame parameter. If set, when `fparam.npy` files are not included in the data system,
    this value will be used as the default value for the frame parameter in the fitting net.
    This parameter is not supported in PaddlePaddle.
dim_case_embd : int
```
Docstring claims feature is "not supported" but code stores and serializes it.
The docstring states "This parameter is not supported in PaddlePaddle" yet the code stores default_fparam on the instance and includes it in serialization. This is inconsistent:
- If truly unsupported, why store/serialize it? This could confuse users.
- If intended for serialization compatibility only, the docstring should clarify this.
- Unlike `deepmd/dpmodel/fitting/general_fitting.py`, this implementation lacks runtime usage in `_forward_common` to fill in a missing fparam.

Consider either:

- Implementing full support (using the default when fparam is None in `_forward_common`)
- Clarifying the docstring to explain this is for serialization interoperability only
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/pd/model/task/fitting.py` around lines 98 - 102, The docstring
incorrectly says default_fparam is "not supported" while the class stores and
serializes it; fix by making runtime behavior consistent: update _forward_common
to detect missing/None fparam and substitute self.default_fparam (or a
tensor/array converted from it) before further processing, ensure the instance
attribute default_fparam remains used for serialization, and remove or update
the "not supported" phrase in the docstring of the class so it reflects actual
runtime support; reference symbols: default_fparam and _forward_common.
```python
inference: bool = False,
use_huber: bool = False,
use_default_pf: bool = False,
f_use_norm: bool = False,
huber_delta: float = 0.01,
**kwargs: Any,
```
The new loss flags also need to round-trip through serialization.
use_default_pf and f_use_norm now change loss semantics, but they are not part of the persisted loss state. Any round-trip through serialize() / deserialize() will silently fall back to False and change behavior.
Also applies to: 137-153
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/pt/loss/ener.py` around lines 56 - 61, The new boolean flags
use_default_pf and f_use_norm (and any other loss-semantics flags like
use_huber/huber_delta if relevant) must be added to the object's persisted state
so they survive serialize()/deserialize() round-trips: update the class
constructor parameters (the signature that includes inference, use_huber,
use_default_pf, f_use_norm, huber_delta) and the serialize() method to include
these flags in the returned state, and update deserialize()/from_state() to read
and restore them into the instance; ensure the same keys are used in both
serialize and deserialize so restored instances preserve loss behavior (also
mirror this change where similar state handling occurs around the other block
mentioned at lines ~137-153).
```python
if self.add_edge_readout:
    self.edge_embed = NetworkCollection(
        1 if not self.mixed_types else 0,
        self.ntypes,
        network_type="fitting_network",
        networks=[
            FittingNet(
                self.embedding_width,
                1,
                self.neuron if not self.slim_edge_readout else self.neuron[:1],
                self.activation_function,
                self.resnet_dt,
                self.precision,
                bias_out=True,
                seed=child_seed(self.seed + 100, ii),
            )
```
🧩 Analysis chain
🏁 Script executed:

```bash
find . -type f -name "ener.py" | grep -E "deepmd/pt/model/task"
```

🏁 Script executed:

```bash
cd deepmd/pt/model/task && wc -l ener.py
```

🏁 Script executed:

```bash
head -400 deepmd/pt/model/task/ener.py | tail -100
```

🏁 Script executed:

```bash
rg "def __init__" deepmd/pt/model/task/ener.py -A 50 | head -80
```

🏁 Script executed:

```bash
rg "self.seed" deepmd/pt/model/task/ener.py -B 2 -A 2
```

🏁 Script executed:

```bash
rg "def child_seed" deepmd/pt/model/task/ener.py -B 2 -A 10
```

🏁 Script executed:

```bash
rg "def child_seed" . --type py -A 10
```

🏁 Script executed:

```bash
cat deepmd/dpmodel/utils/seed.py
```

🏁 Script executed:

```bash
rg "class EnergyFittingNetReadout" deepmd/pt/model/task/ener.py -A 100 | head -120
```

🏁 Script executed:

```bash
rg "self\.seed" deepmd/pt/model/task/ener.py -B 3 -A 3
```

🏁 Script executed:

```bash
rg "class InvarFitting" deepmd/pt/model/ --type py -A 5
```

🏁 Script executed:

```bash
fd "invariant.py" deepmd/pt/model/
```

🏁 Script executed:

```bash
rg "self.seed" deepmd/pt/model/invariant.py -B 2 -A 2
```

🏁 Script executed:

```bash
cat deepmd/pt/model/task/invar_fitting.py | head -150
```

🏁 Script executed:

```bash
rg "self\.seed\s*=" deepmd/pt/model/ --type py -B 2 -A 2
```

🏁 Script executed:

```bash
rg "class GeneralFitting" deepmd/pt/model/ --type py -A 50 | head -80
```

🏁 Script executed:

```bash
cd deepmd/pt/model/task && python -c "
# Test what happens with seed=None
seed = None
try:
    result = seed + 100
    print(f'seed + 100 = {result}')
except TypeError as e:
    print(f'seed=None: TypeError - {e}')
# Test what happens with seed=[1, 2]
seed = [1, 2]
try:
    result = seed + 100
    print(f'seed + 100 = {result}')
except TypeError as e:
    print(f'seed=list: TypeError - {e}')
# Test what happens with seed=42
seed = 42
try:
    result = seed + 100
    print(f'seed + 100 = {result}')
except TypeError as e:
    print(f'seed=int: TypeError - {e}')
"
```

🏁 Script executed:

```bash
cd deepmd && python -c "
from dpmodel.utils.seed import child_seed
# Test the suggested fix approach
print('Testing suggested fix: child_seed(child_seed(seed, 100), ii)')
print()
# Case 1: seed=None
seed = None
print(f'Case 1: seed=None')
first_call = child_seed(seed, 100)
print(f'  child_seed(None, 100) = {first_call}')
second_call = child_seed(first_call, 0)
print(f'  child_seed({first_call}, 0) = {second_call}')
print()
# Case 2: seed=42 (int)
seed = 42
print(f'Case 2: seed=42')
first_call = child_seed(seed, 100)
print(f'  child_seed(42, 100) = {first_call}')
second_call = child_seed(first_call, 0)
print(f'  child_seed({first_call}, 0) = {second_call}')
print()
# Case 3: seed=[1, 2] (list)
seed = [1, 2]
print(f'Case 3: seed=[1, 2]')
first_call = child_seed(seed, 100)
print(f'  child_seed([1, 2], 100) = {first_call}')
second_call = child_seed(first_call, 0)
print(f'  child_seed({first_call}, 0) = {second_call}')
"
```

🏁 Script executed:

```bash
cd deepmd && ruff check pt/model/task/ener.py --select=E,W,F
```

🏁 Script executed:

```bash
head -50 deepmd/pt/model/task/ener.py | grep -E "^import|^from"
```

🏁 Script executed:

```bash
grep "child_seed" deepmd/pt/model/task/ener.py
```
Fix the seed handling to support None and list-based seeds.
The code seed=child_seed(self.seed + 100, ii) fails for the default seed=None and for seed as a list, raising TypeError before the edge readout networks are constructed. The public signature allows seed: Optional[Union[int, list[int]]] = None, but only the integer case works with the current implementation.
Suggested fix

```diff
-                        seed=child_seed(self.seed + 100, ii),
+                        seed=child_seed(child_seed(self.seed, 100), ii),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/pt/model/task/ener.py` around lines 333 - 348, The seed arithmetic
fails when self.seed is None or a list; fix by normalizing self.seed before
calling child_seed: compute a local base_seed = self.seed (if list, pick
base_seed = self.seed[ii]); if isinstance(base_seed, int) add 100 to it,
otherwise leave None unchanged; then call child_seed(base_seed, ii) when
constructing edge_embed (NetworkCollection / FittingNet) so child_seed only ever
receives an int or None, not a list or an expression that adds to None.
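A sketch of the normalization the prompt describes. The `child_seed` below is a simplified stand-in for `deepmd.dpmodel.utils.seed.child_seed` (which deterministically mixes a seed with an index), and the helper name `edge_readout_seed` is hypothetical:

```python
from typing import Optional, Union

def child_seed(seed: Optional[int], idx: int) -> Optional[int]:
    # simplified stand-in: the real deepmd helper hashes (seed, idx)
    if seed is None:
        return None
    return hash((seed, idx)) & 0x7FFFFFFF

def edge_readout_seed(
    seed: Optional[Union[int, list[int]]], ii: int
) -> Optional[int]:
    """Derive a per-network seed without assuming `seed` is an int."""
    # list seeds: pick one element; None passes through unchanged
    base = seed[ii % len(seed)] if isinstance(seed, list) else seed
    if isinstance(base, int):
        base = base + 100  # offset the edge-readout branch, as in the PR
    return child_seed(base, ii)
```

This way `child_seed` only ever receives an int or `None`, never a list or a `None + 100` expression.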
```python
old_model_params = self.wrapper.state_dict()["_extra_state"][
    "model_params"
]
try:
    self.wrapper.load_state_dict(state_dict)
except RuntimeError as e:
    # init from direct fitting
    rm_list = []
    for kk in state_dict:
        # delete direct heads
        if (
            "fitting_net.force_embed." in kk
            or "fitting_net.noise_embed" in kk
        ):
            rm_list.append(kk)
    for kk in rm_list:
        state_dict.pop(kk)
    state_dict["_extra_state"]["model_params"] = old_model_params
    out_shape_list = [
        "model.Default.atomic_model.out_bias",
        "model.Default.atomic_model.out_std",
    ]
    for kk in out_shape_list:
        state_dict[kk] = state_dict[kk][:1, :, :1]
    self.wrapper.load_state_dict(state_dict)
```
The checkpoint fallback is too shape-specific.
This recovery path catches any load_state_dict() RuntimeError, then hard-codes model.Default.atomic_model.out_bias/out_std and truncates them to [:1, :, :1]. That only matches one branch/output shape; multitask checkpoints or multi-output heads will either miss the right keys or silently load corrupted statistics.
```diff
 for item_key in batch_data:
     if item_key in input_keys:
-        input_dict[item_key] = batch_data[item_key]
+        if item_key != "fparam" or batch_data["find_fparam"] != 0.0:
+            input_dict[item_key] = batch_data[item_key]
```
Guard find_fparam lookups for batches that only carry fparam.
This new gate unconditionally indexes batch_data["find_fparam"] whenever fparam is present. Other paths already treat find_fparam as optional, so loaders that only emit required fparam values can now hit a KeyError.
Suggested fix

```diff
     if item_key in input_keys:
-        if item_key != "fparam" or batch_data["find_fparam"] != 0.0:
+        if item_key != "fparam" or batch_data.get("find_fparam", 1.0) != 0.0:
             input_dict[item_key] = batch_data[item_key]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/pt/train/training.py` around lines 1165 - 1168, The conditional that
checks find_fparam currently assumes batch_data["find_fparam"] exists when
"fparam" is present; update the guard in the loop over input_keys (the block
using item_key, input_keys, batch_data, input_dict) to avoid KeyError by
checking for the key first or using a safe lookup (e.g.,
batch_data.get("find_fparam", 0.0)) so the condition becomes: if item_key !=
"fparam" or batch_data.get("find_fparam", 0.0) != 0.0, then assign
input_dict[item_key] = batch_data[item_key].
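The guarded lookup in isolation (batch dicts here are plain Python dicts, and the function name is illustrative; the default of 1.0 treats a missing flag as "present", per the suggested fix):

```python
def select_inputs(batch_data: dict, input_keys: set) -> dict:
    """Copy requested keys, skipping fparam when find_fparam marks it absent."""
    input_dict = {}
    for item_key in batch_data:
        if item_key in input_keys:
            # .get avoids a KeyError for loaders that emit fparam
            # without an accompanying find_fparam flag
            if item_key != "fparam" or batch_data.get("find_fparam", 1.0) != 0.0:
                input_dict[item_key] = batch_data[item_key]
    return input_dict
```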
```diff
     frac_prob = (
         model_key_prob_map[model_key_link]
         / model_key_prob_map[model_key_base]
     )
     link_class.share_params(
-        base_class, shared_level_link, resume=resume
+        base_class,
+        shared_level_link,
+        model_prob=frac_prob,
+        protection=data_stat_protect,
+        resume=resume,
     )
```
Potential division by zero when computing frac_prob.
If model_key_prob_map[model_key_base] is zero, this will raise a ZeroDivisionError. While the base model probability is likely always positive in practice, defensive coding would prevent unexpected failures.
🛡️ Proposed fix to add protection

```diff
     frac_prob = (
         model_key_prob_map[model_key_link]
         / model_key_prob_map[model_key_base]
     )
+    if model_key_prob_map[model_key_base] == 0:
+        raise ValueError(
+            f"Base model '{model_key_base}' has zero probability in model_key_prob_map"
+        )
```

Or preemptively check before the division:

```diff
+    base_prob = model_key_prob_map[model_key_base]
+    if base_prob == 0:
+        raise ValueError(
+            f"Base model '{model_key_base}' has zero probability"
+        )
     frac_prob = (
         model_key_prob_map[model_key_link]
-        / model_key_prob_map[model_key_base]
+        / base_prob
     )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deepmd/pt/train/wrapper.py` around lines 139 - 149, The computation of
frac_prob may divide by zero when model_key_prob_map[model_key_base] is zero;
before computing frac_prob, check the denominator
(model_key_prob_map[model_key_base]) and handle the zero case defensively (e.g.,
set frac_prob to 0 or a small epsilon, or log/raise a clear exception) and then
call link_class.share_params(base_class, shared_level_link,
model_prob=frac_prob, protection=data_stat_protect, resume=resume); ensure you
reference the same symbols model_key_prob_map, model_key_link, model_key_base,
frac_prob and link_class.share_params when making the change.
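The preemptive variant distilled into a standalone function (the function name is illustrative; only the dict-and-keys shape matches the snippet above):

```python
def fraction_prob(model_key_prob_map: dict, link_key: str, base_key: str) -> float:
    """Ratio of link/base model probabilities with an explicit zero check."""
    base_prob = model_key_prob_map[base_key]
    if base_prob == 0:
        # fail loudly instead of letting ZeroDivisionError surface later
        raise ValueError(f"Base model {base_key!r} has zero probability")
    return model_key_prob_map[link_key] / base_prob
```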
```python
@fitting_args_plugin.register("ener_readout")
def fitting_ener_readout():
    doc_numb_fparam = "The dimension of the frame parameter. If set to >0, file `fparam.npy` should be included to provided the input fparams."
    doc_numb_aparam = "The dimension of the atomic parameter. If set to >0, file `aparam.npy` should be included to provided the input aparams."
    doc_dim_case_embd = "The dimension of the case embedding embedding. When training or fine-tuning a multitask model with case embedding embeddings, this number should be set to the number of model branches."
    doc_neuron = "The number of neurons in each hidden layers of the fitting net. When two hidden layers are of the same size, a skip connection is built."
    doc_activation_function = f'The activation function in the fitting net. Supported activation functions are {list_to_doc(ACTIVATION_FN_DICT.keys())} Note that "gelu" denotes the custom operator version, and "gelu_tf" denotes the TF standard version. If you set "None" or "none" here, no activation function will be used.'
    doc_precision = f"The precision of the fitting net parameters, supported options are {list_to_doc(PRECISION_DICT.keys())} Default follows the interface precision."
    doc_resnet_dt = 'Whether to use a "Timestep" in the skip connection'
    doc_trainable = f"Whether the parameters in the fitting net are trainable. This option can be\n\n\
- bool: True if all parameters of the fitting net are trainable, False otherwise.\n\n\
- list of bool{doc_only_tf_supported}: Specifies if each layer is trainable. Since the fitting net is composed by hidden layers followed by a output layer, the length of this list should be equal to len(`neuron`)+1."
    doc_rcond = "The condition number used to determine the initial energy shift for each type of atoms. See `rcond` in :py:meth:`numpy.linalg.lstsq` for more details."
    doc_seed = "Random seed for parameter initialization of the fitting net"
    doc_atom_ener = "Specify the atomic energy in vacuum for each type"
    doc_layer_name = (
        "The name of the each layer. The length of this list should be equal to n_neuron + 1. "
        "If two layers, either in the same fitting or different fittings, "
        "have the same name, they will share the same neural network parameters. "
        "The shape of these layers should be the same. "
        "If null is given for a layer, parameters will not be shared."
    )
    doc_use_aparam_as_mask = (
        "Whether to use the aparam as a mask in input."
        "If True, the aparam will not be used in fitting net for embedding."
        "When descrpt is se_a_mask, the aparam will be used as a mask to indicate the input atom is real/virtual. And use_aparam_as_mask should be set to True."
    )

    return [
        Argument("numb_fparam", int, optional=True, default=0, doc=doc_numb_fparam),
        Argument("numb_aparam", int, optional=True, default=0, doc=doc_numb_aparam),
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if EnergyFittingNetReadout implementation supports fparam/default_fparam
rg -n "class.*EnergyFittingNetReadout|default_fparam" --type py -A 5 -B 2Repository: deepmodeling/deepmd-kit
Length of output: 50379
🏁 Script executed:
#!/bin/bash
# View the complete fitting_ener_readout function to see all arguments
sed -n '2877,2950p' deepmd/utils/argcheck.py

Repository: deepmodeling/deepmd-kit
Length of output: 4348
🏁 Script executed:
#!/bin/bash
# View lines beyond 2950 to see if there are more arguments
sed -n '2940,3000p' deepmd/utils/argcheck.py

Repository: deepmodeling/deepmd-kit
Length of output: 1974
🏁 Script executed:
#!/bin/bash
# View the EnergyFittingNetReadout class definition and its __init__ method
sed -n '275,330p' deepmd/pt/model/task/ener.py

Repository: deepmodeling/deepmd-kit
Length of output: 1961
🏁 Script executed:
#!/bin/bash
# Search for InvarFitting class definition and its __init__
rg -n "class InvarFitting" deepmd/pt/model/task/ -A 50 | head -80

Repository: deepmodeling/deepmd-kit
Length of output: 4279
🏁 Script executed:
#!/bin/bash
# View the InvarFitting __init__ method signature
sed -n '88,140p' deepmd/pt/model/task/invar_fitting.py

Repository: deepmodeling/deepmd-kit
Length of output: 1891
Add default_fparam argument to fitting_ener_readout() for consistency.
The fitting_ener_readout() function is missing the default_fparam argument that is present in all other fitting functions (fitting_ener, fitting_dos, fitting_property, fitting_polar, fitting_dipole). Since EnergyFittingNetReadout extends InvarFitting, which supports default_fparam, and the function already defines numb_fparam, the default_fparam argument should be added to maintain consistency.
Additionally, the EnergyFittingNetReadout.__init__() signature needs to include the default_fparam parameter and pass it to the parent class.
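The length constraint the new schema entry would enforce can be sketched in plain Python. Here `check_default_fparam` is a hypothetical stand-in, not the deepmd-kit API; the actual patch would add a dargs-style `Argument("default_fparam", ...)` to the list returned by `fitting_ener_readout()`:

```python
def check_default_fparam(data: dict) -> dict:
    """Validate a fitting-net config the way a hypothetical
    Argument("default_fparam", list, optional=True, default=None)
    entry would: when provided, its length must match numb_fparam."""
    numb = data.get("numb_fparam", 0)
    default = data.get("default_fparam")
    if default is not None and len(default) != numb:
        raise ValueError(
            f"default_fparam has {len(default)} entries, expected numb_fparam={numb}"
        )
    return data


# Config from the PR description: numb_fparam=2, default_fparam=[0.0, 1.0]
cfg = check_default_fparam({"numb_fparam": 2, "default_fparam": [0.0, 1.0]})
print(cfg["default_fparam"])  # [0.0, 1.0]
```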
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `deepmd/utils/argcheck.py` around lines 2877-2907, the function
fitting_ener_readout() is missing the default_fparam argument; add an
Argument("default_fparam", ..., optional=True, default=None, doc=...)
entry (with the same type as in the other fitting_* functions) alongside
numb_fparam/numb_aparam in fitting_ener_readout() so its schema matches
the other fitting_* functions, then update EnergyFittingNetReadout.__init__()
to accept default_fparam and forward it to the parent InvarFitting via
super().__init__(..., default_fparam=default_fparam) so the readout class
passes the value through to the base class.
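At inference time, the point of default_fparam is that fparam.npy can be omitted: the configured default is broadcast to every frame. A minimal sketch of that fallback is below — `resolve_fparam` is an illustrative name, not the actual deepmd-kit function:

```python
import numpy as np


def resolve_fparam(fparam, numb_fparam, default_fparam, nframes):
    """Return the frame parameters to feed the fitting net.

    Mirrors the default_fparam behaviour described in this PR: when
    fparam is absent but a default is configured, the default vector
    is tiled across all frames. Names are illustrative only.
    """
    if fparam is not None:
        return np.asarray(fparam, dtype=np.float64).reshape(nframes, numb_fparam)
    if default_fparam is not None:
        default = np.asarray(default_fparam, dtype=np.float64)
        if default.size != numb_fparam:
            raise ValueError("default_fparam length must equal numb_fparam")
        return np.tile(default, (nframes, 1))
    raise ValueError("fparam is required when no default_fparam is configured")


# With the example config from the PR: numb_fparam=2, default_fparam=[0.0, 1.0]
filled = resolve_fparam(None, 2, [0.0, 1.0], nframes=3)
print(filled.shape)  # (3, 2)
```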