feat(pt_expt): add dp change-bias support #5330
wanghan-iapcm merged 12 commits into deepmodeling:master from
Conversation
Support adjusting output bias for pt_expt models via `dp --pt-expt change-bias`, handling both .pt checkpoints and .pte frozen models. Adds model_change_out_bias helper in training.py and CLI end-to-end tests covering data-based, file-based, user-defined, and frozen model bias changes.
💡 Codex Review
Reviewed commit: 303282e489
📝 Walkthrough
Adds a CLI command and implementation to update pt_expt model output biases.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant CLI as "CLI Entry Point"
    participant Loader as "Model Loader"
    participant Model as "Model Instance"
    participant DataSys as "DeepmdDataSystem"
    participant BiasAdj as "Bias Adjuster"
    participant Serializer as "Output Serializer"
    User->>CLI: invoke change-bias
    CLI->>Loader: load input (.pt or .pte/.pt2)
    Loader->>Model: deserialize / reconstruct model
    alt direct bias_value provided
        CLI->>BiasAdj: provide bias_value
        BiasAdj->>Model: set output bias
    else statistics-based
        CLI->>DataSys: build from datafile/system
        DataSys->>DataSys: sample statistic inputs
        DataSys->>BiasAdj: send samples
        BiasAdj->>Model: change_out_bias with samples
    end
    Model->>Serializer: serialize updated model to output
    Serializer->>User: save updated file
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
🧹 Nitpick comments (1)
deepmd/pt_expt/entrypoints/main.py (1)
345-348: Consider specifying file encoding explicitly.

Opening the file without specifying encoding relies on the system default, which may vary across platforms. For consistency, consider using `encoding="utf-8"`.

♻️ Suggested fix

```diff
 if datafile is not None:
-    with open(datafile) as datalist:
+    with open(datafile, encoding="utf-8") as datalist:
         all_sys = datalist.read().splitlines()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/entrypoints/main.py` around lines 345 - 348, The file is opened without an explicit encoding which can vary by platform; update the block that reads the datafile (the branch that checks datafile is not None and uses the variable datalist to populate all_sys) to open the file with encoding="utf-8" (i.e., use open(datafile, encoding="utf-8") when creating datalist) so the read and splitlines() behavior is consistent across environments.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e324a695-988c-40dc-830f-2ad343e88198
📒 Files selected for processing (3)
- deepmd/pt_expt/entrypoints/main.py
- deepmd/pt_expt/train/training.py
- source/tests/pt_expt/test_change_bias.py
Codecov Report
❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##           master    #5330      +/-   ##
==========================================
- Coverage   82.42%   82.28%    -0.15%
==========================================
  Files         784      797       +13
  Lines       79125    82101     +2976
  Branches     3676     4003      +327
==========================================
+ Hits        65220    67556     +2336
- Misses      12732    13337      +605
- Partials     1173     1208       +35
```

☔ View full report in Codecov by Sentry.
The per-test autouse fixture _clear_leaked_device_context runs before each test method but not before setUpClass. Tests that call trainer.run() in setUpClass (e.g. TestChangeBias) hit a spurious "Torch not compiled with CUDA enabled" error because a leaked DeviceContext(cuda:127) from test collection reroutes torch.tensor() calls without device= to a fake CUDA device. Add a session-scoped fixture that clears leaked DeviceContext modes once at session start, before any setUpClass runs.
The PT backend calls get_fitting_net().compute_input_stats() after change_out_bias with set-by-statistic mode to update fitting normalization. The pt_expt backend was missing this, which would leave stale fitting stats when using dp change-bias -m set. Add tests verifying compute_input_stats is called for set-by-statistic and not called for change-by-statistic.
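The logic described in this commit can be sketched as follows. The call names (`change_out_bias`, `get_fitting_net().compute_input_stats`) come from the commit message; the wrapper function itself is a hypothetical illustration, not the actual pt_expt code:

```python
def change_bias_by_statistic(model, sampled, mode):
    # Map the CLI mode to the bias adjustment mode: "set" replaces the
    # bias from statistics, "change" shifts it relative to its current value.
    bias_adjust_mode = {
        "change": "change-by-statistic",
        "set": "set-by-statistic",
    }[mode]
    model.change_out_bias(sampled, bias_adjust_mode=bias_adjust_mode)
    if bias_adjust_mode == "set-by-statistic":
        # Mirror the PT backend: refresh fitting-net input normalization
        # only when the bias is set from statistics, not when shifted.
        model.get_fitting_net().compute_input_stats(sampled)
```

The tests mentioned in the commit verify exactly this asymmetry: `compute_input_stats` runs for set-by-statistic and is skipped for change-by-statistic.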
…pling

When numb_batch=0, the code used max(data.get_nbatches()) which makes every smaller system wrap and repeat batches until matching the largest, overweighting short systems. Use min() instead so no system wraps, matching PT backend behavior.
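A minimal sketch of this fix (the helper name is hypothetical): with `max()`, a system holding 3 batches would be resampled until it matched a 10-batch system, repeating its data; `min()` caps every system at the smallest count so none wraps.

```python
def n_stat_batches(nbatches_per_system, numb_batch=0):
    # numb_batch > 0: a user-specified batch count takes precedence.
    if numb_batch > 0:
        return numb_batch
    # numb_batch == 0: take the smallest per-system batch count so no
    # system wraps around and repeats batches (max() would overweight
    # short systems by recycling their data).
    return min(nbatches_per_system)
```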
🧹 Nitpick comments (3)
source/tests/pt_expt/test_change_bias.py (2)
44-52: Minor: run_dp splits by whitespace.

The `cmd.split()` approach won't correctly handle paths containing spaces. This is fine for these tests since tmpdir paths are unlikely to have spaces, but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/pt_expt/test_change_bias.py` around lines 44 - 52, The helper function run_dp currently uses naive whitespace splitting (cmd.split()) which breaks when file paths contain spaces; update run_dp to parse the command string safely (e.g., use shlex.split or accept a list) so it preserves quoted/space-containing paths before calling main(cmds) and keep the error branch that checks the "dp" prefix; reference the run_dp function and the main(...) call when making the change.
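The suggested fix can be sketched like this; the helper's shape follows the review comment (in the real tests it would pass the parsed arguments on to `main`), so returning the parsed argv here is an assumption that keeps the sketch self-contained:

```python
import shlex

def run_dp(cmd):
    # shlex.split parses like a POSIX shell, so quoted paths containing
    # spaces survive as single arguments (naive cmd.split() would not).
    cmds = shlex.split(cmd)
    if not cmds or cmds[0] != "dp":
        raise RuntimeError(f"not a dp command: {cmd}")
    # The actual helper calls the CLI entry point with these arguments;
    # returning them makes the parsing behavior easy to inspect.
    return cmds[1:]
```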
173-192: Minor: Redundant file opening.

The code creates a `NamedTemporaryFile` then immediately opens it again with a separate `open()` call. While this works, it's slightly redundant.

Simpler approach

```diff
 def test_change_bias_with_data_sys_file(self) -> None:
-    tmp_file = tempfile.NamedTemporaryFile(
-        delete=False, suffix=".txt", dir=self.tmpdir
-    )
-    with open(tmp_file.name, "w") as f:
-        f.writelines([sys + "\n" for sys in self.data_file])
+    tmp_file_path = os.path.join(self.tmpdir, "systems.txt")
+    with open(tmp_file_path, "w") as f:
+        f.writelines([sys + "\n" for sys in self.data_file])
     output_path = os.path.join(self.tmpdir, "model_file_bias.pt")
     run_dp(
         f"dp --pt-expt change-bias {self.model_path} "
-        f"-f {tmp_file.name} -o {output_path}"
+        f"-f {tmp_file_path} -o {output_path}"
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/pt_expt/test_change_bias.py` around lines 173 - 192, The test redundantly re-opens the NamedTemporaryFile; instead write the lines directly to the NamedTemporaryFile object (tmp_file) when creating it (or create it with mode="w") and close/flush before calling run_dp; update test_change_bias_with_data_sys_file to write sys lines to tmp_file (not via a separate open), then use run_dp(...) and _load_model_from_ckpt(output_path) as before to compare get_out_bias() against original_bias.

deepmd/pt_expt/entrypoints/main.py (1)

326-333: Consider using explicit `ValueError` instead of `assert` for input validation.

Using `assert` for validating user input is fragile because assertions are stripped when Python runs with optimization flags (`-O` or `-OO`). For CLI tools where users control input, explicit exceptions provide more reliable validation.

Proposed fix

```diff
 if bias_value is not None:
-    assert "energy" in model_to_change.model_output_type(), (
-        "User-defined bias is only available for energy model!"
-    )
-    assert len(bias_value) == len(type_map), (
+    if "energy" not in model_to_change.model_output_type():
+        raise ValueError("User-defined bias is only available for energy model!")
+    if len(bias_value) != len(type_map):
+        raise ValueError(
         f"The number of elements in the bias should be the same as "
         f"that in the type_map: {type_map}."
-    )
+        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/entrypoints/main.py` around lines 326 - 333, Replace the two input-validation assert statements guarding bias_value with explicit exceptions: check if "energy" is in model_to_change.model_output_type() and if len(bias_value) == len(type_map), and raise ValueError with the same descriptive messages when a check fails (instead of using assert). Update the block that references bias_value, model_to_change.model_output_type(), and type_map to perform these conditional checks and raise ValueError so validation remains active under optimized Python runs.
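As a standalone sketch of the suggested validation (the function name is hypothetical), explicit exceptions keep working under `python -O`, where `assert` statements are compiled away:

```python
def validate_bias_args(bias_value, output_types, type_map):
    # Explicit exceptions are not removed by the -O/-OO optimization
    # flags, unlike assert statements, so validation always runs.
    if "energy" not in output_types:
        raise ValueError("User-defined bias is only available for energy model!")
    if len(bias_value) != len(type_map):
        raise ValueError(
            f"The number of elements in the bias should be the same as "
            f"that in the type_map: {type_map}."
        )
```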
📒 Files selected for processing (3)
- deepmd/pt_expt/entrypoints/main.py
- deepmd/pt_expt/train/training.py
- source/tests/pt_expt/test_change_bias.py
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pt_expt/train/training.py
Resolve conflict in main.py: keep both change-bias and compress command dispatches.
Actionable comments posted: 1
🧹 Nitpick comments (1)
deepmd/pt_expt/entrypoints/main.py (1)

228-228: Unused parameter `model_branch`.

This parameter is accepted but never referenced in the function body. The function raises `NotImplementedError` for multi-task models before `model_branch` would be needed, making this dead code.

Consider removing until multi-task support is implemented, or add a TODO comment documenting the intended use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/entrypoints/main.py` at line 228, The parameter model_branch in the function signature is unused and effectively dead code; remove model_branch from the function signature (and any related type hints or docstrings) and update any callers to stop passing it, or if you prefer to keep it for future multi-task support, revert removal and add a clear TODO comment next to model_branch explaining its intended purpose and when it will be used; locate occurrences by searching for the parameter name (model_branch) and the containing function (the function signature in entrypoints/main.py) to make the change consistently.
📒 Files selected for processing (2)
- deepmd/pt_expt/entrypoints/main.py
- deepmd/pt_expt/train/training.py
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pt_expt/train/training.py
get_out_bias() returns a torch.Tensor on CUDA, which can't be directly converted to numpy. Add to_numpy() helper that calls .detach().cpu() before .numpy().
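A sketch of such a helper, structured to match the commit message; the duck-typed `hasattr` fallback is an addition so the sketch runs without torch installed:

```python
def to_numpy(x):
    # CUDA tensors can't be converted to numpy directly; detach from the
    # autograd graph and move to CPU first, then convert.
    if hasattr(x, "detach"):
        x = x.detach().cpu()
    return x.numpy()
```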
Assertions are stripped with python -O. Use explicit exceptions for CLI-facing user input validation.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments in `@deepmd/pt_expt/entrypoints/main.py`:

- Around lines 361-367: The code treats `numb_batch==0` as "shortest system" by using `nbatches = min(data.get_nbatches())`, which truncates longer systems. For a true "all batches" behavior, let `nbatches` carry the full per-system batch counts (`nbatches = data.get_nbatches()`) and update `make_stat_input(data, nbatches)` to accept a sequence of per-system batch counts (or add a helper that samples all batches of each system), so systems with more batches are not silently truncated. Ensure the call and implementation handle either an int (existing behavior) or a per-system sequence.
- Line 309: The assignment `bias_adjust_mode = "change-by-statistic" if mode == "change" else "set-by-statistic"` silently treats any non-"change" value (including typos) as "set". Validate `mode` explicitly: check it is one of "change" or "set", raise a `ValueError` for invalid modes, then map "change" -> "change-by-statistic" and "set" -> "set-by-statistic", consistent with the existing `change_out_bias()` mode validation.
- Around lines 376-382: The code replaces the updated `wrapper.state_dict()` "_extra_state" with the stale `extra_state` variable. Instead, preserve the wrapper's updated metadata after `set_out_bias()`/`change_out_bias()`: assign `wrapper.state_dict()` back into `old_state_dict["model"]` and only merge `extra_state` into "_extra_state" when the wrapper state does not already contain it, i.e. prefer `wrapper.state_dict()["_extra_state"]` over the old `extra_state`.
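The `_extra_state` fix in the last comment amounts to a small merge policy; a sketch with a hypothetical helper name:

```python
def merged_state_dict(wrapper_state, old_extra_state):
    # Prefer the wrapper's freshly produced "_extra_state" (it reflects
    # the updated bias); fall back to the stale metadata only when the
    # wrapper did not emit one.
    state = dict(wrapper_state)
    state.setdefault("_extra_state", old_extra_state)
    return state
```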
📒 Files selected for processing (2)
- deepmd/pt_expt/entrypoints/main.py
- source/tests/pt_expt/test_change_bias.py
✅ Files skipped from review due to trivial changes (1)
- source/tests/pt_expt/test_change_bias.py
…nge_bias

Replace assert statements with explicit ValueError for input validation (bias_value length, energy model check) since assertions are stripped with python -O. Add explicit mode validation to reject typos instead of silently defaulting to set-by-statistic.
…eat-pt-expt-change-bias

# Conflicts:
#	deepmd/pt_expt/entrypoints/main.py

…ge-bias

# Conflicts:
#	deepmd/pt_expt/train/training.py
…n pte test

test_change_bias_frozen_pte passed raw tensors from get_out_bias() to np.allclose, which fails on CUDA with "can't convert cuda:0 device type tensor to numpy". Use the existing to_numpy helper.
Session-scoped conftest fixture doesn't run before unittest setUpClass when the full test suite is collected. Call _pop_device_contexts() explicitly at the start of setUpClass to prevent CUDA init errors from stale DeviceContext(cuda:127) during trainer.run() on CPU runners.
Summary
Add the `dp --pt-expt change-bias` command to adjust the output bias (energy shift per atom type) of pt_expt models without retraining. This brings the pt_expt backend to parity with the pt/tf/pd backends for this feature.

Supported input/output formats
| Input | Output |
| --- | --- |
| `.pt` checkpoint | `.pt` checkpoint |
| `.pte` frozen model | `.pte` frozen model |

Bias modes
- `-s <data_dir>` or `-f <data_file>`: compute new bias from data via linear regression (`change-by-statistic` or `set-by-statistic`)
- `-b 0.1 3.2 ...`: set bias values directly

Implementation details
- `deepmd/pt_expt/entrypoints/main.py` — `change_bias()` function + CLI dispatch:
  - `.pt` input: `torch.load` → extract `model_params` from `_extra_state` → `get_model()` → `ModelWrapper.load_state_dict()` → apply bias → `torch.save()`
  - `.pte` input: `serialize_from_file()` → `BaseModel.deserialize()` → apply bias → `model.serialize()` → `deserialize_to_file()`
  - statistics path: `DeepmdDataSystem` + `make_stat_input` (numpy-based, from `deepmd.utils.model_stat`)
  - `numb_batch=0` (default) uses all available batches via `max(data.get_nbatches())`
- `deepmd/pt_expt/train/training.py` — `model_change_out_bias()` helper: calls `change_out_bias()`; no `DPModelCommon`/`compute_input_stats` check needed since pt_expt models inherit dpmodel's implementation directly

Usage
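The Usage block did not survive extraction; the following invocations are an illustration assembled from the flags described above (file paths are placeholders, not from the PR):

```shell
# Change bias from data statistics (checkpoint in, checkpoint out)
dp --pt-expt change-bias model.pt -s /path/to/data -o model_updated.pt

# Set bias from user-defined per-type values (energy models only)
dp --pt-expt change-bias model.pt -b 0.1 3.2 -o model_updated.pt

# Works on frozen models too (.pte in, .pte out)
dp --pt-expt change-bias frozen.pte -s /path/to/data -o frozen_updated.pte
```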
Test plan
- `python -m pytest source/tests/pt_expt/test_change_bias.py -v` — 4 end-to-end CLI tests:
  - `test_change_bias_with_data` — bias changes when using `-s` flag
  - `test_change_bias_with_data_sys_file` — bias changes when using `-f` flag
  - `test_change_bias_with_user_defined` — exact match with user-specified values
  - `test_change_bias_frozen_pte` — freeze → change-bias on `.pte` → verify bias changed
- `python -m pytest source/tests/consistent/model/test_ener.py -k test_change_out_bias -v` — cross-backend consistency passes

Summary by CodeRabbit
New Features
- `change-bias` command to adjust a model's output bias by supplying values or computing statistics from systems; supports checkpoint and frozen model formats while preserving model metadata.

Tests