test(pt_expt): add .pt2 (AOTInductor) unit tests and bug fixes#5334
wanghan-iapcm wants to merge 48 commits into deepmodeling:master from
Conversation
- add support for serialization to .pt2
- deep eval with .pt2
- C/C++ inference API with .pt2
- add UT for Python inference
- add UT for C++/C inference: se_e2_a, fparam/aparam, multi-frames, dpa1
Replace flat (nf*nall,) indexing in dpa1.py and exclude_mask.py with xp_take_along_axis and xp_take_first_n. The flat indexing creates Ne(nall, nloc) assertions during torch.export tracing that fail when nall == nloc (NoPbc). The new gather-based approach avoids these shape constraints. Unify PT/pt_expt test models so .pth and .pt2 are generated from the same dpmodel weights (type_one_side=True) via gen scripts. Remove deeppot_sea.pth, deeppot_dpa.pth, and fparam_aparam.pth from git tracking — they are now regenerated in CI via convert-models.sh.

Changes:
- dpa1.py, exclude_mask.py: xp_take_along_axis replaces flat indexing
- gen_dpa1.py: generates deeppot_dpa1.pth + .pt2 from dpmodel
- gen_fparam_aparam.py: generates fparam_aparam.pth + .pt2 from dpmodel
- convert-models.sh: regenerate .pth from yaml and run gen scripts in CI
- test_deeppot_dpa_ptexpt.cc: add NoPbc test fixture
- test_deeppot_dpa1_pt.cc: new DPA1 .pth test (PBC + NoPbc)
- test_deeppot_a_fparam_aparam_pt.cc: update reference values
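The shift from flat indexing to gathering can be illustrated with a small NumPy sketch. All sizes and arrays below are made-up stand-ins, not the real descriptor tensors, and `xp_take_along_axis` is assumed to behave like `np.take_along_axis`:

```python
import numpy as np

# Made-up sizes: nf frames, nall atoms, nnei neighbors per atom.
nf, nall, nnei = 2, 4, 3
ext = np.arange(nf * nall, dtype=np.float64).reshape(nf, nall)  # per-atom scalar
nlist = np.array([[[0, 1, 2], [1, 2, 3], [0, 0, 1], [3, 2, 1]],
                  [[1, 0, 3], [2, 2, 0], [3, 1, 0], [0, 3, 2]]])

# Flat-index style: collapse frames into an (nf*nall,) view and add per-frame
# offsets. During torch.export tracing, this is the pattern that introduces
# shape assertions relating nall and nloc.
offsets = np.arange(nf)[:, None, None] * nall + nlist
flat_out = ext.reshape(-1)[offsets.reshape(-1)].reshape(nf, nall, nnei)

# Gather style: take_along_axis keeps the frame axis intact, so tracing
# never needs to relate nall and nloc.
src = np.broadcast_to(ext[:, :, None], (nf, nall, nnei))
gather_out = np.take_along_axis(src, nlist, axis=1)
```

Both paths produce identical values; only the gather form traces without cross-dimension constraints when `nall == nloc`.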
…ckends

Fix a latent bug in all 5 C++ backends (DeepPotPT, DeepPotPTExpt, DeepPotJAX, DeepSpinPT, DeepPotPD) where the ghost-to-local mapping was incorrectly indexed through fwd_map when virtual atoms are present. The old code `mapping[ii] = lmp_list.mapping[fwd_map[ii]]` treats the post-filter index as an original index, causing undefined behavior when fwd_map contains -1 entries for filtered atoms. The fix uses bkw_map to correctly translate: new_idx -> orig_idx -> mapped_orig -> new_mapped.

Also fix a use-after-free in DeepPotPTExpt.cc where torch::from_blob referenced a local vector's data after the vector went out of scope.

Additionally fix dpmodel code (dpa2.py, repformers.py, make_model.py, nlist.py) to use xp_take_first_n and xp_expand_dims instead of slice indexing that creates shape constraints incompatible with torch.export.

Add DPA2 C++ test files for both .pth and .pt2 backends with PBC, NoPbc, and type_sel test cases. Reduce DPA2 test model size for machine-precision agreement between jit and export paths.
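The corrected index translation can be sketched in plain Python (a hypothetical `remap_ghosts` helper with toy maps; the real code lives in the C++ backends):

```python
def remap_ghosts(orig_mapping, fwd_map, bkw_map):
    """Translate a ghost->owner mapping into post-filter index space.

    fwd_map[orig] -> new index after filtering virtual atoms (-1 if dropped)
    bkw_map[new]  -> original index
    orig_mapping[orig] -> owning atom, in original index space
    """
    new_mapping = []
    for new_idx in range(len(bkw_map)):
        orig_idx = bkw_map[new_idx]               # new -> orig
        mapped_orig = orig_mapping[orig_idx]      # orig ghost -> orig owner
        new_mapping.append(fwd_map[mapped_orig])  # orig owner -> new
    return new_mapping

# Toy system: 4 original atoms, atom 1 is virtual and filtered out.
fwd_map = [0, -1, 1, 2]
bkw_map = [0, 2, 3]
orig_mapping = [0, 1, 0, 2]   # e.g. ghost atom 3 is owned by atom 2
```

For this toy input the correct result is `[0, 0, 1]`; the old pattern `orig_mapping[fwd_map[ii]]` would feed a *new* index into a map keyed by *original* indices and read index -1 for filtered entries.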
Fix torch.export-incompatible slicing in dpmodel DPA3/repflows code ([:, :nloc] → xp_take_first_n, reshape with -1 → expand_dims), add gen_dpa3.py model generation script, and create C++ test files for both .pth and .pt2 backends with PBC and NoPbc fixtures.
Restore deeppot_sea.pth and deeppot_dpa.pth to git tracking — CI cannot regenerate them because tabulate_fusion_se_t_tebd custom op is not available in the Python-only CI environment. Remove the dp convert-backend .pth lines from convert-models.sh. Add try/except around .pth export in gen_dpa1.py and gen_fparam_aparam.py so the scripts don't fail when custom ops are unavailable — only .pt2 generation is required.
make_fx tracing through torch.autograd.grad fails on CUDA with "opt_ready_stream && opt_parent_stream INTERNAL ASSERT FAILED". Fix by running make_fx on CPU (the graph is device-agnostic), then moving sample inputs to the target device before torch.export.export and AOTInductor compilation so the compiled package targets the correct device (GPU when available).
make_fx with _allow_non_fake_inputs=True on CUDA triggers autograd stream assertion (opt_ready_stream && opt_parent_stream). Fix by: 1. Tracing via make_fx on CPU (decomposes autograd.grad into aten ops) 2. Extracting output keys from CPU-traced module 3. Moving traced module + inputs to target device (CUDA if available) 4. Running torch.export.export on target device (the traced graph has no autograd.grad calls, so no CUDA stream issue) This ensures AOTInductor compiles for the correct device.
The -fsanitize=leak flag produces sanitized .so files that fail to dlopen in Python processes lacking the LSAN runtime. Preload liblsan.so via LD_PRELOAD when CXXFLAGS contains the leak sanitizer. Also use os.path.realpath in gen scripts for robust path resolution and print diagnostic on library loading failure instead of silently passing.
Replace manual traced-module-to-device approach with the official torch.export.passes.move_to_device_pass API. This properly rewrites device='cpu' constants baked into the FX graph by make_fx, avoiding FakeTensor device propagation errors on CUDA (pytorch/pytorch#121761).
- C++ JSON/ZIP parser: add EOF checks, bounds validation, stod exception handling, int64_t loop var to prevent unsigned underflow
- C++ from_blob: add .clone() to all calls to prevent use-after-free
- C++ translate_error: catch std::exception instead of std::runtime_error
- Python deep_eval/serialization: validate ZIP entries before reading
- gen_*.py: move custom op loading after deepmd imports to avoid double registration; add inductor compiler fallbacks for CI; remove unused subprocess imports
LSAN runtime inherited by g++ subprocesses causes false leak reports and non-zero exit codes, making torch._inductor think no C++ compiler works. Clear LD_PRELOAD after custom ops are loaded since it's no longer needed.
Add has_default_fparam support to DeepPotPTExpt:
- Embed has_default_fparam in model_def_script.json metadata during freeze (_collect_metadata)
- Read it in C++ init from metadata, with fallback to false for older .pt2 files
- Add C++ tests for both has_default_fparam==false and ==true
- Generate fparam_aparam_default.pt2 test model with default_fparam set
- Guard EOCD scan against short/corrupt .pt2 files (P1)
- Parse ZIP64 fields using full 64-bit widths (P2)
- Match ZIP entry names by suffix for archives with dir prefixes
- Reject unsupported in-memory file_content in DeepPotPTExpt::init
- Replace debug-only assert with runtime exception for output count
- Fix virtual atom coordinate copy: nvir -> nvir*3 in 3 test files
- Tighten DPA2 float EPSILON from 1e-1 to 1e-4, add TODO for ghost
- Narrow exception handling in gen_dpa3.py custom op loading
- Add parity assertions (not just prints) in gen_dpa3.py/.py exports
- Guard .pth parity check on successful export in gen_fparam_aparam.py
- Extend parity check to virial and atomic outputs
- Update deep_eval.py comment on private torch API
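The EOCD guard can be mirrored in a short Python sketch (a toy parser, not the actual C++ implementation): scan backwards for the end-of-central-directory signature within its maximum legal distance from EOF, and refuse short or corrupt inputs.

```python
import io
import zipfile

EOCD_SIG = b"PK\x05\x06"
EOCD_MIN = 22          # fixed EOCD size; a trailing comment may add up to 65535 bytes

def find_eocd(data: bytes) -> int:
    """Return the EOCD offset, or -1 for short/corrupt archives."""
    if len(data) < EOCD_MIN:
        return -1                     # too short to contain an EOCD at all
    start = max(0, len(data) - EOCD_MIN - 0xFFFF)
    pos = data.rfind(EOCD_SIG, start)
    if pos < 0 or pos + EOCD_MIN > len(data):
        return -1                     # signature missing or record truncated
    return pos

# Build a tiny valid .pt2-like ZIP in memory for the demo.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("model_def_script.json", "{}")
archive = buf.getvalue()
```

An unbounded backward scan without the length checks is exactly what fails on truncated files.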
The gen_*.py scripts import torch, which has internal pybind/JIT allocations that LSAN reports as leaks. These are not our leaks. Previously, LD_PRELOAD=liblsan.so was set for the gen scripts to handle dlopen of the sanitizer-built custom op .so, but this causes LSAN to track all Python/torch allocations and fail on exit. Fix: run gen scripts without LD_PRELOAD. The gen scripts don't need LSAN — only the C++ test binary (ctest) does, and ctest already inherits LSAN via the -fsanitize=leak compile flag. Also revert the DPA2 cpu_lmp_nlist test (needs further investigation).
- Remove unused variable nloc in gen_fparam_aparam.py
- Cast int multiplications to size_t before use as iterator offsets in DeepPotPTExpt multi-frame compute to prevent overflow
The gen scripts need LD_PRELOAD=liblsan.so to dlopen the custom op .so built with -fsanitize=leak. But LSAN leak detection must be disabled (detect_leaks=0) to avoid false reports from torch/paddle internal allocations.
The AOTInductor model_package_loader.h header may not exist on all platforms (e.g. macOS x86_64 torch builds). Use __has_include to detect it and define BUILD_PT_EXPT accordingly, so DeepPotPTExpt compiles as an empty TU when the header is missing.
The _load_custom_ops() call was placed before deepmd.pt was imported, so the guard `hasattr(torch.ops.deepmd, "border_op")` didn't see the op registered by the standard install path. This caused the build directory's .so to be loaded, and then when deepmd.pt was imported later for .pth export, it loaded the same op again — crash. Fix: move _load_custom_ops() to after the deepmd.pt import so the guard correctly detects the already-registered op and skips.
The gen scripts import deepmd.pt which loads custom ops from deepmd/lib/. Without the .so installed there, the custom op (tabulate_fusion_se_t_tebd) is unavailable and .pth export fails for DPA2/DPA3 models, causing C++ tests to fail on missing files. Fix: copy libdeepmd_op_pt.so from the build directory to the deepmd package's lib/ directory after cmake --build, before running gen scripts.
The custom op .so (libdeepmd_op_pt.so) has undefined symbols for compute kernels (tabulate_fusion_se_t_tebd_cpu etc.) that live in libdeepmd.so. Without dp_test/lib/ in LD_LIBRARY_PATH, dlopen succeeds (lazy binding) but torch.jit.script fails when it tries to resolve the symbols.
The script was copying libdeepmd_op_pt.so to os.path.dirname(deepmd.__file__)+"/lib" which resolves to the workspace source tree, but cxx_op.py loads from SHARED_LIB_DIR (deepmd.lib.__path__) which resolves to site-packages. For editable installs these diverge, so the custom ops were never loaded and ENABLE_CUSTOMIZED_OP was False. This caused 10 C++ test failures: - 8 DPA2 .pth tests: tabulate_fusion_se_t_tebd has no Python fallback - 2 DPA3 .pth NoPbc tests: border_op fallback raises NotImplementedError
…_local.sh Move custom op .so install, LSAN setup, and gen script invocations into a single ENABLE_PYTORCH guard block. Update comment to reference SHARED_LIB_DIR instead of deepmd/lib/.
- common.h fold_back: use ptrdiff_t for loop variables to prevent int multiplication overflow before widening to difference_type
- serialization.py: unify import style for deepmd.pt_expt.utils.env (use `import ... as _env` consistently, remove `from ... import DEVICE`)
Without the custom op .so loaded, torch.jit.script fails on se_t_tebd with "object has no attribute tabulate_fusion_se_t_tebd". Add the same NotImplementedError fallback pattern used by tabulate_fusion_se_t, tabulate_fusion_se_a, tabulate_fusion_se_atten, and border_op.
The previous fix only changed loop variables but nall*ndim and nloc*ndim were still computed as int*int before widening. Cast all parameters to ptrdiff_t at function entry.
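The overflow being fixed is easy to reproduce: in C++, `int * int` is evaluated in 32 bits even when the result is assigned to a wider type. A Python emulation of 32-bit multiplication (a toy helper, not project code) shows the wraparound:

```python
def int32_mul(a: int, b: int) -> int:
    """Emulate C++ int*int: wrap the product to signed 32-bit."""
    r = (a * b) & 0xFFFFFFFF
    return r - (1 << 32) if r >= (1 << 31) else r

nall, ndim = 100000, 30000          # nall * ndim = 3e9, above INT_MAX
wrapped = int32_mul(nall, ndim)     # what the un-widened C++ expression yields
widened = nall * ndim               # what ptrdiff_t-first multiplication yields
```

A negative `wrapped` offset fed into iterator arithmetic is undefined behavior, which is why the cast must happen before the multiply, not after.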
💡 Codex Review — Reviewed commit: 50507c6af7
Codecov Report ❌ — patch coverage, additional details, and impacted files:
@@ Coverage Diff @@
## master #5334 +/- ##
==========================================
- Coverage 82.28% 82.25% -0.03%
==========================================
Files 797 797
Lines 82100 82323 +223
Branches 4003 4020 +17
==========================================
+ Hits 67557 67717 +160
- Misses 13336 13393 +57
- Partials 1207 1213 +6
- Store default_fparam values in .pt2 metadata during serialization
- Python inference: when fparam is omitted and model has default, substitute default values from metadata instead of passing None
- C++ inference: read default_fparam from metadata and use when fparam is empty (without fix: segfault from shape mismatch)
- NoPBC: shift coordinates so minimum is at rcut, ensuring atoms with negative coords are inside [0, L) for copy_coord/build_nlist
- Add C++ tests: eval_default_vs_explicit (internal nlist) and eval_default_vs_explicit_lmp (external nlist)
- Add Python tests: test_pt2_eval_default_fparam and test_freeze_pt2_nopbc_negative_coords
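The fallback order can be sketched as follows (a hypothetical `resolve_fparam` helper and metadata keys that illustrate the behavior described above, not the actual deepmd API):

```python
def resolve_fparam(fparam, metadata, nframes):
    """Pick frame parameters: explicit value > stored default > error."""
    dim = metadata.get("dim_fparam", 0)
    if dim == 0:
        return None                                  # model takes no fparam
    if fparam is not None:
        return [list(frame) for frame in fparam]     # caller-supplied wins
    default = metadata.get("default_fparam")
    if default is None:
        raise ValueError("fparam required: none given and no default stored")
    return [list(default) for _ in range(nframes)]   # broadcast to all frames

meta = {"dim_fparam": 1, "default_fparam": [0.5]}
```

Raising instead of silently passing an empty tensor is the key point: a zero-length fparam with `dim_fparam > 0` is the shape mismatch that crashed the compiled model.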
Resolve conflict in serialization.py: keep .pt2 dispatch in serialize_from_file and add model_params.json support from upstream (finetune --use-pretrain-script) to both .pte and .pt2 paths.
- NoPbc cpu_build_nlist_atomic: verify atomic energy/virial for NoPBC
- PBC cpu_build_nlist_nframes: 2-frame PBC via compute_mixed_type
- NoPbc cpu_build_nlist_nframes: 2-frame NoPBC via compute_mixed_type

These cover previously untested code paths in DeepPotPTExpt::compute (NoPBC internal nlist) and compute_nframes (multi-frame wrapper).
- test_finetune_from_pt2: train → freeze to .pt2 → finetune from .pt2 → verify inherited weights match pretrained
- test_finetune_from_pt2_use_pretrain_script: train → freeze to .pt2 → finetune with --use-pretrain-script → verify config updated
…Expt

- Parser error tests: nonexistent file, invalid ZIP, truncated file
- Metadata accessor tests: type_map, cutoff, ntypes, dim_fparam/aparam
- JSON type coverage: integers, booleans, strings, float, arrays via SE_A model (no fparam) and fparam_aparam model
- Default fparam JSON parsing: boolean true + float array via fparam_aparam_default model
Resolve conflicts: keep default_fparam logic, NoPBC coord shift, and new tests from our branch on top of upstream's merged feat-pt-expt-c.
- test_change_bias_frozen_pt2: change-bias with data on .pt2 model
- test_change_bias_frozen_pt2_user_defined: user-defined bias on .pt2
- test_change_bias_pt2_pte_consistency: .pte and .pt2 produce same bias
💡 Codex Review — Reviewed commit: 1a45b88af9
📝 Walkthrough

Adds exporter-stored model `default_fparam`. Changes
Sequence diagram:

```mermaid
sequenceDiagram
    participant Exporter as Exporter\n(Py)
    participant Metadata as Metadata\n(JSON)
    participant InferPy as Infer\n(Py)
    participant Runtime as Runtime\n(C++)
    participant Runner as Runner\n(Model)
    Exporter->>Metadata: write model + "default_fparam"
    InferPy->>Metadata: load model + metadata
    alt caller omits fparam and dim_fparam>0
        InferPy->>InferPy: build fparam_t from metadata["default_fparam"]\n(dtype=float64, broadcast to nframes)
    end
    InferPy->>Runtime: call compute(fparam_t)
    alt fparam empty in C++ compute
        Runtime->>Runtime: construct fparam tensor from default_fparam_\n(device float64, broadcast)
    end
    Runtime->>Runner: invoke model runner with fparam tensor
    Runner-->>Runtime: energies, forces, virials
    Runtime-->>InferPy: return results
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
deepmd/pt_expt/utils/serialization.py (1)
188-199: ⚠️ Potential issue | 🔴 Critical — Convert `default_fparam` tensor to JSON-serializable format before storage.
`model.get_default_fparam()` returns a `torch.Tensor | None`, but the metadata dict is serialized via `json.dumps()`, which cannot serialize tensors. When a model with a default fparam is frozen, this will raise `TypeError: Object of type Tensor is not JSON serializable`. Convert the tensor to a list or `None`. Suggested fix:

```diff
+    default_fparam = model.get_default_fparam()
     return {
         "type_map": model.get_type_map(),
         "rcut": model.get_rcut(),
         "sel": model.get_sel(),
         "model_output_type": model.model_output_type(),
         "dim_fparam": model.get_dim_fparam(),
         "dim_aparam": model.get_dim_aparam(),
         "mixed_types": model.mixed_types(),
         "sel_type": model.get_sel_type(),
         "has_default_fparam": model.has_default_fparam(),
-        "default_fparam": model.get_default_fparam(),
+        "default_fparam": (
+            None
+            if default_fparam is None
+            else default_fparam.detach().cpu().tolist()
+        ),
         "fitting_output_defs": fitting_output_defs,
     }
```
🧹 Nitpick comments (1)
source/api_cc/tests/test_deeppot_ptexpt.cc (1)
656-669: Drop `TYPED_TEST` from the metadata-only suites.

`TypeParam` is unused in these fixtures and assertions, so the same `.pt2` files get loaded once per `ValueTypes` entry without increasing coverage. Converting these to `TEST_F` would keep the checks and reduce duplicate runtime/noise.

Also applies to: 699-712, 735-748
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@source/api_cc/src/DeepPotPTExpt.cc`:
- Around line 828-836: The code currently creates a zero-length fparam_tensor
when default_fparam_ is empty which breaks the AOTInductor model if dfparam > 0;
modify the fparam_tensor construction (where default_fparam_ and fparam_tensor
are handled) to check dfparam: if dfparam > 0 create a zeros tensor shaped {1,
dfparam} with the same options and .to(device), otherwise keep the zero-length
zeros({0}) behavior; apply the identical change to both places where
fparam_tensor is built (the two occurrences around the original diff and the
later one referenced in the comment) and ensure this aligns with how run_model()
conditions inclusion of fparam in the input list.
In `@source/api_cc/tests/test_deeppot_ptexpt.cc`:
- Around line 631-638: The test writes a temporary file then calls
deepmd::DeepPot::init(tmpfile) but doesn't assert the write succeeded, so
failures to create the file can mask invalid-ZIP handling; modify the test to
(1) use a unique temp path (e.g., include PID/timestamp or tmpnam-style string)
for tmpfile, (2) after creating std::ofstream ofs(tmpfile, std::ios::binary)
assert ofs.is_open() and that the write succeeded (e.g., check ofs.good() or
verify bytes written) before closing the scope, and then call dp.init(tmpfile)
and remove the file; apply the same change to the other similar block around the
dp.init call.
- Around line 541-578: The test cpu_build_nlist_nframes is appending an
identical second frame so it won't catch frame-specific bugs; change the
second-frame inputs (coord_2f, atype_2f, and box_2f for PBC case) to be a
perturbed copy (e.g., shift one atom coordinate and adjust box if needed), then
after calling dp.compute_mixed_type(...) verify each frame by calling
dp.compute(...) on the single modified frame and comparing that single-frame
energy/force/virial to the corresponding slice from the batched outputs; apply
the same change to the other batched test referenced (the block around 581-615)
and use symbol names dp.compute_mixed_type and dp.compute to locate the code to
modify.
In `@source/tests/pt_expt/model/test_model_compression.py`:
- Around line 340-343: The test unpacks e_pt2, f_pt2, v_pt2 from dp.eval(coord,
box, atype) but only asserts energy, leaving f_pt2 and v_pt2 unused (Ruff error
and missing derivative checks); either add assertions comparing f_pt2 and v_pt2
to ret_frozen["forces"] and ret_frozen["virial"] (using
np.testing.assert_allclose with appropriate atol and err_msg) or, if you intend
energy-only coverage, change the unpack to e_pt2, _, _ = dp.eval(coord, box,
atype) to avoid unused variables; update the lines around dp.eval and the
existing energy assertion accordingly.
In `@source/tests/pt_expt/test_dp_freeze.py`:
- Around line 124-126: The test calls dp.eval(coord, box, atype) and assigns (e,
f, v) but never uses v, which triggers RUF059 and omits the virial shape check;
fix by either asserting the virial shape (e.g., use self.assertEqual(v.shape,
(1, 1)) or the correct expected shape) or, if you intentionally don't check it,
rename v to _v to signal it's unused; update the lines around dp.eval and the
subsequent self.assertEqual calls accordingly and run ruff check . and ruff
format . before committing.
In `@source/tests/pt_expt/test_finetune.py`:
- Around line 895-935: The test is reloading the finetuned checkpoint as the
"pretrained" baseline because ckpt_path (created by _train_pretrained) points to
the same model.ckpt.pt that the finetune run overwrites; before calling
main([...,"--finetune", pt2_path,...]) capture and cache the pretrained state
returned by torch.load(ckpt_path, map_location=DEVICE, weights_only=True) into
pre_model_state (or save it to a uniquely named file) and then run main so that
_assert_inherited_weights_match(ft_model_state, pre_model_state,
random_fitting=False) compares the original pretrained weights to the finetuned
output instead of reloading the overwritten file; alternatively, change the
finetune output path (ft_ckpt) to a different filename so ckpt_path remains
untouched.
📒 Files selected for processing (9)
- deepmd/pt_expt/infer/deep_eval.py
- deepmd/pt_expt/utils/serialization.py
- source/api_cc/include/DeepPotPTExpt.h
- source/api_cc/src/DeepPotPTExpt.cc
- source/api_cc/tests/test_deeppot_a_fparam_aparam_ptexpt.cc
- source/api_cc/tests/test_deeppot_ptexpt.cc
- source/tests/pt_expt/model/test_model_compression.py
- source/tests/pt_expt/test_dp_freeze.py
- source/tests/pt_expt/test_finetune.py
- deep_eval.py: add .contiguous() on expanded default fparam; raise ValueError when no default available instead of silent zero injection
- test_deeppot_ptexpt.cc: use distinct coord_alt for frame 1 in multi-frame tests; assert temp file creation in parser error tests
- test_model_compression.py: assert force/virial on .pt2 compress path
- test_dp_freeze.py: assert virial shape in .pt2 smoke test
- test_finetune.py: copy pretrained checkpoint before finetune overwrites
…fault
Consistent with the Python side: throw deepmd_exception instead of
passing a shape-mismatched {0} tensor that would crash in AOTInductor.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@source/tests/pt_expt/test_change_bias.py`:
- Around line 311-315: The test only asserts the first two entries of the
deserialized bias (using updated_bias.flatten()[:2]) which can miss
extra/mis-shaped values; replace that partial check with a full-array comparison
by reshaping the expected user bias to match updated_bias.shape and using
np.testing.assert_allclose(updated_bias, expected_bias) so the entire bias
tensor from BaseModel.get_out_bias (after serialize_from_file and
BaseModel.deserialize) is validated.
In `@source/tests/pt_expt/test_dp_freeze.py`:
- Around line 159-191: The test test_freeze_pt2_nopbc_negative_coords currently
only compares pt2 vs pte (dp_pt2.eval vs dp_pte.eval) so it won't detect
regressions in the shared frontend; instead create an independent oracle by
manually shifting the negative coord array into the [0,L) range and using that
shifted_coord as the reference for energy/force/virial, then compare
dp_pt2.eval(coord, None, atype) against the oracle results; concretely, compute
an offset = -coord.min() (per axis or global as appropriate) to produce
shifted_coord = coord + offset, call DeepPot.eval on shifted_coord (or another
trusted reference evaluator) to produce e_ref,f_ref,v_ref, and assert
np.testing.assert_allclose(e_pt2, e_ref, ...), etc.; update the
test_freeze_pt2_nopbc_negative_coords function and keep calls to freeze(...) and
dp_pt2 so the C++ NoPBC fake-box path (DeepPot.eval with box=None) is exercised
while using the shifted_coord oracle for independent validation.
📒 Files selected for processing (6)
- deepmd/pt_expt/infer/deep_eval.py
- source/api_cc/tests/test_deeppot_ptexpt.cc
- source/tests/pt_expt/model/test_model_compression.py
- source/tests/pt_expt/test_change_bias.py
- source/tests/pt_expt/test_dp_freeze.py
- source/tests/pt_expt/test_finetune.py
🚧 Files skipped from review as they are similar to previous changes (3)
- source/tests/pt_expt/model/test_model_compression.py
- source/api_cc/tests/test_deeppot_ptexpt.cc
- source/tests/pt_expt/test_finetune.py
- test_change_bias: assert full bias tensor shape, not just first 2 values
- test_dp_freeze: clarify NoPBC test compares independent implementations
Actionable comments posted: 1
🧹 Nitpick comments (1)
source/tests/pt_expt/test_dp_freeze.py (1)
237-244: Add a non-default `fparam` control so this test cannot pass vacuously.

Right now this only proves omitted == explicit default. Because the checkpoint is synthetic, the model could become effectively insensitive to `fparam`, and this would still pass even if the fallback plumbing stopped being exercised. One extra non-default call plus a "not all equal" assertion would make the regression meaningful.

🧪 Possible hardening:

```diff
 # Eval WITHOUT fparam — model should use default (0.5)
 e_no, f_no, v_no = dp.eval(coord, box, atype)
 # Eval WITH explicit default value
 e_ex, f_ex, v_ex = dp.eval(coord, box, atype, fparam=[0.5])
+e_alt, f_alt, v_alt = dp.eval(coord, box, atype, fparam=[0.0])
+
+self.assertFalse(
+    np.allclose(e_ex, e_alt, rtol=0, atol=1e-12)
+    and np.allclose(f_ex, f_alt, rtol=0, atol=1e-12)
+    and np.allclose(v_ex, v_alt, rtol=0, atol=1e-12)
+)
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@source/tests/pt_expt/test_dp_freeze.py`:
- Around line 152-157: The parity asserts between outputs of dp_pte.eval and
dp_pt2.eval (variables e_pte, f_pte, v_pte vs e_pt2, f_pt2, v_pt2) use
np.testing.assert_allclose with implicit rtol and may allow NaN-equality; change
each comparison to explicitly pass rtol=0 and keep atol=1e-10, and before
asserting closeness add finiteness checks (e.g. assert np.all(np.isfinite(...))
for both sides of each pair) so NaNs/Infs fail fast; make the same changes for
the other two analogous blocks that compare dp_pte vs dp_pt2 outputs.
📒 Files selected for processing (2)
- source/tests/pt_expt/test_change_bias.py
- source/tests/pt_expt/test_dp_freeze.py
🚧 Files skipped from review as they are similar to previous changes (1)
- source/tests/pt_expt/test_change_bias.py
get_out_bias() returns shape (n_out, ntypes, max_out_size) but the test expected (ntypes, max_out_size). Use .reshape(updated_bias.shape) to match the actual shape, consistent with the .pt test pattern.
Add np.isfinite assertions before allclose comparisons to catch NaN regressions that would otherwise pass if both paths produce matching NaNs. Set rtol=0 so only absolute tolerance is used.
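The hardened comparison pattern looks roughly like this (a generic helper written under the assumptions above, not the test file's literal code):

```python
import numpy as np

def assert_parity(a, b, atol=1e-10):
    """Finiteness first, then absolute-tolerance-only closeness."""
    a, b = np.asarray(a, dtype=np.float64), np.asarray(b, dtype=np.float64)
    # assert_allclose treats NaN == NaN as a pass by default, so matching
    # NaNs on both sides would slip through without this explicit check.
    assert np.all(np.isfinite(a)) and np.all(np.isfinite(b))
    np.testing.assert_allclose(a, b, rtol=0, atol=atol)
```

With `rtol=0`, the allowed error no longer scales with the magnitude of the reference values, which keeps the tolerance meaningful for near-zero forces and virials.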
Summary
Comprehensive .pt2 test coverage for the pt_expt backend, plus bug fixes discovered during testing.
Bug fixes
- When `dim_fparam > 0` and fparam is omitted, the C++ and Python inference paths now substitute default values from metadata instead of passing a shape-mismatched tensor (which caused a segfault in AOTInductor)
- NoPBC: shift coordinates so atoms with negative values land inside [0, L) for `copy_coord`/`build_nlist`
- Use `ptrdiff_t` for all operands in `fold_back` iterator arithmetic
- Add a `tabulate_fusion_se_t_tebd` Python fallback stub (all other tabulate ops had one)

New Python tests

- `.pt2` freeze; `.pte` vs `.pt2` eval consistency; NoPBC with negative coords
- `.pt2` output → DeepPot eval; `min_nbor_dist` roundtrip
- train → freeze to `.pt2` → finetune from `.pt2`, with `--use-pretrain-script`
- change-bias on `.pt2` with data, user-defined values, `.pte`/`.pt2` consistency

New C++ tests

- Multi-frame evaluation via `compute_mixed_type`

Refactoring

- `gen_common.py` (eliminates duplication across 5 scripts)
- `CXX` env var instead of a hardcoded compiler fallback list
- Custom op `.so` install in `test_cc_local.sh`

Test plan