feat(train): support opt-in $ref loading by njzjz-bot · Pull Request #5264 · deepmodeling/deepmd-kit

njzjz-bot · 2026-02-24T19:15:53Z

Summary

Enable dargs $ref loading for training-input validation with explicit opt-in.

What changed

Add CLI option --allow-ref to dp train
Thread allow_ref through PT/PD train entrypoints
Extend deepmd.utils.argcheck.normalize(..., allow_ref=False)
Keep default behavior unchanged (disabled by default)
Bump minimal dependency to dargs>=0.5.0
Document option in doc/train/training-advanced.md
Add docstrings for updated parameters

Security

$ref loading remains disabled by default and must be explicitly enabled.
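To make the opt-in behavior concrete, here is a minimal stand-alone sketch of a gated reference expander. It is only an illustration under stated assumptions, not the dargs implementation: `expand_refs` and its `base_dir` parameter are hypothetical names, only JSON files are handled, and keys that sit next to `$ref` in the same mapping are simply dropped.

```python
import json
from pathlib import Path
from typing import Any


def expand_refs(value: Any, base_dir: Path, allow_ref: bool = False) -> Any:
    """Recursively replace {"$ref": "file.json"} nodes with the file content.

    Hypothetical helper that only illustrates the opt-in gate.
    """
    if isinstance(value, dict):
        if "$ref" in value:
            if not allow_ref:
                # Secure default: refuse to read external files unless asked to.
                raise ValueError(
                    "$ref found but reference loading is disabled; "
                    "pass allow_ref=True to enable it"
                )
            loaded = json.loads((base_dir / value["$ref"]).read_text())
            # The loaded snippet may itself contain further references.
            return expand_refs(loaded, base_dir, allow_ref)
        return {k: expand_refs(v, base_dir, allow_ref) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_refs(v, base_dir, allow_ref) for v in value]
    return value
```

With the default `allow_ref=False`, an input without any `$ref` passes through unchanged, while an input containing `$ref` is rejected instead of silently reading a file from disk.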

Authored by OpenClaw (model: gpt-5.3-codex)

Summary by CodeRabbit

Release Notes

New Features
- Added --allow-ref CLI flag to the training command, enabling loading of external JSON/YAML configuration snippets via references (disabled by default for security)
Tests
- Added test to verify --allow-ref flag parsing and default behavior
Documentation
- Updated training documentation to describe the new --allow-ref option
Chores
- Updated package dependency constraint

- add --allow-ref to dp train
- thread allow_ref through PT/PD train entrypoints and argcheck
- keep disabled by default for security
- bump minimum dargs to >=0.5.0
- document usage in README

Authored by OpenClaw (model: gpt-5.3-codex)

coderabbitai · 2026-02-24T19:22:49Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


Walkthrough

A new CLI flag --allow-ref enables loading external JSON/YAML snippets via $ref references during training configuration normalization, disabled by default for security. The flag is propagated from the CLI through train entry points to the normalization utility, and the dargs dependency is updated to version >=0.5.0.
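The propagation described in the walkthrough can be sketched roughly as follows. This is a simplified stand-alone illustration, not the deepmd-kit code: `build_parser`, `train`, `normalize`, and `main` are stand-ins written for this example, and the stub `normalize` only returns the flag it received so the threading is visible.

```python
import argparse
from typing import Any


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser with a train subcommand and an opt-in --allow-ref flag."""
    parser = argparse.ArgumentParser(prog="dp")
    subparsers = parser.add_subparsers(dest="command")
    train_parser = subparsers.add_parser("train")
    train_parser.add_argument("INPUT")
    train_parser.add_argument(
        "--allow-ref",
        action="store_true",  # absent -> False, present -> True
        default=False,
        help="Allow loading external snippets via $ref (off by default).",
    )
    return parser


def normalize(data: dict[str, Any], allow_ref: bool = False) -> tuple[dict[str, Any], bool]:
    """Stub for the normalization step; it just reports the flag it was given."""
    return data, allow_ref


def train(input_data: dict[str, Any], allow_ref: bool = False) -> tuple[dict[str, Any], bool]:
    """Stand-in train entry point that forwards allow_ref to normalize."""
    return normalize(input_data, allow_ref=allow_ref)


def main(argv: list[str], input_data: dict[str, Any]) -> tuple[dict[str, Any], bool]:
    """Parse the CLI arguments and hand the flag to train."""
    flags = build_parser().parse_args(argv)
    return train(input_data, allow_ref=flags.allow_ref)
```

Because every layer defaults to `allow_ref=False`, a caller that never mentions the flag keeps the previous behavior.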

Changes

Cohort / File(s)	Summary
CLI Flag Definition `deepmd/main.py`	Added `--allow-ref` flag to the train subcommand to enable external reference loading.
Training Entry Points `deepmd/pd/entrypoints/main.py`, `deepmd/pt/entrypoints/main.py`, `deepmd/tf/entrypoints/train.py`	Added `allow_ref: bool = False` parameter to train function signatures and propagated the flag to normalize calls via `allow_ref=FLAGS.allow_ref` in main entry paths and `allow_ref=allow_ref` in normalize calls.
Normalization Utility `deepmd/utils/argcheck.py`	Extended normalize function signature to accept `allow_ref: bool = False` and propagated the parameter through internal calls to `base.normalize_value()` and `base.check_value()`.
Dependency Update `pyproject.toml`	Updated dargs dependency constraint from >=0.4.7 to >=0.5.0 to support external references.
Documentation `doc/train/training-advanced.md`	Added description of new `--allow-ref` CLI option and its security implications.
Test Coverage `source/tests/common/test_argument_parser.py`	Replaced removed test with new test_parser_train_allow_ref to validate parsing of `--allow-ref` flag and its default false value.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	Docstring coverage is 81.82% which is sufficient. The required threshold is 80.00%.
Title check	✅ Passed	The title 'feat(train): support opt-in $ref loading' accurately and concisely summarizes the main change: adding opt-in support for $ref loading in training with secure-by-default behavior.


- document --allow-ref in training docs
- add allow_ref docstrings in normalize/train APIs

Authored by OpenClaw (model: gpt-5.3-codex)

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

deepmd/pt/entrypoints/main.py (1)
268-338: ⚠️ Potential issue | 🟡 Minor

Run ruff check . and ruff format . before commit (CI requirement).

Per repo guidelines for Python changes, please ensure these are run to avoid CI failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt/entrypoints/main.py` around lines 268 - 338, The CI failure is due
to missing ruff linting/formatting; run ruff check . to see reported issues and
ruff format . to apply fixes before committing changes in the train function
(deepmd/pt/entrypoints/main.py) and surrounding code; ensure you re-run ruff
check . and rerun tests, then amend the commit so the pull request passes CI.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/utils/argcheck.py`:
- Around line 3824-3840: The CI requires running the repo's linter/formatter:
before committing changes to deepmd/utils/argcheck.py (e.g., edits near the
normalize(...) function), run "ruff check ." to catch lint errors and then "ruff
format ." (or the configured formatter) to apply formatting fixes; re-run the
checks until they pass and then update the commit so CI will succeed.


ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea1e8d4 and 82ef1bb.

📒 Files selected for processing (4)

deepmd/pd/entrypoints/main.py
deepmd/pt/entrypoints/main.py
deepmd/utils/argcheck.py
doc/train/training-advanced.md

🚧 Files skipped from review as they are similar to previous changes (1)

deepmd/pd/entrypoints/main.py

deepmd/utils/argcheck.py

Copilot

Pull request overview

Adds an opt-in switch to allow dargs $ref expansion during training input normalization/validation, keeping the default behavior secure-by-default (disabled).

Changes:

Add dp train --allow-ref flag and thread it into PT/PD training normalization.
Extend deepmd.utils.argcheck.normalize(..., allow_ref=False) and bump dargs minimum to >=0.5.0.
Document the new option in training docs and README.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
pyproject.toml	Bumps runtime dependency to `dargs >= 0.5.0` to support `$ref` handling.
doc/train/training-advanced.md	Documents `--allow-ref` in advanced training options.
deepmd/utils/argcheck.py	Adds `allow_ref` parameter and forwards it to `dargs` normalization/validation.
deepmd/pt/entrypoints/main.py	Plumbs `allow_ref` through PyTorch training entrypoint to `normalize(...)`.
deepmd/pd/entrypoints/main.py	Plumbs `allow_ref` through Paddle training entrypoint to `normalize(...)`.
deepmd/main.py	Adds `--allow-ref` to the `dp train` CLI parser.
README.md	Documents `$ref` support and the opt-in mechanism.


README.md

deepmd/utils/argcheck.py

deepmd/main.py

deepmd/pt/entrypoints/main.py

deepmd/pd/entrypoints/main.py

deepmd/main.py

doc/train/training-advanced.md

- verify opt-in flag is parsed
- verify default remains disabled

Authored by OpenClaw (model: gpt-5.3-codex)

njzjz-bot · 2026-02-24T20:03:38Z

Follow-up updates pushed:

- added parser test coverage for and default-off behavior ()
- kept implementation backend-adapted: PT/PD train entrypoints consume the new flag, TF path remains unchanged by design

I also removed README-level feature note and kept docs in as requested.

njzjz-bot · 2026-02-24T20:04:33Z

Follow-up (clean summary): addressed backend-adaptation and tests.

Backend adaptation: PT/PD train entrypoints both plumb allow_ref; TF path intentionally unchanged in this PR scope.
Added parser test for dp train --allow-ref and default-off behavior in source/tests/common/test_argument_parser.py.
Kept docs in doc/ (no README feature note).

Local syntax check passed (python3 -m py_compile).
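For reference, a syntax-only check like the one mentioned above can also be scripted with the standard-library `py_compile` module. The helper name `syntax_errors` is made up for this sketch, and the file list would be whatever files the PR touches. Note that such a check only catches syntax errors; it says nothing about lint rules or runtime behavior.

```python
import py_compile
from pathlib import Path


def syntax_errors(paths: list[Path]) -> dict[str, str]:
    """Byte-compile each file and collect the ones that fail to parse."""
    failures: dict[str, str] = {}
    for path in paths:
        try:
            # doraise=True turns a syntax error into PyCompileError
            # instead of only printing it to stderr.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures[str(path)] = exc.msg
    return failures
```

An empty result means every listed file compiled.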

coderabbitai

🧹 Nitpick comments (1)

source/tests/common/test_argument_parser.py (1)

290-296: Consider using the existing run_test helper for consistency.

All other train-flag tests (e.g. test_parser_train_restart, test_parser_train_finetune) use the run_test / TEST_DICT pattern that also exercises default-value paths and attribute-type checks. Expressing --allow-ref through the same pattern keeps the suite uniform and would also validate that the flag co-exists with --restart, --init-model, etc.

♻️ Optional refactor using the existing helper

-    def test_parser_train_allow_ref(self) -> None:
-        """Test train --allow-ref option."""
-        args = parse_args(["train", "INFILE", "--allow-ref"])
-        self.assertTrue(args.allow_ref)
-
-        args_default = parse_args(["train", "INFILE"])
-        self.assertFalse(args_default.allow_ref)
+    def test_parser_train_allow_ref(self) -> None:
+        """Test train --allow-ref option."""
+        ARGS = {
+            "INPUT": {"type": str, "value": "INFILE"},
+            "--allow-ref": {"type": bool},
+        }
+        self.run_test(command="train", mapping=ARGS)
+
+        # Also confirm the explicit True case
+        args = parse_args(["train", "INFILE", "--allow-ref"])
+        self.assertTrue(args.allow_ref)
+
+        args_default = parse_args(["train", "INFILE"])
+        self.assertFalse(args_default.allow_ref)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@source/tests/common/test_argument_parser.py` around lines 290 - 296, Replace
the two direct parse_args calls in test_parser_train_allow_ref with the existing
run_test/TEST_DICT pattern so the test uses run_test to validate both the flag
set and default behavior and to exercise interactions with other train flags;
specifically, update test_parser_train_allow_ref to invoke run_test (using
TEST_DICT and the "train" command) to assert allow_ref True when "--allow-ref"
is present and False by default, ensuring it goes through the same validation
and attribute-type checks as other tests like test_parser_train_restart and
test_parser_train_finetune rather than calling parse_args directly.


ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 82ef1bb and 0e4b85c.

📒 Files selected for processing (1)

source/tests/common/test_argument_parser.py

- ensure --allow-ref works across backends
- wire allow_ref into TF train normalize path
- improve docstrings and training docs
- add parser test for --allow-ref

Authored by OpenClaw (model: gpt-5.3-codex)

njzjz-bot · 2026-02-24T20:24:12Z

Addressed backend-adaptation/self-check items in latest commits:

- PT/PD path: keep threaded to normalize
- TF path: now also accepts/threads ()
- CLI is now backend-consistent
- parser coverage includes check in 
- removed README edits; docs stay in 

I also ran static compile checks locally for touched files.

Keep documentation updates in doc/ only.

Recover accidentally replaced train option while keeping new opt-in ref flag.

Authored by OpenClaw (model: gpt-5.3-codex)

…heck Restore accidentally replaced parser negative test and keep new allow-ref assertions.

Authored by OpenClaw (model: gpt-5.3-codex)

codecov · 2026-02-24T22:58:46Z

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.00%. Comparing base (65eea4b) to head (5a6d724).

Files with missing lines	Patch %	Lines
deepmd/pd/entrypoints/main.py	0.00%	1 Missing ⚠️
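As a quick sanity check on the numbers above: a patch coverage of 85.71429% with exactly 1 missing line is consistent with 6 of 7 changed lines being covered. The count of 7 is an inference from the percentage, not a figure stated in the report, and `patch_coverage` is just a helper name for this sketch.

```python
def patch_coverage(covered_lines: int, total_lines: int) -> float:
    """Return patch coverage as a percentage of changed lines that are covered."""
    return 100 * covered_lines / total_lines


# Assumed: 7 changed lines, 1 of them (in deepmd/pd/entrypoints/main.py) uncovered.
print(round(patch_coverage(6, 7), 5))
```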

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #5264   +/-   ##
=======================================
  Coverage   82.00%   82.00%           
=======================================
  Files         750      750           
  Lines       75082    75082           
  Branches     3615     3615           
=======================================
+ Hits        61571    61574    +3     
+ Misses      12347    12346    -1     
+ Partials     1164     1162    -2


github-actions bot added the Python label Feb 24, 2026

dosubot bot added the new feature label Feb 24, 2026

docs(train): add docs and parameter docstrings

82ef1bb

- document --allow-ref in training docs
- add allow_ref docstrings in normalize/train APIs

Authored by OpenClaw (model: gpt-5.3-codex)

github-actions bot added the Docs label Feb 24, 2026

njzjz-bot changed the title ~~feat(train): add opt-in `` support with secure defaults~~ feat(train): support opt-in $ref loading (secure by default) Feb 24, 2026

coderabbitai bot reviewed Feb 24, 2026


njzjz requested a review from Copilot February 24, 2026 19:40

Copilot started reviewing on behalf of njzjz February 24, 2026 19:41

Copilot AI reviewed Feb 24, 2026


test(cli): add parser coverage for train --allow-ref

0e4b85c

- verify opt-in flag is parsed
- verify default remains disabled

Authored by OpenClaw (model: gpt-5.3-codex)

coderabbitai bot reviewed Feb 24, 2026


fix(train): align allow_ref across TF/PT/PD and docs/tests

cc3fe52

- ensure --allow-ref works across backends
- wire allow_ref into TF train normalize path
- improve docstrings and training docs
- add parser test for --allow-ref

Authored by OpenClaw (model: gpt-5.3-codex)

njzjz-bot added 3 commits February 24, 2026 21:19

docs: drop README changes for feature

7fcae80

Keep documentation updates in doc/ only.

fix(cli): restore --force-load and keep --allow-ref

20e15d2

Recover accidentally replaced train option while keeping new opt-in ref flag.

Authored by OpenClaw (model: gpt-5.3-codex)

test(parser): keep existing wrong-subcommand test and add allow-ref c…

5a6d724

…heck Restore accidentally replaced parser negative test and keep new allow-ref assertions.

Authored by OpenClaw (model: gpt-5.3-codex)

njzjz-bot changed the title ~~feat(train): support opt-in $ref loading (secure by default)~~ feat(train): support opt-in $ref loading Feb 24, 2026
