docs: add mathematical formulas to descriptor classes #5255

Queued
njzjz-bot wants to merge 4 commits into deepmodeling:master from njzjz-bot:add-descriptor-formulas-v2

Conversation

@njzjz-bot njzjz-bot commented Feb 23, 2026

Summary

Add detailed mathematical formulas to descriptor class docstrings following numpydoc convention.

Changes

Added formulas to the following classes:

Main descriptor classes

  • DescrptSeR: Radial descriptor with switching function formula
  • DescrptSeT: Angular descriptor with cosine angle formula
  • DescrptSeTTebd: Angular descriptor with type embedding
  • DescrptSeA: Already had formulas ✓
  • DescrptDPA1: Already had formulas ✓
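
For readers unfamiliar with the switching function mentioned for DescrptSeR, here is a minimal NumPy sketch. It assumes the form used in DeePMD-kit's documentation for the smooth-edition descriptors (a fifth-order polynomial taper between `rcut_smth` and `rcut`, divided by `r`); the PR text above does not reproduce the formula, so treat the exact expression and the function name `switching` as assumptions rather than a copy of the docstring.

```python
import numpy as np


def switching(r, rcut_smth, rcut):
    """Sketch of a smooth switching function s(r); not the library's code.

    s(r) = 1/r                               for r < rcut_smth
    s(r) = (x^3 (-6x^2 + 15x - 10) + 1) / r  for rcut_smth <= r < rcut
    s(r) = 0                                 for r >= rcut
    with x = (r - rcut_smth) / (rcut - rcut_smth).
    """
    r = np.asarray(r, dtype=float)
    # Clipping x to [0, 1] covers all three branches with one expression.
    x = np.clip((r - rcut_smth) / (rcut - rcut_smth), 0.0, 1.0)
    weight = x**3 * (-6.0 * x**2 + 15.0 * x - 10.0) + 1.0
    return weight / r


values = switching([0.5, 3.5, 6.0, 7.0], rcut_smth=1.0, rcut=6.0)
```

The polynomial equals 1 at `x = 0` and 0 at `x = 1`, so the descriptor decays smoothly to zero at the cut-off instead of jumping.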

Attention-based descriptors

  • DescrptSeAttenV2: Attention-based descriptor v2 with stripped type embedding
  • DescrptBlockSeAtten: Attention-based descriptor block

DPA series

  • DescrptDPA2: Repinit + repformer block equations
  • DescrptDPA3: Repflow block equations
  • RepFlowArgs: Node/edge/angle update equations

Block classes

  • DescrptBlockSeTTebd: Three-body descriptor block with type embedding
  • DescrptBlockRepformers: Repformer block with iterative updates
  • DescrptBlockRepflows: Repflow block with message passing

Utility classes

  • DescrptHybrid: Concatenation of multiple descriptors
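
The concatenation described for DescrptHybrid can be pictured with a few lines of NumPy. This is only an illustrative sketch: the helper name `hybrid_concat` and the array shapes are hypothetical, and the real class does more (for example handling each sub-descriptor's own neighbor list).

```python
import numpy as np


def hybrid_concat(sub_outputs):
    """Concatenate sub-descriptor outputs along the feature axis (sketch)."""
    return np.concatenate(sub_outputs, axis=-1)


# Hypothetical outputs of two sub-descriptors, shaped (nframes, nloc, dim).
d_a = np.zeros((2, 5, 16))
d_b = np.ones((2, 5, 8))
d_hybrid = hybrid_concat([d_a, d_b])
```

The output feature dimension is the sum of the sub-descriptor dimensions, here 16 + 8 = 24.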

Convention

Following numpydoc convention, parameters are documented in class docstrings, not in __init__ docstrings.
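
As an illustration of this convention, the sketch below shows the layout on a hypothetical class. `ToyDescriptor` and its formula are invented for the example and do not appear in the PR; only the placement of the `Parameters` section and the `.. math::` block is the point.

```python
class ToyDescriptor:
    r"""Toy descriptor used only to illustrate the docstring layout.

    The formula below is a placeholder, not one of the PR's formulas.

    .. math::
        \mathcal{D}^i = \sum_{j} s(r_{ij})

    Parameters
    ----------
    rcut : float
        The cut-off radius :math:`r_c`.
    rcut_smth : float
        Where to start smoothing :math:`r_s`.
    """

    def __init__(self, rcut: float, rcut_smth: float) -> None:
        # No docstring here: parameters are documented on the class.
        self.rcut = rcut
        self.rcut_smth = rcut_smth
```

The raw-string prefix (`r"""`) keeps the LaTeX backslashes from being interpreted as Python escape sequences.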

Statistics

  • 10 files changed
  • 464 insertions(+)
  • 55 deletions(-)

Authored by OpenClaw (model: GLM-5)

Summary by CodeRabbit

  • Documentation
    • Expanded and clarified class- and module-level docstrings across descriptor implementations with mathematical formulations, parameter meanings, and usage notes.
    • Moved lengthy constructor remarks into class-level documentation for clearer public-facing explanations.
    • Detailed per-block behaviors, output dimensionality, aggregation/symmetrization steps, and repflow/repformer architectural descriptions.

Add detailed mathematical formulas to the following descriptor classes:
- DescrptSeR: radial descriptor with switching function
- DescrptSeT: angular descriptor with cosine angles
- DescrptSeTTebd: angular descriptor with type embedding
- DescrptSeAttenV2: attention-based descriptor v2
- DescrptHybrid: concatenation of multiple descriptors
- DescrptDPA2: repinit + repformer block equations
- RepFlowArgs (DPA3): node/edge/angle update equations

Follow numpydoc convention: parameters documented in class docstring,
not in __init__ docstring.

Co-authored-by: GLM-5 <glm-5@zhipuai.cn>

dosubot bot commented Feb 23, 2026

Related Documentation

Checked 0 published document(s) in 1 knowledge base(s). No updates required.

@dosubot dosubot bot added the Docs label Feb 23, 2026

coderabbitai bot commented Feb 23, 2026

No actionable comments were generated in the recent review. 🎉


Walkthrough

Adds or expands class-level and inline docstrings across multiple descriptor classes in deepmd/dpmodel/descriptor, documenting mathematical formulations, parameters, and architectures. No runtime behavior, signatures, or control flow are changed.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| DPA descriptors: deepmd/dpmodel/descriptor/dpa1.py, deepmd/dpmodel/descriptor/dpa2.py, deepmd/dpmodel/descriptor/dpa3.py | Added extensive class- and constructor-level documentation for DPA-1/2/3 descriptors (purpose, math, parameter meanings). Moved/removed duplicate constructor docstring in dpa2.py. No logic changes. |
| SE descriptors (radial/angle/attention/TEBD): deepmd/dpmodel/descriptor/se_r.py, deepmd/dpmodel/descriptor/se_t.py, deepmd/dpmodel/descriptor/se_atten_v2.py, deepmd/dpmodel/descriptor/se_t_tebd.py | Inserted detailed mathematical docstrings describing embeddings, switching functions, angular computations, attention formulation, and TEBD-related classes. No functional edits. |
| Rep architectures & hybrid: deepmd/dpmodel/descriptor/repflows.py, deepmd/dpmodel/descriptor/repformers.py, deepmd/dpmodel/descriptor/hybrid.py | Expanded class-level documentation for RepFlows, Repformers, and the Hybrid concatenation descriptor, specifying representation updates, layer equations, and output composition. Implementation unchanged. |
| Misc / manifest: pyproject.toml, requirements.txt, setup.py, manifest_file | Minor manifest/file metadata touched (listed in summary). No code behavior changes. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • iProzd
  • wanghan-iapcm
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'docs: add mathematical formulas to descriptor classes' accurately and concisely summarizes the main change: adding mathematical documentation to descriptor class docstrings across multiple files. |
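
To make the docstring-coverage check above concrete, here is one plausible way such a percentage could be computed with Python's standard `ast` module. How CodeRabbit actually measures coverage is not stated in this thread, so the counting rule below (module, classes, and functions each count once) is an assumption, and `docstring_coverage` is a hypothetical helper.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Percentage of module/class/function nodes that carry a docstring."""
    tree = ast.parse(source)
    documentable = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    # Count the module itself plus every class and function definition.
    nodes = [tree] + [n for n in ast.walk(tree) if isinstance(n, documentable)]
    documented = sum(ast.get_docstring(n) is not None for n in nodes)
    return 100.0 * documented / len(nodes)


sample = (
    '"""Module docstring."""\n'
    "class A:\n"
    '    """Class docstring."""\n'
    "    def f(self):\n"
    '        """Documented."""\n'
    "    def g(self):\n"
    "        pass\n"
)
coverage = docstring_coverage(sample)
```

In this sample three of four nodes are documented, giving 75%, which would still fall below an 80% threshold.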


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/dpmodel/descriptor/repformers.py`:
- Around line 110-113: The docstring/formula is incorrect: update the final
descriptor description in DescrptBlockRepformers to state that the block returns
the iteratively updated single-atom representation g1 (G_1^i) after nlayers of
RepformerLayer iterations (dim_out == g1_dim) rather than the GRRG quadratic
symmetrization; move or clarify the GRRG formula as part of RepformerLayer
(where _cal_hg computes h2g2 and the rotation matrix) and note that h2g2/g2/h2
are used only to produce the rotation output, not the final descriptor returned
by DescrptBlockRepformers.call.
- Line 102: The formula comment at the G1xG1 term is wrong: it shows a
self-product \mathcal{G}_1^{i,l} \otimes \mathcal{G}_1^{i,l} whereas the
implementation in _update_g2_g1g1 computes an elementwise product between the
central atom's G1 and each neighbour's G1; update the formula to use the
neighbour index j (e.g. \mathcal{G}_2^{i,l+1} \leftarrow \mathcal{G}_2^{i,l} +
\mathrm{MLP}(\mathcal{G}_1^{i,l} \otimes \mathcal{G}_1^{j,l})) so the
documentation matches the behavior of _update_g2_g1g1.
- Around line 95-97: The doc/formula comments in repformers.py are incorrect:
swap the DRRD and GRRG mathematical descriptions so DRRD references G₁ (D) and
GRRG references G₂ (G), and update the convolution description to reflect the
actual implementation in _update_g1_conv (it aggregates neighbor features and
mixes with neighbor G₁ via the MLP/symmetrization_op). Concretely, change the
three lines so the convolution term describes MLP over aggregated neighbor G₂
mixed with neighbor G₁ (matching _update_g1_conv), the DRRD line uses
\mathcal{G}_1^{i,l} with the \odot \mathcal{R}^i pattern (matching
symmetrization_op(gg1,...)), and the GRRG line uses (\mathcal{G}_2^{i,l})^T
\mathcal{R}^i; ensure wording and variable names match symmetrization_op and
_update_g1_conv to avoid the swapped/mismatched references.
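
The G1xG1 finding above (an elementwise product between the central atom's G1 and each neighbor's G1) can be sketched in NumPy as follows. This is not the code of `_update_g2_g1g1`: the function name, the assumption that local atoms occupy the first `nloc` rows of `g1`, and the omission of padding masks and of the subsequent MLP are all simplifications made for illustration.

```python
import numpy as np


def g1g1_pair_product(g1, nlist):
    """Elementwise product of center and neighbor single-atom features.

    g1    : (nall, ng1) single-atom representation
    nlist : (nloc, nnei) neighbor indices into g1 (no -1 padding here)
    returns (nloc, nnei, ng1), one feature vector per (i, j) pair
    """
    nloc = nlist.shape[0]
    g1_center = g1[:nloc, None, :]   # (nloc, 1, ng1), broadcast over j
    g1_neighbor = g1[nlist]          # (nloc, nnei, ng1)
    return g1_center * g1_neighbor


g1 = np.arange(6, dtype=float).reshape(3, 2)
nlist = np.array([[1, 2], [0, 2], [0, 1]])
pair = g1g1_pair_product(g1, nlist)
```

Because the result carries a neighbor axis, it has the pair shape needed to update G2, which a self-product of the central atom's G1 alone would not.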

- Fix DRRD/GRRG term formulas to match implementation
- Fix G1xG1 term to use neighbor index j
- Fix final descriptor formula: returns G_1 directly, not symmetrization
- Convolution term now shows neighbor aggregation

codecov bot commented Feb 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.00%. Comparing base (5eb5400) to head (ec78ace).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5255      +/-   ##
==========================================
- Coverage   82.11%   82.00%   -0.11%     
==========================================
  Files         749      750       +1     
  Lines       74904    75081     +177     
  Branches     3616     3615       -1     
==========================================
+ Hits        61508    61572      +64     
- Misses      12234    12347     +113     
  Partials     1162     1162              
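
The percentages in the diff above can be reproduced from the hit and line counts. The sketch below assumes coverage is hits divided by total lines, truncated (not rounded) to two decimals; that truncation rule is inferred from the reported numbers, not taken from Codecov's documentation.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage in percent, truncated to two decimals (assumed rule)."""
    return (hits * 10000 // lines) / 100


# Lines = hits + misses + partials, using the counts from the report.
base_lines = 61508 + 12234 + 1162   # 74904
head_lines = 61572 + 12347 + 1162   # 75081

base = coverage_percent(61508, base_lines)   # 82.11
head = coverage_percent(61572, head_lines)   # 82.0
delta = round(head - base, 2)                # -0.11
```

With plain rounding the base would come out as 82.12% and the head as 82.01%, so truncation is the reading that matches both reported values.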


The [1] references were added incorrectly without corresponding
citation entries. Remove them to avoid broken references.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/dpmodel/descriptor/se_t.py`:
- Around line 68-73: The docstring formula for D^i is inconsistent with call():
update the documentation to match the implementation by (1) adding the
multiplicative outer factor env_{jk} (the smoothed directional scalar/vector
from EnvMat used as tilde{g}_{jk}) so the term reads env_{jk} *
N_{t_j,t_k}(env_{jk}) (res_ij is computed as sum_jk env_{jk} × N(env_{jk})), and
(2) correct the normalization to show per-type denominators 1/(N_{t_j} N_{t_k})
(matching the 1.0/float(nei_type_i)/float(nei_type_j) used in call()); reference
symbols: call(), res_ij, env_{jk}, EnvMat, nei_type_i, nei_type_j, and
N_{t_j,t_k} when editing the docstring so it matches the implemented formula.
- Around line 72-73: The docstring and math notation wrongly include the
central-atom type in the embedding subscript; update the math and any
explanatory text to use embeddings indexed only by neighbor type pairs (i.e.,
\mathcal{N}_{t_j,t_k}) to match NetworkCollection(ndim=2). In the code, verify
call() usage around embedding_idx (where it unpacks ti, tj) and ensure the
documentation and math block refer to \mathcal{N}_{t_j,t_k} (or similar
two-tuple notation) instead of \mathcal{N}_{t_i,t_j,t_k}, so the written
notation matches NetworkCollection, ndim=2, and the ti,tj unpacking logic.
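
The corrected se_t formula described in these two findings can be sketched for a single central atom as below. This is a simplified illustration, not the body of `call()`: the function name, the `embed` callable standing in for the per-type-pair networks, and the choice to loop over unordered type pairs (t_j <= t_k) are assumptions; N_t is taken to be the number of neighbor slots of type t.

```python
import numpy as np


def se_t_single_atom(rr_by_type, embed):
    """Three-body descriptor for one atom (sketch).

    rr_by_type : list of (N_t, 3) arrays of smoothed direction vectors,
                 one array per neighbor type
    embed      : callable (tj, tk, x) mapping x of shape (N_tj, N_tk, 1)
                 to features of shape (N_tj, N_tk, ng); hypothetical
                 stand-in for the network indexed by neighbor types only
    """
    ntypes = len(rr_by_type)
    result = None
    for tj in range(ntypes):
        for tk in range(tj, ntypes):
            rr_j, rr_k = rr_by_type[tj], rr_by_type[tk]
            env_jk = rr_j @ rr_k.T                    # (N_tj, N_tk) dot products
            g = embed(tj, tk, env_jk[..., None])      # (N_tj, N_tk, ng)
            # Outer factor env_jk times the embedding, summed over j and k,
            # then normalized per type pair by 1 / (N_tj * N_tk).
            term = np.einsum("jk,jkg->g", env_jk, g)
            term = term / (rr_j.shape[0] * rr_k.shape[0])
            result = term if result is None else result + term
    return result


rr = [np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]), np.array([[1.0, 0.0, 0.0]])]
ones_embed = lambda tj, tk, x: np.ones(x.shape[:-1] + (2,))
descriptor = se_t_single_atom(rr, ones_embed)
```

Note that the embedding is selected by the neighbor type pair alone; the central atom's type does not enter the index.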

- Correct embedding subscript to N_{t_j,t_k} (neighbor types only)
- Add outer factor env_{jk} in the sum
- Fix normalization to 1/(N_{t_j} N_{t_k}) per type pair
- Use smoothed directional vectors from EnvMat
@njzjz njzjz requested a review from Copilot February 23, 2026 04:20

Copilot AI left a comment

Pull request overview

This PR adds comprehensive mathematical formulas to descriptor class docstrings following numpydoc convention. The changes improve documentation by providing clear mathematical definitions of how each descriptor computes atomic representations.

Changes:

  • Added mathematical formulas to 10 descriptor classes describing their computation methods
  • Moved parameter documentation from __init__ methods to class-level docstrings for DPA-2
  • Enhanced documentation with LaTeX equations, parameter descriptions, and architectural details

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| deepmd/dpmodel/descriptor/se_r.py | Added radial descriptor formula with switching function definition |
| deepmd/dpmodel/descriptor/se_t.py | Added angular descriptor formula with dot product computation |
| deepmd/dpmodel/descriptor/se_t_tebd.py | Added formulas for three-body descriptor and block class with type embedding modes |
| deepmd/dpmodel/descriptor/se_atten_v2.py | Added attention-based descriptor v2 formulas with stripped type embedding explanation |
| deepmd/dpmodel/descriptor/dpa1.py | Added attention-based descriptor block formulas with embedding matrix computation |
| deepmd/dpmodel/descriptor/dpa2.py | Moved documentation from __init__ to class level; added repinit/repformer equations |
| deepmd/dpmodel/descriptor/dpa3.py | Added repflow architecture formulas to RepFlowArgs and DescrptDPA3 classes |
| deepmd/dpmodel/descriptor/repformers.py | Added iterative update equations for single-atom, pair-atom, and equivariant representations |
| deepmd/dpmodel/descriptor/repflows.py | Added message passing equations for node, edge, and angle representations |
| deepmd/dpmodel/descriptor/hybrid.py | Added concatenation formula showing how sub-descriptors are combined |
Comments suppressed due to low confidence (5)

deepmd/dpmodel/descriptor/se_atten_v2.py:60

  • Inconsistent parameter naming: the parameter name is 'rcut' but the description uses 'The cut-off radius'. Other similar parameters in the file use lowercase descriptions like 'The cut-off radius' without the math notation in the first line. Consider using the 'The cut-off radius :math:`r_c`' format consistently, or just 'The cut-off radius', to match the style in other files.
            The cut-off radius :math:`r_c`

deepmd/dpmodel/descriptor/se_atten_v2.py:62

  • Inconsistent parameter description style: this parameter description starts with 'From where' which is grammatically awkward. Consider rephrasing to 'Where to start smoothing' to match the style used in other descriptor classes like DescrptSeTTebd.
            From where the environment matrix should be smoothed :math:`r_s`

deepmd/dpmodel/descriptor/se_atten_v2.py:65

  • Corrected spelling of 'maxmum' to 'maximum'.
    sel : list[int], int
            list[int]: sel[i] specifies the maxmum number of type i atoms in the cut-off radius
            int: the total maxmum number of atoms in the cut-off radius

deepmd/dpmodel/descriptor/dpa2.py:395

  • The phrase 'details information' should be 'detailed information' for grammatical correctness.
        The arguments used to initialize the repinit block, see docstr in `RepinitArgs` for details information.

deepmd/dpmodel/descriptor/dpa2.py:397

  • The phrase 'details information' should be 'detailed information' for grammatical correctness.
        The arguments used to initialize the repformer block, see docstr in `RepformerArgs` for details information.

@njzjz-bot
Contributor Author

Fixed se_t.py formula:

  • Embedding subscript changed to N_{t_j,t_k} (neighbor types only)
  • Added outer factor env_{jk}
  • Corrected normalization to 1/(N_{t_j} N_{t_k})

Commit: ec78ace

@iProzd iProzd added this pull request to the merge queue Feb 25, 2026
@iProzd iProzd removed this pull request from the merge queue due to a manual request Feb 25, 2026
@njzjz njzjz added this pull request to the merge queue Feb 26, 2026

5 participants