docs: add mathematical formulas to descriptor classes#5255
docs: add mathematical formulas to descriptor classes#5255njzjz-bot wants to merge 4 commits intodeepmodeling:masterfrom
Conversation
Add detailed mathematical formulas to the following descriptor classes: - DescrptSeR: radial descriptor with switching function - DescrptSeT: angular descriptor with cosine angles - DescrptSeTTebd: angular descriptor with type embedding - DescrptSeAttenV2: attention-based descriptor v2 - DescrptHybrid: concatenation of multiple descriptors - DescrptDPA2: repinit + repformer block equations - RepFlowArgs (DPA3): node/edge/angle update equations Follow numpydoc convention: parameters documented in class docstring, not in __init__ docstring. Co-authored-by: GLM-5 <glm-5@zhipuai.cn>
|
No actionable comments were generated in the recent review. 🎉 📝 WalkthroughWalkthroughAdds or expands class-level and inline docstrings across multiple descriptor classes in deepmd/dpmodel/descriptor, documenting mathematical formulations, parameters, and architectures. No runtime behavior, signatures, or control flow are changed. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/dpmodel/descriptor/repformers.py`:
- Around line 110-113: The docstring/formula is incorrect: update the final
descriptor description in DescrptBlockRepformers to state that the block returns
the iteratively updated single-atom representation g1 (G_1^i) after nlayers of
RepformerLayer iterations (dim_out == g1_dim) rather than the GRRG quadratic
symmetrization; move or clarify the GRRG formula as part of RepformerLayer
(where _cal_hg computes h2g2 and the rotation matrix) and note that h2g2/g2/h2
are used only to produce the rotation output, not the final descriptor returned
by DescrptBlockRepformers.call.
- Line 102: The formula comment at the G1xG1 term is wrong: it shows a
self-product \mathcal{G}_1^{i,l} \otimes \mathcal{G}_1^{i,l} whereas the
implementation in _update_g2_g1g1 computes an elementwise product between the
central atom's G1 and each neighbour's G1; update the formula to use the
neighbour index j (e.g. \mathcal{G}_2^{i,l+1} \leftarrow \mathcal{G}_2^{i,l} +
\mathrm{MLP}(\mathcal{G}_1^{i,l} \otimes \mathcal{G}_1^{j,l})) so the
documentation matches the behavior of _update_g2_g1g1.
- Around line 95-97: The doc/formula comments in repformers.py are incorrect:
swap the DRRD and GRRG mathematical descriptions so DRRD references G₁ (D) and
GRRG references G₂ (G), and update the convolution description to reflect the
actual implementation in _update_g1_conv (it aggregates neighbor features and
mixes with neighbor G₁ via the MLP/symmetrization_op). Concretely, change the
three lines so the convolution term describes MLP over aggregated neighbor G₂
mixed with neighbor G₁ (matching _update_g1_conv), the DRRD line uses
\mathcal{G}_1^{i,l} with the \odot \mathcal{R}^i pattern (matching
symmetrization_op(gg1,...)), and the GRRG line uses (\mathcal{G}_2^{i,l})^T
\mathcal{R}^i; ensure wording and variable names match symmetrization_op and
_update_g1_conv to avoid the swapped/mismatched references.
- Fix DRRD/GRRG term formulas to match implementation - Fix G1xG1 term to use neighbor index j - Fix final descriptor formula: returns G_1 directly, not symmetrization - Convolution term now shows neighbor aggregation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #5255 +/- ##
==========================================
- Coverage 82.11% 82.00% -0.11%
==========================================
Files 749 750 +1
Lines 74904 75081 +177
Branches 3616 3615 -1
==========================================
+ Hits 61508 61572 +64
- Misses 12234 12347 +113
Partials 1162 1162 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
The [1] references were added incorrectly without corresponding citation entries. Remove them to avoid broken references.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/dpmodel/descriptor/se_t.py`:
- Around line 68-73: The docstring formula for D^i is inconsistent with call():
update the documentation to match the implementation by (1) adding the
multiplicative outer factor env_{jk} (the smoothed directional scalar/vector
from EnvMat used as tilde{g}_{jk}) so the term reads env_{jk} *
N_{t_j,t_k}(env_{jk}) (res_ij is computed as sum_jk env_{jk} × N(env_{jk})), and
(2) correct the normalization to show per-type denominators 1/(N_{t_j} N_{t_k})
(matching the 1.0/float(nei_type_i)/float(nei_type_j) used in call()); reference
symbols: call(), res_ij, env_{jk}, EnvMat, nei_type_i, nei_type_j, and
N_{t_j,t_k} when editing the docstring so it matches the implemented formula.
- Around line 72-73: The docstring and math notation wrongly include the
central-atom type in the embedding subscript; update the math and any
explanatory text to use embeddings indexed only by neighbor type pairs (i.e.,
\mathcal{N}_{t_j,t_k}) to match NetworkCollection(ndim=2). In the code, verify
call() usage around embedding_idx (where it unpacks ti, tj) and ensure the
documentation and math block refer to \mathcal{N}_{t_j,t_k} (or similar
two-tuple notation) instead of \mathcal{N}_{t_i,t_j,t_k}, so the written
notation matches NetworkCollection, ndim=2, and the ti,tj unpacking logic.
- Correct embedding subscript to N_{t_j,t_k} (neighbor types only)
- Add outer factor env_{jk} in the sum
- Fix normalization to 1/(N_{t_j} N_{t_k}) per type pair
- Use smoothed directional vectors from EnvMat
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive mathematical formulas to descriptor class docstrings following numpydoc convention. The changes improve documentation by providing clear mathematical definitions of how each descriptor computes atomic representations.
Changes:
- Added mathematical formulas to 10 descriptor classes describing their computation methods
- Moved parameter documentation from
__init__methods to class-level docstrings for DPA-2 - Enhanced documentation with LaTeX equations, parameter descriptions, and architectural details
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| deepmd/dpmodel/descriptor/se_r.py | Added radial descriptor formula with switching function definition |
| deepmd/dpmodel/descriptor/se_t.py | Added angular descriptor formula with dot product computation |
| deepmd/dpmodel/descriptor/se_t_tebd.py | Added formulas for three-body descriptor and block class with type embedding modes |
| deepmd/dpmodel/descriptor/se_atten_v2.py | Added attention-based descriptor v2 formulas with stripped type embedding explanation |
| deepmd/dpmodel/descriptor/dpa1.py | Added attention-based descriptor block formulas with embedding matrix computation |
| deepmd/dpmodel/descriptor/dpa2.py | Moved documentation from __init__ to class level; added repinit/repformer equations |
| deepmd/dpmodel/descriptor/dpa3.py | Added repflow architecture formulas to RepFlowArgs and DescrptDPA3 classes |
| deepmd/dpmodel/descriptor/repformers.py | Added iterative update equations for single-atom, pair-atom, and equivariant representations |
| deepmd/dpmodel/descriptor/repflows.py | Added message passing equations for node, edge, and angle representations |
| deepmd/dpmodel/descriptor/hybrid.py | Added concatenation formula showing how sub-descriptors are combined |
Comments suppressed due to low confidence (5)
deepmd/dpmodel/descriptor/se_atten_v2.py:60
- Inconsistent parameter naming: the parameter name is 'rcut' but the description uses 'The cut-off radius'. Other similar parameters in the file use lowercase descriptions like 'The cut-off radius' without the math notation in the first line. Consider using 'The cut-off radius :math:
r_c' format consistently or just 'The cut-off radius' to match the style in other files.
The cut-off radius :math:`r_c`
deepmd/dpmodel/descriptor/se_atten_v2.py:62
- Inconsistent parameter description style: this parameter description starts with 'From where' which is grammatically awkward. Consider rephrasing to 'Where to start smoothing' to match the style used in other descriptor classes like DescrptSeTTebd.
From where the environment matrix should be smoothed :math:`r_s`
deepmd/dpmodel/descriptor/se_atten_v2.py:65
- Corrected spelling of 'maxmum' to 'maximum'.
sel : list[int], int
list[int]: sel[i] specifies the maxmum number of type i atoms in the cut-off radius
int: the total maxmum number of atoms in the cut-off radius
deepmd/dpmodel/descriptor/dpa2.py:395
- The phrase 'details information' should be 'detailed information' for grammatical correctness.
The arguments used to initialize the repinit block, see docstr in `RepinitArgs` for details information.
deepmd/dpmodel/descriptor/dpa2.py:397
- The phrase 'details information' should be 'detailed information' for grammatical correctness.
The arguments used to initialize the repformer block, see docstr in `RepformerArgs` for details information.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Fixed se_t.py formula:
Commit: ec78ace |
Summary
Add detailed mathematical formulas to descriptor class docstrings following numpydoc convention.
Changes
Added formulas to the following classes:
Main descriptor classes
Attention-based descriptors
DPA series
Block classes
Utility classes
Convention
Following numpydoc convention, parameters are documented in class docstrings, not in
__init__docstrings.Statistics
Authored by OpenClaw (model: GLM-5)
Summary by CodeRabbit