48 changes: 47 additions & 1 deletion doc/model/dpa2.md
@@ -8,6 +8,52 @@ The DPA-2 model implementation. See [DPA-2 paper](https://doi.org/10.1038/s41524

Training example: `examples/water/dpa2/input_torch_medium.json`, see [README](../../examples/water/dpa2/README.md) for inputs in different levels.

## Theory

DPA-2 is an attention-based descriptor architecture proposed for large atomic models (LAMs); see the [DPA-2 paper](https://doi.org/10.1038/s41524-024-01493-2).

At a high level, DPA-2 builds local representations with three coupled channels (paper notation):

- **Single-atom channel** $\mathbf{f}_\alpha$
- **Rotationally invariant pair channel** $\mathbf{g}_{\alpha\beta}$
- **Rotationally equivariant pair channel** $\mathbf{h}_{\alpha\beta}$

for neighbors $\beta\in\mathcal{N}(\alpha)$ within cutoffs.
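To make the index conventions concrete, the sketch below builds the neighbor set $\mathcal{N}(\alpha)$ for a tiny cluster and attaches one toy quantity to each channel. The function name `build_channels`, the cutoff value, and the use of a one-hot type vector, an interatomic distance, and a displacement vector as stand-ins for $\mathbf{f}$, $\mathbf{g}$, and $\mathbf{h}$ are assumptions made for illustration; the actual DPA-2 channels are learned, higher-dimensional features, and this is not the DeePMD-kit implementation.

```python
import math


def build_channels(coords, types, n_types, rcut):
    """Toy stand-ins for the three channels (illustration only)."""
    n = len(coords)
    # Single-atom channel f_alpha: here just a one-hot encoding of the atom type.
    f = [[1.0 if t == k else 0.0 for k in range(n_types)] for t in types]
    g, h = {}, {}
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            d = [coords[b][k] - coords[a][k] for k in range(3)]
            r = math.sqrt(sum(x * x for x in d))
            if r < rcut:  # beta belongs to N(alpha)
                g[(a, b)] = r  # invariant: unchanged when the system is rotated
                h[(a, b)] = d  # equivariant: rotates together with the system
    return f, g, h


# Hypothetical three-atom cluster; only atoms 0 and 1 are within the cutoff.
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
types = [0, 1, 1]
f, g, h = build_channels(coords, types, n_types=2, rcut=2.0)
print(sorted(g))  # ordered pairs (alpha, beta) that lie within the cutoff
```

Pair quantities are stored per ordered pair $(\alpha, \beta)$, so each atom sees its own neighborhood even though the distance itself is symmetric.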

### Descriptor pipeline

The descriptor follows two main stages:

1. **repinit (representation initializer)**
   - Initializes and fuses type and geometry information from local environments.
1. **repformer (representation transformer)**
   - Stacked message-passing layers that update the single-atom channel $\mathbf{f}$ and the pair channels $\mathbf{g}$ and $\mathbf{h}$ through convolution, symmetrization, MLP, and attention-style operations.

The final descriptor is formed from learned single-atom representations and then passed to downstream fitting/model components.

### Message-passing intuition

DPA-2 updates local features layer-by-layer with residual connections. Conceptually, each layer performs neighborhood aggregation using geometry-conditioned interactions:

```math
\mathbf{f}_\alpha^{(l+1)} = \mathbf{f}_\alpha^{(l)} + \mathrm{MP}^{(l)}\left(\mathbf{f}_\alpha^{(l)}, \{\mathbf{f}_\beta^{(l)}\}_{\beta\in\mathcal{N}(\alpha)}, \{\mathbf{g}_{\alpha\beta}^{(l)}\}_{\beta\in\mathcal{N}(\alpha)}\right)
```

where $\mathrm{MP}^{(l)}$ denotes the layer-specific message-passing operator.
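The residual update described above can be sketched in a few lines of plain Python. Everything specific here is an assumption for illustration: the function name `message_passing_layer`, the weight $1/(1+g)$, and the mean aggregation are hypothetical choices, whereas the real repformer layers use learned convolution, symmetrization, MLP, and attention operators. Per-atom features are called `f` in the code, following the single-atom channel notation.

```python
def message_passing_layer(f, g, neighbors):
    """One residual update: f_a <- f_a + mean over b of w(g_ab) * f_b."""
    new_f = []
    for a, f_a in enumerate(f):
        msg = [0.0] * len(f_a)
        for b in neighbors[a]:
            # Hypothetical geometry-conditioned weight built from the pair feature.
            w = 1.0 / (1.0 + g[(a, b)])
            for k, x in enumerate(f[b]):
                msg[k] += w * x
        if neighbors[a]:
            msg = [m / len(neighbors[a]) for m in msg]
        # Residual connection: the message is added to f_a, not substituted for it.
        new_f.append([x + m for x, m in zip(f_a, msg)])
    return new_f


# Toy inputs: atoms 0 and 1 are mutual neighbors, atom 2 is isolated.
f = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
neighbors = {0: [1], 1: [0], 2: []}
g = {(0, 1): 1.0, (1, 0): 1.0}
f1 = message_passing_layer(f, g, neighbors)
print(f1)
```

An atom with no neighbors receives a zero message, so the residual form leaves its features unchanged.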

### Physical properties

Consistent with the DPA-2 design goals in the paper, the model family is built to satisfy:

1. **Translational invariance** (depends on relative coordinates)
1. **Rotational and permutational symmetry requirements**
1. **Conservative formulation** when used in energy models (forces/virials from energy gradients)
1. **Smoothness up to second-order derivatives**
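A numerical check of the first three properties is sketched below on a toy pair energy that depends only on interatomic distances. The harmonic energy, the helper names `energy`, `forces`, and `rotate_z`, and the finite-difference step are assumptions for illustration. A trained model would obtain forces by automatic differentiation rather than finite differences, and smoothness is not tested here.

```python
import math


def energy(coords):
    """Toy pair energy that depends only on interatomic distances."""
    e = 0.0
    for a in range(len(coords)):
        for b in range(a + 1, len(coords)):
            e += (math.dist(coords[a], coords[b]) - 1.0) ** 2
    return e


def forces(coords, eps=1e-5):
    """Conservative forces F = -dE/dx, estimated by central finite differences."""
    out = []
    for a in range(len(coords)):
        row = []
        for k in range(3):
            plus = [list(c) for c in coords]
            minus = [list(c) for c in coords]
            plus[a][k] += eps
            minus[a][k] -= eps
            row.append(-(energy(plus) - energy(minus)) / (2 * eps))
        out.append(row)
    return out


def rotate_z(coords, angle):
    """Rotate all atoms about the z axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * x - s * y, s * x + c * y, z] for x, y, z in coords]


coords = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.3, 0.9, 0.4]]
e0 = energy(coords)
shifted = [[x + 5.0, y - 2.0, z + 1.0] for x, y, z in coords]
rotated = rotate_z(coords, 0.7)
permuted = [coords[2], coords[0], coords[1]]

print(abs(energy(shifted) - e0) < 1e-9)   # translational invariance
print(abs(energy(rotated) - e0) < 1e-9)   # rotational invariance
print(abs(energy(permuted) - e0) < 1e-9)  # permutational invariance
total = [sum(row[k] for row in forces(coords)) for k in range(3)]
print(all(abs(t) < 1e-6 for t in total))  # net force vanishes for a translation-invariant energy
```

Because the forces are derived from a single scalar energy, translational invariance of that energy implies that the forces sum to zero.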

### Multi-task training context

DPA-2 is designed for multi-task pre-training with a shared descriptor and task-specific downstream heads. See [Multi-task training](../train/multi-task-training.md) for workflow details.

## Requirements of installation {{ pytorch_icon }}

If one wants to run the DPA-2 model on LAMMPS, the customized OP library for the Python interface must be installed when [freezing the model](../freeze/freeze.md).
@@ -18,7 +64,7 @@ If one runs LAMMPS with MPI, the customized OP library for the C++ interface sho
If one runs LAMMPS with MPI and CUDA devices, it is recommended to compile the customized OP library for the C++ interface with a [CUDA-Aware MPI](https://developer.nvidia.com/mpi-solutions-gpus) library and CUDA,
otherwise the communication between GPU cards falls back to the slower CPU implementation.

## Limitations of the JAX backend with LAMMPS {{ jax_icon }}

The JAX backend supports only a single MPI rank when running with LAMMPS. One must also set `map` to `yes` using the [`atom_modify`](https://docs.lammps.org/atom_modify.html) command (for example, `atom_modify map yes`).
