From 533888f12e13dea3d9a79d551d689b80da8bdb9a Mon Sep 17 00:00:00 2001 From: lmoresi Date: Wed, 17 Jun 2026 11:10:17 +1000 Subject: [PATCH] Lock the coord-mutation gate with a regression test; add lagged-clone SL design note MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two close-out items following PR #246 (the foolproof mesh-coordinate mutation gate): 1. tests/test_0826_deform_capability_gate.py — a dedicated regression test that the gate actually FIRES. PR #246's existing tests wrap calls to *avoid* the gate; none asserted that a bare _deform_mesh on a live mesh raises. This locks the user-facing protection: bare _deform_mesh raises (naming `deform`), mesh.deform() moves nodes, _coord_mutation() sanctions a direct call, and ephemeral_coords() restores on exit. level_1 + tier_a, runs in ~7s. 2. docs/developer/design/lagged-clone-sl-history.md — design note for the next phase of free-surface / adaptation work. The ALE v_mesh-pulse + CARRY history only APPROXIMATES "old field on old geometry"; at a receding free surface the departure point lands in the vacated layer and the new-mesh P3 extrapolates (the Ra=1e5 blow-up). Geometry-aware clamp / monotone band-aids only delay it (verified). The note specifies the cure: a lagged rolling-clone of the previous-step mesh (shared partition; stores primitives; recompute fluxes homogeneously) as the SL reference, with mixed-mesh expression evaluation as the Phase-1 enabling primitive. Unifies free-surface and adaptation. Underworld development team with AI support from Claude Code --- .../design/lagged-clone-sl-history.md | 171 ++++++++++++++++++ tests/test_0826_deform_capability_gate.py | 75 ++++++++ 2 files changed, 246 insertions(+) create mode 100644 docs/developer/design/lagged-clone-sl-history.md create mode 100644 tests/test_0826_deform_capability_gate.py diff --git a/docs/developer/design/lagged-clone-sl-history.md b/docs/developer/design/lagged-clone-sl-history.md new file mode 100644 index 00000000..da282b41 --- /dev/null +++ b/docs/developer/design/lagged-clone-sl-history.md @@ -0,0 +1,171 @@ +# Lagged reference-mesh SL history (and mixed-mesh evaluation) + +**Status:** design / plan (2026-06). Supersedes the ALE `v_mesh`-pulse + CARRY +approximation for the semi-Lagrangian history on a moving mesh. Two-phase plan: +**Phase 1 — mixed-mesh expression evaluation** (the enabling primitive, +implement and prove first); **Phase 2 — lagged rolling-clone SL history** +(the architecture built on Phase 1). + +Related: [REMESH_FIELD_TRANSFER_DESIGN](REMESH_FIELD_TRANSFER_DESIGN.md), +[in_memory_checkpoint_design](in_memory_checkpoint_design.md), +[submesh-solver-architecture](submesh-solver-architecture.md), +[STRESS_EQUILIBRIUM_FREESURFACE](STRESS_EQUILIBRIUM_FREESURFACE.md). + +## Problem + +On a moving mesh (free surface, or adaptation) the semi-Lagrangian update needs +the *old* field at the departure point. At a **receding** free surface +(downwellings) that point lies in the layer the surface just vacated — which the +*new* mesh does not cover. Sampling the new-mesh FE field there extrapolates the +P3 polynomial → ±300 node-level speckles → blow-up (Ra=1e5 annulus, ~step 20). + +The current code approximates "old field on old geometry" *without* keeping a +copy: history vars are `CARRY`'d onto the new mesh and the trace-back subtracts +`v_mesh = Δx/dt` (`ddt.py` `SemiLagrangian.update_pre_solve` / +`_activate_ale_for_traceback`). This is the approximation that fails: + +- The order-1 advected field has its `v_mesh` correction **gated off** at `i=0` + (the band-aid re-record path, `ddt.py:2447`), so its foot lands at the *old* + surface — in the vacated layer. +- The annulus `return_coords_to_bounds` (`meshing/annulus.py:507`) clamps to the + **construction-time circular radius**, not the deformed surface — so receded + feet aren't recognised as outside and aren't clamped. + +Confirmed empirically (`~/+Simulations/fs_convection_goal4/smoke_geomclamp.py`): +a geometry-aware clamp **delays** the blow-up ~2 steps and halves the early +overshoot but does **not** cure it — because the departure *value* is wrong, not +just the geometry. Clamp/monotone band-aids are dead ends; `monotone` does run on +the FE path (`functions_unit_system.py:854`) but fails via neighbour-corruption +feedback once a speckle seeds. + +## Target architecture: lagged rolling-clone + +Keep a **frozen clone of the previous-step mesh** (geometry + the primitive +fields needed for history). The SL trace-back samples *that* clone: +`global_evaluate(history_expr_on_clone, departure_coords)`. The departure point +sits in the receded layer, which the **old** mesh covers — so the sample is a +real interpolated value: no clamp, no extrapolation, no speckle. This is the +*exact* computation the `v_mesh`/CARRY apparatus only approximates. + +### Invariants + +1. **One clone, any scheme order.** Every trace-back is a single-Δt reach from + the current mesh to the previous-step mesh. BDF2 reaches 2Δt as two composed + single-Δt traces across successive steps (`psi_star[1]` is last step's + already-traced `psi_star[0]`). History *depth* lives in the **stack of + fields**, not in a stack of meshes. The clone is a **rolling** one-step-behind + copy, refreshed at the start of each step. +2. **Shared partition.** The clone must share topology *and* the exact parallel + partition / local numbering (PETSc `DMClone`-style: shared DMPlex + `PetscSF`, + **independent** coordinate vector). Then refreshing the clone's geometry each + step is a **local memcpy of coordinates, no global scatter**. (Sampling at + launch points is `global_evaluate` — parallel-routed and partition-agnostic; + shared partition lets it reuse routing.) +3. **The clone stores primitives, not derived tensors.** Store `T` (and `V`, and + stress for VE) at the lagged time; **recompute** gradients / fluxes / stress + at full FE accuracy on the clone. This retires the lossy projected histories + (`DFDt` flux history, the `psi_star_flat` tensor view, the VE stress + projection) and their projection diffusion. +4. **Mesh-homogeneous history; the mixing guard defines the clone set.** Each + history term is an expression evaluated **entirely on the clone**; the only + thing crossing the mesh boundary is a value **array at coordinates**, never a + symbolic expression spanning two DMs. The evaluator's existing + refuse-mixed-mesh guard is the *enforcement*: if a history build trips it, + that names a current-mesh field that must be cloned to the lagged mesh. Time- + splitting (BDF/AM) already separates terms by time level, so "no mixed mesh" + and "no mixed time level in one expression" become the same discipline. +5. **Split the implicit `self.mesh`.** Today `update_pre_solve` assumes one mesh + everywhere (`get_closest_cells`, `_centroids`, `return_coords_to_bounds`, the + `global_evaluate` calls). The refactor makes two roles explicit: + *foot computation* (where did the material come from) on the **current** mesh; + *history sampling* (value/flux there) on the **lagged** clone. Every + evaluation takes its mesh **from the variable/expression**, not a default. +6. **Unifies free-surface and adaptation.** The lagged clone is the SL reference + regardless of *how* nodes moved (free surface, mmpde, OT). Same code path — + the original goal of this work. + +## Phase 1 — mixed-mesh expression evaluation (implement first) + +The enabling primitive, and the efficiency-critical piece. Two implementation +strategies; Phase 1 likely needs both, chosen per term: + +- **(a) Clone-the-primitives** → expression is homogeneous on the clone. Simple + per-eval; costs memory + per-step field copy. Good default for history terms. +- **(b) True mixed-mesh evaluate** → evaluate each meshVariable on *its own* DM + at the shared physical coordinates, then combine the per-mesh arrays pointwise. + No field cloning; the harder/efficiency-critical path. + +### Semantics of (b) + +For an expression `E` whose meshVariable atoms belong to meshes `{M_a, M_b, …}`, +at physical points `X`: value at `x_k` = `E(var_a(x_k|M_a), var_b(x_k|M_b), …)`. +Pointwise; derivatives are local to each variable's own mesh, so they're fine. + +### Mechanism + +1. Partition the (unwrapped) expression's atoms by owning mesh. +2. For each mesh `M_i`, point-locate `X` and interpolate the needed + variable/derivative components → array `A_i`. +3. Substitute the `A_i` into the symbolic combination and evaluate pointwise + (existing numpy/lambdify path). + +### Efficiency (the crux) + +- **Per-mesh location cache.** Point location dominates. Cache per mesh, keyed on + coord-hash (cf. `_dminterpolation_cache`, see + [data-access](../subsystems/data-access.md)); reuse across all history levels + evaluated at the same `X` within one step. +- **Shared-partition fast path.** When `M_b` is a partition-sharing clone of + `M_a`, a point owned by rank *r* on `M_a` is owned by rank *r* on `M_b` — one + routing, two local interpolations; no second scatter. This is *why* the clone + shares the partition. +- **Reuse the locator across the BDF stack** (all `psi_star[i]` share `X`). + +### API + +- Keep refuse-mixed-mesh as the **default** (accidental mixing still caught). +- Add an **explicit opt-in** (e.g. `evaluate(..., allow_mixed_mesh=True)` or a + dedicated `multi_mesh_evaluate`) for intentional history evaluation. +- Caveat to verify up front: `global_evaluate` of **derivative / vector** + expressions on a *non-current* mesh has a known wrinkle on some branches + (`scalar_component` mismatch). The gradient-recompute path depends entirely on + this — confirm/fix it as the first Phase-1 task. + +### Tests + +Correctness vs per-mesh manual eval; the shared-partition fast path +(bit-identical, reduced routing); parallel (np>1); derivative/vector exprs across +meshes. + +## Phase 2 — lagged rolling-clone SL history + +1. **Clone primitive:** `DMClone`-with-shared-partition + independent coords; + rolling refresh (local coord + field copy at step start). Reuse the in-memory + snapshot toolkit if it already provides same-partition state hold. +2. **DDt refactor:** split foot-mesh (current) vs history-mesh (clone); sample + history via Phase-1 evaluation; recompute fluxes/gradients homogeneously on + the clone. +3. **Retire** the `CARRY` + `v_mesh`-pulse + `i=0`-gate machinery for the trace- + back (scope what else still depends on `CARRY` before deleting). +4. **Validate:** the free-surface smoke test reaches approximate equilibrium + (T∈[0,1], no speckle); adaptation regression unchanged; bit-identical on a + non-moving mesh. + +## Open questions / risks + +- Derivative/vector `global_evaluate` on a non-current mesh (Phase-1 task 1). +- What else depends on `CARRY` / the ALE pulse (swarm advection? VE?). +- Memory/perf of the rolling clone at scale; locator rebuild vs reuse. +- VE: the stress history needs lagged `V` (and `∇V`) on the clone — confirm the + clone primitive set covers the constitutive expression's full free-variable + set (the mixing guard enumerates it). + +## Relationship to landed work + +- **PR #246** (capability gate + `Mesh.deform` + `ephemeral_coords` + SL-field + `CARRY` auto-stamp) made coordinate mutation foolproof and proved the failure + is *not* transfer-incoherence — it set up this design. The `CARRY` auto-stamp + is part of the apparatus this design will eventually retire. +- The **big question** this unblocks: SL height field vs full node movement + + mmpde. The lagged-clone approach makes SL correct at a moving boundary; the + comparison is then about cost/robustness, not correctness. diff --git a/tests/test_0826_deform_capability_gate.py b/tests/test_0826_deform_capability_gate.py new file mode 100644 index 00000000..a5cbf84b --- /dev/null +++ b/tests/test_0826_deform_capability_gate.py @@ -0,0 +1,75 @@ +"""Regression test for the mesh-coordinate-mutation capability gate (PR #246). + +`Mesh._deform_mesh` is the raw internal primitive: it moves nodes WITHOUT +transferring fields or SL/DDt history. On a *live* mesh (one carrying variables +or remesh hooks) a bare call corrupts the solution, so it is gated: it must be +reached through a sanctioned path — + + * ``mesh.deform(new_coords, dt=...)`` — arbitrary displacement (transfer-aware) + * ``with mesh._coord_mutation(): mesh._deform_mesh(...)`` — trusted internal + * ``with mesh.ephemeral_coords(): ...`` — trial deform, restored on exit + +This test LOCKS that behaviour so the user-facing protection can't silently +regress. See docs/developer/design/REMESH_FIELD_TRANSFER_DESIGN.md and +docs/developer/design/lagged-clone-sl-history.md. +""" + +import numpy as np +import pytest + +import underworld3 as uw + +pytestmark = [pytest.mark.level_1, pytest.mark.tier_a] + + +def _live_mesh(): + """A small mesh carrying a variable (so the gate is armed).""" + mesh = uw.meshing.StructuredQuadBox( + elementRes=(6, 6), minCoords=(0.0, 0.0), maxCoords=(1.0, 1.0)) + s = uw.discretisation.MeshVariable("S_gate", mesh, 1, degree=1, continuous=True) + s.data[:, 0] = 1.0 + return mesh, s + + +def _perturbed(mesh): + coords = np.asarray(mesh.X.coords) + p = coords.copy() + p[:, 0] += 1.0e-3 * coords[:, 1] + return p + + +def test_bare_deform_mesh_on_live_mesh_raises(): + """The gate fires: a bare _deform_mesh on a live mesh raises, naming the + sanctioned public entry point.""" + mesh, _ = _live_mesh() + with pytest.raises(RuntimeError, match="deform"): + mesh._deform_mesh(_perturbed(mesh)) + + +def test_public_deform_succeeds_and_moves_nodes(): + """mesh.deform(...) is the sanctioned path and actually moves the nodes.""" + mesh, _ = _live_mesh() + before = np.asarray(mesh.X.coords).copy() + mesh.deform(_perturbed(mesh), dt=1.0) + after = np.asarray(mesh.X.coords) + assert not np.allclose(before, after), "deform() should move the coordinates" + + +def test_coord_mutation_scope_sanctions_direct_call(): + """The internal _coord_mutation() scope lets a deliberate direct call pass + the gate.""" + mesh, _ = _live_mesh() + with mesh._coord_mutation(): + mesh._deform_mesh(_perturbed(mesh)) # must not raise + + +def test_ephemeral_coords_restores_on_exit(): + """ephemeral_coords() allows a trial deform and restores coords on exit.""" + mesh, _ = _live_mesh() + before = np.asarray(mesh.X.coords).copy() + with mesh.ephemeral_coords(): + mesh._deform_mesh(_perturbed(mesh)) + moved = np.asarray(mesh.X.coords) + assert not np.allclose(before, moved), "trial deform should move coords" + after = np.asarray(mesh.X.coords) + assert np.allclose(before, after), "ephemeral_coords must restore coords on exit"