refactor: unify as_dataarray; split broadcasting from coords validation#726
Merged
FBumann merged 8 commits intoMay 27, 2026
Merged
Conversation
…onstraints
Routes ``mask`` through ``as_dataarray_in_coords(mask, data.coords)``
instead of ``as_dataarray(...) + broadcast_mask(...)``, so pandas
``Series`` / ``DataFrame`` masks missing a dimension are broadcast
to the variable / constraint shape (parallel to the bounds fix in
the previous PR). The ``add_variables`` ``mask`` type hint widens
to ``MaskLike`` to match ``add_constraints``.
The deprecation announced via ``FutureWarning`` in ``broadcast_mask``
("Missing values will be filled with False ... In a future version,
this will raise an error") is now in effect: masks whose
coordinates are a sparse subset of the data's coordinates raise
``ValueError`` instead of silently filling missing entries.
Mask dims not in the data raise ``ValueError`` instead of
``AssertionError`` for consistency with the bounds path.
``broadcast_mask`` had no other callers and is removed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #723. Folds the body of `as_dataarray_in_coords` into `as_dataarray` and extracts the contract checks into `assert_compatible_with_coords`, so linopy now has one broadcasting primitive and one validation companion. `as_dataarray(arr, coords)` aligns the result against `coords` for every input type: labels positional inputs (numpy / unnamed pandas / scalar) by position, reindexes same-values-different-order, expands missing dims, and transposes to coords order. Extra dims and disagreeing value sets on shared dims pass through unchanged, so xarray broadcasting in expression arithmetic keeps working. `assert_compatible_with_coords(arr, coords)` enforces the strict contract (`arr.dims ⊆ coords.dims`, plus exact coord-value equality on shared dims). `add_variables` and `add_constraints` now call it after `as_dataarray` for `lower` / `upper` / `mask`, replacing the deleted `as_dataarray_in_coords` helper. `_coords_to_dict` filters MultiIndex level coords out of `xarray.Coordinates` inputs so the new strict-by-default path treats `station` (and not its derived `letter` / `num` levels) as the dim. Test suite: 3698 passed (no regressions). Two existing tests were updated to reflect the new "coords is source of truth" semantics: `test_as_dataarray_with_ndarray_coords_dict_set_dims_not_aligned` (extra coord entries now broadcast in) and `test_dataarray_extra_dims` (now triggers the subset check rather than the value-mismatch check). Microbenchmark in dev-scripts/benchmark_as_dataarray.py shows flat timings vs the base branch on both add_variables-heavy and arithmetic- heavy workloads. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes a silent-failure gap in the strict coords-as-truth path: when the
caller passed ``coords=[[1, 2, 3]], dims=["x"]`` to ``add_variables``,
``_coords_to_dict`` returned an empty mapping (unnamed sequences carry
no dim name), so the strict checks short-circuited and bounds with
extra dims or mismatched values flowed through unchecked, producing
variables with frankenstein outer-joined coord values.
``_coords_to_dict`` now accepts an optional ``dims`` argument that
names unnamed sequence entries by position. ``as_dataarray`` and
``assert_compatible_with_coords`` plumb it through; ``add_variables``
forwards ``kwargs.get("dims")`` to the assertions for ``lower`` and
``upper``. ``coords=[[1, 2, 3]], dims=["x"]`` now enforces the same
contract as ``coords={"x": [1, 2, 3]}`` or
``coords=[pd.Index([1, 2, 3], name="x")]``.
Docstring of ``add_variables.coords`` documents the contract
(subset-of-dims, dim order, value match with auto-reindex, missing-dim
broadcast) and includes four doctests pinning it: the extra-dim raise,
the value-mismatch raise, the same-values-different-order auto-reindex,
and the unnamed-coords-plus-dims opt-in.
Test suite: 3698 passed (parity with the previous commit on this
branch). ``pytest --doctest-modules linopy/model.py -k add_variables``
also green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce align_to_coords to wrap as_dataarray and assert_compatible_with_coords with user-facing labels (lower bound, upper bound, mask). Errors now name the argument and distinguish extra dimensions, coordinate mismatches, and conversion failures. Extend mask validation to use coords+dims= when provided. Co-authored-by: Cursor <cursoragent@cursor.com>
for more information, see https://pre-commit.ci
…coords Three cleanups on top of align_to_coords: - Drop the trailing ``.broadcast_like(data.labels)`` in ``add_variables`` and ``add_constraints`` mask paths. ``as_dataarray`` already expands missing dims to ``coords`` shape, so the broadcast was a no-op. - Stop overriding the caller's ``dims=`` in the ``add_variables`` mask path when ``coords is None``. The previous code stripped ``dims`` and forced ``dims=data.dims``; with ``data.coords`` being an xarray ``Coordinates`` with already-named dims, the user's ``dims`` is harmless to forward and the override was just hiding intent. Mask now goes through one ``align_to_coords`` call regardless of whether ``coords`` is supplied. - Split the exception handler in ``align_to_coords``: ``TypeError`` from unsupported input types is re-raised as ``TypeError`` (still labeled), while ``ValueError`` / ``CoordinateValidationError`` stay ``ValueError``. Preserves the original type signature for callers that want to ``except TypeError``. New test ``test_align_to_coords_preserves_type_errors`` pins the TypeError pass-through. Suite: 3703 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from
fix/mask-coords-broadcast
to
fix/bounds-coords-broadcast
May 27, 2026 08:57
Collaborator
|
@FBumann we have some conflicts here now, should I resolve them? I would still need to figure out where things branch off currently |
Collaborator
Author
|
I can |
FBumann
added a commit
that referenced
this pull request
May 27, 2026
…_validate_update Constraint - sanitize_zeros now writes _data via _assign_data directly (no longer round-trips through update(variables=DataArray), which would self-trigger the new deprecation warning). - Constraint.update(variables=...) emits FutureWarning when passed a raw DataArray of integer labels; Variable is the supported input. The path stays accepted for back-compat and will be removed alongside the v1 cleanup. Variable - Extract Variable._validate_update(*, lower, upper) — validates, broadcasts, and runs the cross-bound (lb<=ub) check, returning a dict ready for assignment. update() body shrinks to ~3 lines. Coord validation (parity with add_variables) deferred to #726 land. Tests - test_constraint_coef_dirty's variables= test now passes a Variable instead of a DataArray (matches the supported input). - test_constraint_vars_setter_with_array wrapped in pytest.warns(FutureWarning) — locks in the deprecation contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…o feat/unify-as-dataarray-coords # Conflicts: # doc/release_notes.rst # linopy/common.py # linopy/model.py # test/test_constraints.py # test/test_variables.py
FabianHofmann
pushed a commit
that referenced
this pull request
May 27, 2026
* feat: Variable.update() / Constraint.update() as canonical mutation API Introduces typed ``.update()`` methods on Variable and Constraint as the single, validated, multi-attribute mutation entry point. - ``Variable.update(lower=, upper=)``: validates non-constant inputs are rejected, new dims are rejected, and the resulting ``lower <= upper`` invariant holds across all coords. Returns ``self`` for chaining. - ``Constraint.update(rhs=, sign=)``: constant RHS only. The legacy ``c.rhs = variable`` rearrange-to-lhs path stays on the setter (different semantic, deserves its own explicit method). The existing ``.lower`` / ``.upper`` / ``.sign`` setters become thin shims that forward to ``.update()``, so single-attribute writes (``z.lower = 2``) stay ergonomic and the canonical validation runs in one place. The ``.rhs`` setter forwards constants through ``.update()`` and keeps the expression-rhs rearrange behaviour. This is the on-top experiment for the design discussion on #718. ``.lhs`` / ``.coeffs`` / ``.vars`` setters are untouched for now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(update): Constraint.update accepts Variable/Expression rhs Mirrors the existing ``c.rhs = expr`` setter and ``add_constraints`` which both accept mixed-side input and rearrange the residual onto lhs. ``c.update(rhs=x + 5)`` now subtracts ``x`` from lhs and stores ``5`` on rhs. ``.rhs`` setter collapses to a one-line shim. Variable bound rejection of Variable/Expression is kept (bounds are numeric, not symbolic); docstring clarified to spell out that pandas / xarray / numpy arrays are first-class (time-varying bounds). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(update): extend Constraint.update to lhs/coeffs/vars; shim all setters Adds lhs / coeffs / vars to the canonical mutation API. All .lhs / .coeffs / .vars setters now forward to .update() — every Constraint mutation goes through one method with one validation path, one place that flips _coef_dirty. Composition rules: - lhs= replaces the whole expression first; subsequent rhs= rearrangement (Variable/Expression in rhs) sees the new lhs. - lhs= and coeffs= / vars= are mutually exclusive (whole replacement vs partial array update). - sign= is applied last so it composes cleanly. Internal Constraint.sanitize_zeros migrated to update(vars=, coeffs=) — no more internal setter calls in linopy/. 389 tests pass across mutation + persistent-solver suite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(update): rename Constraint.update kwarg vars= -> variables= Avoids shadowing Python's vars() builtin. The .vars attribute on Constraint stays (it parallels the .data.vars internal name); only the kwarg gets the unambiguous spelling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(update): accept positional ConstraintLike in Constraint.update Mirrors add_constraints' dispatch: c.update(x + 5 <= 3) is now shorthand for c.update(lhs=x, sign='<=', rhs=-2), extracted from the AnonymousConstraint / ConstraintBase the comparison produces. Mutually exclusive with the per-attribute kwargs; clear error when mixed. Also reverts the internal sanitize_zeros migration. The setters are pure shims forwarding to update(), so the migration didn't change behaviour or cost — just spelling. The original setter syntax reads more naturally there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(update): note kwarg form is the targeted, cheap path The positional ConstraintLike form (c.update(x + 5 <= 3)) always rewrites lhs / sign / rhs and flips _coef_dirty. For hot loops that only touch one part, kwarg form (c.update(rhs=...)) skips the unchanged attributes and is materially cheaper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(persistent): default ModelDiff.from_snapshot(same_model=False) Closes the A1 residual from the #718 review. The flag-trust path (`skip_coef_compare = same_model and not coef_dirty`) is correct through Constraint.update() (set in one place, shims forward), but `c.coeffs.values[...] = ...` still bypasses _coef_dirty. With same_model=True as the default, that bypass silently produces wrong diffs. Flip the default to False. Cross-model paths (the only production caller, Solver._update_locked, passes explicitly) are unaffected. Same-model warm-update paths now value-diff the CSR data — small perf hit (50-200ms at Mayk-scale per Mayk's bench), correct by default. Solver-aware callers who own the mutation contract can opt back into the optimization with `same_model=True`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: teach .update() in tutorials; mark setters as syntactic sugar - examples/manipulating-models.ipynb: rewrite mutation cells to use Variable.update / Constraint.update; setter form is mentioned in notes as syntactic sugar for the same call. - examples/creating-constraints.ipynb: reframe the CSRConstraint vs Constraint API table around .update() as the mutation API; setters are sugar. - Setter docstrings now say 'syntactic sugar for Constraint/Variable .update; do not add logic here so the contract stays single-sourced' — a directive to future contributors as much as to readers. No deprecation, no breaking change. .update() is the documented canonical mutation API; the seven setters continue to exist as one-line shims. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * deprecate(update): warn on mutation setters; promote .update() in docs Adds DeprecationWarning to all seven mutation setters (Variable.lower, Variable.upper, Constraint.coeffs, Constraint.vars, Constraint.sign, Constraint.rhs, Constraint.lhs). Each setter still forwards to .update() so existing code keeps working; the warning points at the canonical API. Internal sanitize_zeros migrated off setters (the last linopy/ caller). api.rst gains Modification sections listing .update() for both Variable and Constraint; tutorial notes rewritten to teach .update() and flag setters as deprecated. Release note added. dual.setter / solution.setter untouched — result assignment, not mutation, different deprecation track. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(update): edge-case coverage; document rhs-rearrangement invariant Constraint.update tests: lhs-only, coeffs-only (vars preserved), compound lhs+sign, mutually-exclusive lhs+coeffs and lhs+variables. Variable.update tests: upper-only, valid array bound. Migrate test_constraint_coef_dirty.py from the now-deprecated setters to .update(), exercising the canonical path; add positional-form and no-op cases. Net effect: same dirty-flag invariants, 7 fewer warnings per pytest run. Docs: Constraint.update rhs= gains a worked example showing the two forms (constant vs variable/expression). add_constraints rhs gets a matching note pointing at the linopy invariant so the rearrangement rule is documented at the creation site too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * review(update): address inline feedback on #727 - Constraint._assign_lhs_expr → _assign_lhs (drop redundant suffix; the method already takes a LinearExpression, so the type was in the signature, not the name). - Add Constraint._assign_data(**fields) helper. Wraps the four ``self._data = assign_multiindex_safe(self.data, **kw)`` callsites inside update() (rhs / coeffs / vars / sign). Untouched: the same pattern in dual.setter, sanitize_missings, sanitize_infinities — those aren't update() and stay out of scope here. - Add types.CONSTANT_TYPES tuple, derived from ConstantLike via get_args so the two cannot drift. Variable.update bound validation flipped from negative (isinstance against a hand-rolled non_constant tuple) to positive (isinstance against CONSTANT_TYPES); drops a redundant in-function ``from linopy import expressions`` (the module-level import already covered it). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * deprecate(update): FutureWarning on DataArray as variables=; extract _validate_update Constraint - sanitize_zeros now writes _data via _assign_data directly (no longer round-trips through update(variables=DataArray), which would self-trigger the new deprecation warning). - Constraint.update(variables=...) emits FutureWarning when passed a raw DataArray of integer labels; Variable is the supported input. The path stays accepted for back-compat and will be removed alongside the v1 cleanup. Variable - Extract Variable._validate_update(*, lower, upper) — validates, broadcasts, and runs the cross-bound (lb<=ub) check, returning a dict ready for assignment. update() body shrinks to ~3 lines. Coord validation (parity with add_variables) deferred to #726 land. Tests - test_constraint_coef_dirty's variables= test now passes a Variable instead of a DataArray (matches the supported input). - test_constraint_vars_setter_with_array wrapped in pytest.warns(FutureWarning) — locks in the deprecation contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(constraints): _assign_data → _update_data; manage _coef_dirty inside The helper now writes fields AND flips _coef_dirty when the written set includes coeffs or vars. Callsites in update() (coeffs / vars branches) and sanitize_zeros drop their explicit `self._coef_dirty = True` lines — the rule lives in one place, can't be forgotten by future field additions. rhs / sign writes still don't dirty (correctly). _assign_lhs is untouched: it uses a different write mechanic (drop_vars + assign) and manages its own flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FabianHofmann
approved these changes
May 27, 2026
Collaborator
Author
|
@FabianHofmann I think #722 is kind of blocking here atm. Not hard, but will produce conflicts. |
Per PR review: align on the project's `validate_*` naming convention and remove the implicit "AssertionError" connotation of `assert_*`. Pairs naturally with `align_to_coords`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collaborator
Author
|
@FabianHofmann Ready for final review. EDIT: If you dont mind, i would merge this and give the final combined PR a proper review. |
FabianHofmann
approved these changes
May 27, 2026
Collaborator
already approved. sounds good to me |
This was referenced May 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #723. Stacked on #725 (which is stacked on #722).
What changes
as_dataarray_in_coordsis folded intoas_dataarrayand split along the seam between "broadcastarragainstcoords" and "enforce thecoordscontract":as_dataarray(arr, coords)— the broadcasting primitive. For every input type, the result is aligned withcoords: positional inputs (numpy / unnamed pandas / scalar) are labeled by position, shared-dim coords are reindexed when values are equal in a different order, dims present incoordsbut not inarrare expanded, and the result is transposed tocoordsorder. Extra dims and disagreeing value sets on shared dims pass through, so xarray broadcasting in expression arithmetic keeps working.assert_compatible_with_coords(arr, coords)— the validation companion. Raises ifarrintroduces dims not incoords(was:as_dataarray_in_coords's extra-dim raise) or if a shared dim has disagreeing coord values (was: its "do not match" raise).add_variablesandadd_constraintsnow callas_dataarrayfollowed byassert_compatible_with_coordsforlower/upper/mask. The previousas_dataarray_in_coordshelper is deleted._coords_to_dictnow filters MultiIndex level coords out ofxarray.Coordinatesinputs, so the new strict-by-default path treats e.g.station(and not its derivedletter/numlevels) as the dim. This was already a latent issue once strict semantics governedas_dataarray's coords arg.Audit summary (the 12 call sites listed in #723)
model.py:705/706lower/upperinadd_variablesas_dataarray+assert_compatible_with_coordsmodel.py:715maskinadd_variablesmodel.py:979maskinadd_constraintsexpressions.py:584/613/1105/1668/2154arithmeticvariables.py:330to_linexpr(coefficient)expressions.py:341/2002/2030/2289,model.py:912/919/972,variables.py:1369Breaking changes for end users
This PR introduces no new user-visible breaking changes. It's a structural refactor that keeps the strict contract at the
add_variables/add_constraintscall sites viaassert_compatible_with_coords. The breaking changes a user sees when this PR lands onmasterare the union of what #722 and #725 already deliver:From #725 (mask handling)
Falseand emitsFutureWarningValueErrormask.reindex({...}, fill_value=False)AssertionErrorValueErrorpd.Series/pd.DataFramemissing a dim was sometimes silently droppedFrom #722 (bounds handling)
Series/DataFramebound missing a dim is silently droppedDataArraybound dim order depends on bound type (#706)coordsorderDataArraybound with extra dim was silently acceptedValueErrorDataArraybound with mismatched coord values raised varied downstream errorsValueError: ...coordinates do not matchPiecewise._broadcast_pointshad hash-randomized dim order across processesRefactor-only changes (no user-visible effect on the strict sites)
as_dataarrayitself no longer raises on extra dims or shared-dim value-set mismatch — every public site that used to depend on those raises now gets them from the explicitassert_compatible_with_coordscall. Surface behavior atadd_variables/add_constraintsis preserved.as_dataarrayitself:test_as_dataarray_with_ndarray_coords_dict_set_dims_not_aligned: extra coord entries now expand into the result (the test was pinning an undocumented "dims wins, coord entry dropped" corner of the old lax path).test_dataarray_extra_dims: rewritten so the subset check fires (rather than the value-mismatch check) — the assertion still raises on extra dims, just via the newassert_compatible_with_coordshelper.Test plan
test/test_common.py— added five tests pinning the new split (extra-dim preservation, disjoint shared-dim values,assert_compatible_with_coordsextra-dim raise, value-mismatch raise, subset-dims allowed).pre-commit run --all-filesclean (ruff/format/blackdoc/codespell).dev-scripts/benchmark_as_dataarray.py, untracked): flat timings vs the base branch on both add_variables-heavy (≈22 ms / 50–57 ms mean) and arithmetic-heavy (≈80–82 ms) workloads.🤖 Generated with Claude Code