Closed
16 commits
d51cda8
fix(variables): broadcast and order pandas/DataArray bounds in coords
FBumann May 23, 2026
a246006
docs(variables): frame add_variables coords as source of truth
FBumann May 23, 2026
aa0c80d
docs: frame bounds fix as extending 0.7.0's coords-as-truth fix
FBumann May 23, 2026
4ddc3c2
docs: reword as "extend and finalize", emphasize hardening
FBumann May 23, 2026
5557a9f
docs: rephrase as "0.7.0 made ... this release closes the two remaini…
FBumann May 23, 2026
cdc987b
docs: spell out dims/order/values in coords-as-truth bullet
FBumann May 23, 2026
001d071
test(variables): cover pandas MultiIndex bounds and dim reindex
FBumann May 23, 2026
bca89e7
refactor: move as_dataarray_in_coords to common.py
FBumann May 23, 2026
b28f3df
refactor(common): simplify _named_pandas_to_dataarray + cover edge br…
FBumann May 24, 2026
9b4d7cc
fix(common): only accept string axis names in _named_pandas_to_dataarray
FBumann May 24, 2026
7705156
fix(common): align positional inputs to coords, with clear shape errors
FBumann May 24, 2026
26f3e73
fix(sos): use var.indexes[d] for reformulated bounds; widen _coords_t…
FBumann May 24, 2026
095b510
fix(common): tighten _coords_to_dict to raise on non-pd.Index entries
FBumann May 24, 2026
68c4e09
fix(common): proper MultiIndex support in coords helpers (#729)
FabianHofmann May 27, 2026
a3d6f59
fix: apply coords-as-truth rule to mask in add_variables/add_constrai…
FBumann May 27, 2026
48de61b
refactor: unify as_dataarray; split broadcasting from coords validati…
FBumann May 27, 2026
11 changes: 11 additions & 0 deletions doc/release_notes.rst
@@ -52,6 +52,9 @@ Most users should keep calling ``model.solve(...)``. If you want more control, y

**Bug Fixes**

* ``Model.add_variables``: 0.7.0 made ``coords`` (dims, order, and values) the source of truth for ``DataArray`` bounds; this release closes the two remaining gaps. Pandas ``Series`` / ``DataFrame`` bounds missing a dimension are broadcast to ``coords`` instead of being silently dropped (`#709 <https://github.com/PyPSA/linopy/issues/709>`__), and the variable's dimension order always follows ``coords`` regardless of bound type (`#706 <https://github.com/PyPSA/linopy/issues/706>`__).
* ``add_variables`` / ``add_constraints``: the same rule now applies to ``mask`` — pandas ``Series`` / ``DataFrame`` masks missing a dimension are broadcast to the variable/constraint shape. As previously announced via ``FutureWarning``, masks whose coordinates are a sparse subset of the data's coordinates now raise ``ValueError`` rather than silently filling missing entries with ``False``; masks with dimensions not in the data raise ``ValueError`` instead of ``AssertionError``.
* ``add_piecewise_formulation`` now produces a reproducible dimension order in the broadcast breakpoint array. The previous set-based expansion gave a hash-randomized order that varied between processes.
* SOS constraints on masked variables no longer cause solver-specific failures (Gurobi ``IndexError``, Xpress ``?404 Invalid column number``, LP parse errors, silent set corruption). ``Model.solve()`` and ``Model.to_file()`` now raise a clear ``NotImplementedError`` referring users to `#688 <https://github.com/PyPSA/linopy/issues/688>`__; pass ``reformulate_sos=True`` as a workaround.
* ``Model.solve(..., reformulate_sos=True)`` now actually reformulates SOS constraints even when the solver supports them natively. Previously the flag was ignored in that case, with only a warning.
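
The broadcast rule described in the first bullet can be illustrated without linopy. The following is only a sketch in plain pandas/xarray of what "broadcast a bound to ``coords`` and follow ``coords`` order" means; the variable names and values are made up for illustration, and this is not the code path linopy uses.

```python
import pandas as pd
import xarray as xr

# Hypothetical coords of a variable with dims ("time", "node"), in that order.
coords = {"time": [0, 1, 2], "node": ["a", "b"]}

# A lower bound that only carries the "node" dimension.
lower = pd.Series([10.0, 20.0], index=pd.Index(["a", "b"], name="node"))

da = xr.DataArray(lower)  # dims: ("node",), taken from the index name

# Expand every dim that coords declares but the bound lacks,
# then transpose so the dim order follows coords.
missing = {dim: values for dim, values in coords.items() if dim not in da.dims}
da = da.expand_dims(missing).transpose(*coords)

print(da.dims, da.shape)
```

Every ``time`` entry receives the same per-node bound, which is the behaviour the note describes for bounds that are missing a dimension.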

@@ -64,6 +67,14 @@ Most users should keep calling ``model.solve(...)``. If you want more control, y

**Internal**

* ``linopy.common.as_dataarray`` is now the single broadcasting primitive;
  strict subset-dim / coord-value checks live in
  ``validate_alignment`` (via ``align_to_coords`` in
  ``add_variables`` / ``add_constraints``). Validation errors name the
  argument (``lower bound``, ``upper bound``, ``mask``) and explain whether
  dimensions or coordinate values disagree with ``coords``. When ``coords`` is
  a mapping, extra keys beyond the positional ``dims`` are broadcast in rather
  than dropped.
* Each ``Solver`` subclass now overrides at most three hooks: ``_build_direct`` (build the native model), ``_run_direct`` (run it), and ``_run_file`` (run the solver on an LP/MPS file). File-only solvers (CBC, GLPK, CPLEX, SCIP, Knitro, COPT, MindOpt) only override ``_run_file``.
* New ``ConstraintLabelIndex`` cached on ``Model.constraints`` (mirrors the existing ``Variables.label_index``); ``ConstraintBase`` gains ``active_labels()`` and a ``range`` property; ``CSRConstraint`` exposes ``coords``.
* ``linopy.common`` gains ``values_to_lookup_array``; the legacy pandas-based helpers ``series_to_lookup_array`` and ``lookup_vals`` are removed.
338 changes: 320 additions & 18 deletions linopy/common.py
@@ -9,7 +9,7 @@

 import operator
 import os
-from collections.abc import Callable, Generator, Hashable, Iterable, Sequence
+from collections.abc import Callable, Generator, Hashable, Iterable, Mapping, Sequence
 from functools import cached_property, partial, reduce, wraps
 from pathlib import Path
 from typing import TYPE_CHECKING, Any, Generic, TypeVar, overload
@@ -23,6 +23,7 @@
 from xarray import Coordinates, DataArray, Dataset, apply_ufunc, broadcast
 from xarray import align as xr_align
 from xarray.core import dtypes, indexing
+from xarray.core.coordinates import CoordinateValidationError
 from xarray.core.types import JoinOptions, T_Alignable
 from xarray.namedarray.utils import is_dict_like

@@ -213,30 +214,20 @@ def numpy_to_dataarray(
     return DataArray(arr, coords=coords, dims=dims, **kwargs)


-def as_dataarray(
+def _as_dataarray_lax(
     arr: Any,
     coords: CoordsLike | None = None,
     dims: DimsLike | None = None,
     **kwargs: Any,
 ) -> DataArray:
     """
-    Convert an object to a DataArray.
+    Type-dispatched DataArray conversion without any coords validation.

-    Parameters
-    ----------
-    arr:
-        The input object.
-    coords (Union[dict, list, None]):
-        The coordinates for the DataArray. If None, default coordinates will be used.
-    dims (Union[list, None]):
-        The dimensions for the DataArray. If None, the dimensions will be automatically generated.
-    **kwargs:
-        Additional keyword arguments to be passed to the DataArray constructor.
-
-    Returns
-    -------
-    DataArray:
-        The converted DataArray.
+    This is the conversion primitive used by ``as_dataarray``: it picks the
+    right constructor for each supported input type but does not check the
+    result against ``coords``. Callers that need ``coords`` to govern the
+    output (dim order, shared-dim values, missing-dim expansion) should use
+    ``as_dataarray`` instead.
     """
     if isinstance(arr, pd.Series | pd.DataFrame):
         arr = pandas_to_dataarray(arr, coords=coords, dims=dims, **kwargs)
@@ -275,6 +266,317 @@ def as_dataarray(
     return arr


+def as_dataarray(
+    arr: Any,
+    coords: CoordsLike | None = None,
+    dims: DimsLike | None = None,
+    **kwargs: Any,
+) -> DataArray:
+    """
+    Convert ``arr`` to a DataArray and broadcast it against ``coords``.
+
+    When ``coords`` carries named dimensions, the result is aligned with
+    those coords:
+
+    - positional inputs (numpy, polars, unnamed pandas, scalar) are labeled
+      with the coord dim names by position;
+    - for every dim shared between ``arr`` and ``coords``, same-values-
+      different-order coordinates are reindexed to ``coords`` order;
+    - dims present in ``coords`` but not in ``arr`` are expanded to the
+      ``coords`` shape;
+    - the result is transposed to ``coords`` order.
+
+    Dimensions present in ``arr`` but not in ``coords`` are preserved so
+    standard xarray broadcasting keeps working. Disagreeing coord values
+    on a shared dim (i.e. value sets that are not equal as sets) are
+    passed through unchanged: downstream xarray alignment decides how to
+    combine them. To enforce that ``arr.dims`` ⊆ ``coords.dims`` and that
+    shared coord values match, use ``validate_alignment`` (called
+    automatically for ``lower``, ``upper``, and ``mask`` in
+    :meth:`~linopy.model.Model.add_variables` and for ``mask`` in
+    :meth:`~linopy.model.Model.add_constraints`).
+
+    Parameters
+    ----------
+    arr
+        Input scalar / list / numpy / polars / pandas / DataArray.
+    coords
+        Mapping of dim name → coord values, or a sequence of ``pd.Index``
+        / unnamed sequences. ``None`` falls back to xarray's default
+        labeling (no broadcasting).
+    dims
+        Optional dim-names hint, used for positional inputs and to bias
+        pandas-axis interpretation.
+    **kwargs
+        Forwarded to the underlying DataArray construction.
+
+    Returns
+    -------
+    DataArray
+        Broadcast against ``coords`` (extra dims preserved).
+    """
+    if coords is None:
+        return _as_dataarray_lax(arr, coords, dims, **kwargs)
+
+    expected = _coords_to_dict(coords, dims=dims)
+    if not expected:
+        return _as_dataarray_lax(arr, coords, dims, **kwargs)
+
+    if isinstance(arr, pd.Series | pd.DataFrame):
+        converted = _named_pandas_to_dataarray(arr)
+        if converted is not None:
+            arr = converted
+
+    if not isinstance(arr, DataArray):
+        # numpy/polars/unnamed-pandas inputs are positional — their only
+        # meaningful information is the values; any axis labels are
+        # auto-generated. Default dims to coords' keys so the lax conversion
+        # labels axes correctly (instead of dim_0/dim_1), then re-assign
+        # coords from expected so positional inputs align to coords by
+        # position. A shape mismatch surfaces here as a clear xarray
+        # "conflicting sizes" error rather than a confusing
+        # "coordinates do not match" further down.
+        if dims is None:
+            dims = list(expected)
+        arr = _as_dataarray_lax(arr, coords, dims=dims, **kwargs)
+        # Skip MultiIndex dims — re-assigning a PandasMultiIndex coord emits
+        # a FutureWarning and isn't needed (the lax pass already used it).
+        arr = arr.assign_coords(
+            {
+                d: expected[d]
+                for d in arr.dims
+                if d in expected and not isinstance(arr.indexes.get(d), pd.MultiIndex)
+            }
+        )
+
+    for dim, coord_values in expected.items():
+        if dim not in arr.dims:
+            continue
+        if isinstance(arr.indexes.get(dim), pd.MultiIndex):
+            continue
+        expected_idx = (
+            coord_values
+            if isinstance(coord_values, pd.Index)
+            else pd.Index(coord_values)
+        )
+        actual_idx = arr.coords[dim].to_index()
+        if actual_idx.equals(expected_idx):
+            continue
+        # Same values, different order → reindex to match expected order.
+        # Different value sets are left alone: downstream xarray alignment
+        # (e.g. xr.align in arithmetic) handles them. Callers needing strict
+        # value matching (add_variables / add_constraints) should use
+        # ``validate_alignment`` after this call.
+        if len(actual_idx) == len(expected_idx) and set(actual_idx) == set(
+            expected_idx
+        ):
+            arr = arr.reindex({dim: expected_idx})
+
+    # expand_dims prepends new dimensions and their coordinate variables;
+    # the subsequent transpose restores coords order. Both are no-ops when
+    # the array already matches. Reconstruct so the DataArray's coords
+    # iteration order also follows coords (a Dataset built from this picks
+    # up its dim order from coord insertion).
+    expand = {k: v for k, v in expected.items() if k not in arr.dims}
+    if expand:
+        arr = arr.expand_dims(expand)
+
+    target_dims = tuple(d for d in expected if d in arr.dims) + tuple(
+        d for d in arr.dims if d not in expected
+    )
+    arr = arr.transpose(*target_dims)
+
+    coord_order = [c for c in target_dims if c in arr.coords] + [
+        c for c in arr.coords if c not in target_dims
+    ]
+    if list(arr.coords) != coord_order:
+        arr = DataArray(
+            arr.variable,
+            coords={c: arr.coords[c] for c in coord_order},
+            name=arr.name,
+        )
+
+    return arr
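
The reindex and transpose steps in ``as_dataarray`` can be tried in isolation. The sketch below uses a hypothetical helper, ``reorder_like``, written only for illustration; it copies the "same labels, different order" condition and the "coords dims first, extra dims after" ordering, and leaves out the pandas, MultiIndex, and expand_dims handling.

```python
import pandas as pd
import xarray as xr


def reorder_like(arr: xr.DataArray, expected: dict) -> xr.DataArray:
    """Hypothetical helper for illustration, not the linopy implementation."""
    for dim, values in expected.items():
        if dim not in arr.dims:
            continue
        want = pd.Index(values)
        have = arr.coords[dim].to_index()
        # Reindex only when the labels are the same set in a different order.
        if not have.equals(want) and len(have) == len(want) and set(have) == set(want):
            arr = arr.reindex({dim: want})
    # Dims named in `expected` come first; extra dims keep their relative order.
    target = [d for d in expected if d in arr.dims]
    target += [d for d in arr.dims if d not in expected]
    return arr.transpose(*target)


arr = xr.DataArray(
    [[1, 2, 3], [4, 5, 6]],
    dims=("scenario", "x"),
    coords={"scenario": ["s1", "s2"], "x": ["c", "a", "b"]},
)

# Same labels in another order: "x" is reordered and moved to the front.
out = reorder_like(arr, {"x": ["a", "b", "c"]})

# A different label set is passed through untouched (only transposed).
untouched = reorder_like(arr, {"x": ["a", "b", "z"]})

print(out.dims, list(out.coords["x"].values))
```

Leaving mismatched label sets alone is what keeps this step lax: rejecting them is the job of the separate validation function.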


+def validate_alignment(
+    arr: DataArray,
+    coords: CoordsLike | None,
+    dims: DimsLike | None = None,
+    *,
+    label: str | None = None,
+) -> None:
+    """
+    Raise ``ValueError`` if ``arr`` is incompatible with ``coords``.
+
+    ``arr`` is compatible with ``coords`` when both of the following hold:
+
+    - every dim in ``arr.dims`` is also a dim in ``coords`` (no extras);
+    - for every dim shared between ``arr`` and ``coords``, the coord
+      values are equal.
+
+    ``dims`` mirrors the ``dims`` argument of ``as_dataarray``: it names
+    unnamed entries in a sequence-form ``coords`` by position, so
+    ``coords=[[1, 2, 3]], dims=["x"]`` is enforced the same way as
+    ``coords={"x": [1, 2, 3]}``.
+
+    ``label`` names the argument in error messages (e.g. ``"lower bound"``).
+
+    No-op when ``coords`` is ``None`` or carries no named dimensions.
+    """
+    if coords is None:
+        return
+    expected = _coords_to_dict(coords, dims=dims)
+    if not expected:
+        return
+    subject = label or "Value"
+    expected_dims = set(expected)
+    extra = set(arr.dims) - expected_dims
+    if extra:
+        raise ValueError(
+            f"{subject} has dimension(s) {sorted(extra)} not declared in coords "
+            f"({sorted(expected_dims)}). Add them to coords or remove them from "
+            f"{subject.lower()}."
+        )
+    for dim, coord_values in expected.items():
+        if dim not in arr.dims:
+            continue
+        expected_is_mi = isinstance(coord_values, pd.MultiIndex)
+        actual_is_mi = isinstance(arr.indexes.get(dim), pd.MultiIndex)
+        if expected_is_mi or actual_is_mi:
+            if expected_is_mi and actual_is_mi:
+                if not arr.indexes[dim].equals(coord_values):
+                    raise ValueError(
+                        f"{subject}: MultiIndex for dimension {dim!r} does not "
+                        f"match coords."
+                    )
+            continue
+        expected_idx = (
+            coord_values
+            if isinstance(coord_values, pd.Index)
+            else pd.Index(coord_values)
+        )
+        actual_idx = arr.coords[dim].to_index()
+        if not actual_idx.equals(expected_idx):
+            raise ValueError(
+                f"{subject}: coordinate values for dimension {dim!r} do not match "
+                f"coords — expected {expected_idx.tolist()}, got "
+                f"{actual_idx.tolist()}."
+            )
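
To make the two failure modes concrete, here is a stripped-down sketch of the same checks. ``check_against_coords`` is a hypothetical stand-in, not ``validate_alignment`` itself: it assumes a plain dict of coords and ignores the MultiIndex branch.

```python
import pandas as pd
import xarray as xr


def check_against_coords(arr: xr.DataArray, expected: dict, label: str = "Value") -> None:
    """Hypothetical stand-in for the strict check; plain dict coords only."""
    extra = set(arr.dims) - set(expected)
    if extra:
        raise ValueError(
            f"{label} has dimension(s) {sorted(extra)} not declared in coords"
        )
    for dim in arr.dims:
        actual = arr.coords[dim].to_index()
        if not actual.equals(pd.Index(expected[dim])):
            raise ValueError(
                f"{label}: coordinate values for dimension {dim!r} do not match coords"
            )


expected = {"x": [1, 2, 3]}

ok = xr.DataArray([0, 0, 0], dims="x", coords={"x": [1, 2, 3]})
check_against_coords(ok, expected, label="mask")  # passes silently

# A sparse subset of the coords, and an array with an undeclared dim.
sparse = xr.DataArray([0, 0], dims="x", coords={"x": [1, 2]})
extra_dim = xr.DataArray(
    [[0], [0], [0]], dims=("x", "y"), coords={"x": [1, 2, 3], "y": ["p"]}
)

errors = []
for bad in (sparse, extra_dim):
    try:
        check_against_coords(bad, expected, label="mask")
    except ValueError as err:
        errors.append(str(err))

print(errors)
```

The sparse case corresponds to the mask behaviour that the release notes describe as now raising ``ValueError``.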


+def align_to_coords(
+    value: Any,
+    coords: CoordsLike | None,
+    *,
+    label: str,
+    **kwargs: Any,
+) -> DataArray:
+    """
+    Convert ``value`` with :func:`as_dataarray` and enforce the coords contract.
+
+    Used by :meth:`~linopy.model.Model.add_variables` for ``lower``, ``upper``,
+    and ``mask``, and by :meth:`~linopy.model.Model.add_constraints` for
+    ``mask``. Raises :class:`ValueError` or :class:`TypeError` with a message
+    that names ``label`` when conversion fails, and :class:`ValueError` when
+    validation fails.
+    """
+    try:
+        da = as_dataarray(value, coords, **kwargs)
+    except TypeError as err:
+        raise TypeError(f"{label} could not be aligned to coords: {err}") from err
+    except (ValueError, CoordinateValidationError) as err:
+        raise ValueError(f"{label} could not be aligned to coords: {err}") from err
+    validate_alignment(da, coords, dims=kwargs.get("dims"), label=label)
+    return da
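
The error-wrapping pattern used here (prefix the message with the argument label, keep the exception type, chain the original with ``from err``) can be shown with the standard library alone. ``with_label`` and the use of ``float`` as the converter are illustrative assumptions, not part of linopy.

```python
def with_label(label, convert, value):
    """Hypothetical wrapper: re-raise conversion errors with the argument name."""
    try:
        return convert(value)
    except TypeError as err:
        raise TypeError(f"{label} could not be aligned to coords: {err}") from err
    except ValueError as err:
        raise ValueError(f"{label} could not be aligned to coords: {err}") from err


try:
    with_label("lower bound", float, "not-a-number")
except ValueError as err:
    message = str(err)
    cause = err.__cause__  # the original ValueError raised by float()

try:
    with_label("mask", float, None)
except TypeError as err:
    type_message = str(err)

print(message)
```

Chaining keeps the underlying xarray or pandas message available in the traceback while the top-level message tells the user which argument to fix.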


+def _coords_to_dict(
+    coords: Sequence[Sequence | pd.Index] | Mapping,
+    dims: DimsLike | None = None,
+) -> dict[str, Any]:
+    """
+    Normalize coords to a dict mapping dim names to coordinate values.
+
+    For ``xarray.Coordinates`` (and ``DataArray.coords``), only entries
+    that are actual dimensions are kept; derived MultiIndex level coords
+    are dropped here and re-attached by xarray downstream. Plain mappings
+    are returned as-is. For sequence inputs, entries must be ``pd.Index``
+    (named or not) or unnamed sequences (``list`` / ``tuple`` / ``range``
+    / ``np.ndarray``). A ``pd.MultiIndex`` must have ``.name`` set —
+    xarray requires a single dimension name for the flattened index.
+    Other types — notably ``xarray.DataArray`` — raise ``TypeError``
+    rather than being silently dropped: callers should convert via
+    ``variable.indexes[<dim>]`` (or ``pd.Index(...)``) first.
+
+    Unnamed sequence entries (or unnamed ``pd.Index``) gain a dim name
+    from ``dims`` by position when ``dims`` is provided, so callers that
+    pass ``coords=[[1, 2, 3]], dims=["x"]`` get the same strict
+    enforcement as ``coords={"x": [1, 2, 3]}``.
+    """
+    if isinstance(coords, Coordinates):
+        # Coordinates iterates over every coord variable, including
+        # MultiIndex level coords. Keep only the entries that are dims.
+        return {d: coords[d] for d in coords.dims if d in coords}
+    if isinstance(coords, Mapping):
+        return dict(coords)
+    dim_names: list[Any] | None = None
+    if dims is not None:
+        dim_names = list(dims) if isinstance(dims, list | tuple) else [dims]
+    result: dict[str, Any] = {}
+    for i, c in enumerate(coords):
+        if isinstance(c, pd.MultiIndex):
+            if not c.name:
+                raise TypeError(
+                    "MultiIndex coords entries must have .name set so "
+                    "xarray can use it as the dimension name. Set it via "
+                    "`idx.name = 'my_dim'` before passing to coords."
+                )
+            result[c.name] = c
+        elif isinstance(c, pd.Index):
+            name = (
+                c.name
+                if c.name
+                else (dim_names[i] if dim_names and i < len(dim_names) else None)
+            )
+            if name is not None:
+                result[name] = c
+        elif isinstance(c, list | tuple | range | np.ndarray):
+            if dim_names and i < len(dim_names):
+                result[dim_names[i]] = pd.Index(c, name=dim_names[i])
+        else:
+            raise TypeError(
+                f"coords entries must be pd.Index or an unnamed sequence "
+                f"(list / tuple / range / numpy.ndarray); got "
+                f"{type(c).__name__}. For an xarray DataArray coord, pass "
+                f"`variable.indexes[<dim>]` (a pd.Index) instead."
+            )
+    return result
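
The naming rule for sequence-form coords (a named ``pd.Index`` keeps its own name, unnamed entries take the name at the same position in ``dims``, entries with no name at all are skipped) is easy to check with a reduced version. ``coords_to_dict`` below is a simplified, hypothetical rewrite that assumes ``dims`` is a list and skips the MultiIndex and type-error branches.

```python
import pandas as pd


def coords_to_dict(coords, dims=None):
    """Simplified sketch of the normalization; assumes `dims` is a list or None."""
    if isinstance(coords, dict):
        return dict(coords)
    names = list(dims) if dims is not None else []
    out = {}
    for i, entry in enumerate(coords):
        # A named pd.Index wins; otherwise fall back to dims by position.
        name = getattr(entry, "name", None) or (names[i] if i < len(names) else None)
        if name is not None:
            out[name] = pd.Index(entry, name=name)
    return out


from_sequence = coords_to_dict([[1, 2, 3]], dims=["x"])
from_mapping = coords_to_dict({"x": [1, 2, 3]})

# A named index keeps "t"; the list at position 1 takes dims[1].
mixed = coords_to_dict([pd.Index([1, 2], name="t"), ["a"]], dims=["x", "y"])

# Without dims, an unnamed list cannot be assigned a dimension and is skipped.
unnamed = coords_to_dict([[1, 2, 3]])

print(list(from_sequence), list(mixed), unnamed)
```

This is why ``coords=[[1, 2, 3]], dims=["x"]`` and ``coords={"x": [1, 2, 3]}`` end up under the same validation.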


+def _named_pandas_to_dataarray(arr: pd.Series | pd.DataFrame) -> DataArray | None:
+    """
+    Convert a pandas Series or DataFrame with fully named axes to a DataArray.
+
+    Returns ``None`` if any axis (or MultiIndex level) is unnamed or
+    non-string, so the caller can fall back to ``_as_dataarray_lax``.
+    """
+    names = list(arr.index.names)
+    if isinstance(arr, pd.DataFrame):
+        names += list(arr.columns.names)
+    if any(not isinstance(n, str) for n in names):
+        return None
+
+    if isinstance(arr, pd.DataFrame):
+        if isinstance(arr.index, pd.MultiIndex) or isinstance(
+            arr.columns, pd.MultiIndex
+        ):
+            arr = arr.stack(list(range(arr.columns.nlevels)), future_stack=True)
+            return arr.to_xarray()
+        return DataArray(arr)
+
+    return arr.to_xarray()
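
For reference, this is how named pandas axes turn into xarray dims with the two calls used above, ``Series.to_xarray()`` and ``DataArray(frame)``. The data and the ``all_axes_named`` helper are made up for illustration; the helper only mirrors the "every name is a string" test.

```python
import pandas as pd
import xarray as xr


def all_axes_named(obj) -> bool:
    """Hypothetical helper mirroring the 'every axis name is a string' test."""
    names = list(obj.index.names)
    if isinstance(obj, pd.DataFrame):
        names += list(obj.columns.names)
    return all(isinstance(n, str) for n in names)


# A Series with a named two-level MultiIndex unstacks into two dims.
idx = pd.MultiIndex.from_product([[0, 1], ["a", "b"]], names=["time", "node"])
series = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)
da_series = series.to_xarray()

# A DataFrame with named index and columns maps directly to two dims.
frame = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.Index([0, 1], name="time"),
    columns=pd.Index(["a", "b"], name="node"),
)
da_frame = xr.DataArray(frame)

# An unnamed Series has index.names == [None], so it counts as positional.
positional = pd.Series([1.0, 2.0])

print(da_series.dims, da_frame.dims, all_axes_named(positional))
```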


 def broadcast_mask(mask: DataArray, labels: DataArray) -> DataArray:
     """
     Broadcast a boolean mask to match the shape of labels.