geotiff: first-class backend parity contract with a golden test corpus

## Reason or Problem
The geotiff module spans many roles: TIFF parser, COG range client, VRT engine, CRS normaliser, CPU writer, streaming writer, GPU reader and writer, and a security boundary. The test count is high, which is good, but the number of backend-specific parity patches in recent PRs (and the per-finding comments referencing them) is a signal that behaviour drift between backends is a persistent risk.

The current safety net is implicit. Cross-backend parity is asserted only where a specific bug surfaced. There is no single contract that says "eager, dask, GPU, VRT, HTTP, and fsspec must all agree on pixels, transform, CRS, nodata, dtype, attrs."

## Proposal
Make backend parity a first-class contract with a small golden corpus.

**Design:**
Curate a corpus of small TIFFs that cover the dimensions where drift has shown up: tiled vs stripped, planar vs chunky, byte order, dtype range (int/uint/float, native and big-endian, predictor=2/3), compression (none, deflate, lerc, jpeg, lzw), nodata sentinels (int + NaN + miniswhite), overviews, GDAL_METADATA, and CRS variants (EPSG, WKT, citation-only).

For each file, run it through every backend that supports it (eager, dask, GPU, VRT, HTTP, fsspec where applicable) and assert exact match against a `rasterio` / GDAL baseline on pixels, transform, CRS, nodata, dtype, and attrs.

**Usage:**
The corpus runs as part of the normal test suite. New backends or kwargs that change pixel-level output bake in a new baseline only when the change is intentional.

**Value:**
Today each parity bug surfaces from a different angle and is fixed in isolation. A golden corpus catches drift in a single place and makes accepting a new pixel-level change an explicit baseline update rather than an implicit one.

## Drawbacks
- Maintenance cost: baseline updates need review.
- Corpus growth: keep it small (under 50 files) and put large fixtures behind a marker.
- Requires rasterio / GDAL in CI for the comparison run.

## Alternatives
- Continue with per-bug parity tests. Lower up-front cost, higher long-tail cost.
- Property-based / hypothesis-style parity. Useful but harder to debug when it fails.

## Stakeholders and Impacts
Affects every backend. Owner is whoever lands the corpus. Impact is on CI run time (modest), and on the workflow for any future change that produces different pixels (must explicitly bless the baseline).

## Additional Notes or Context
Surfaced by an external review of the geotiff module. Filed as a tracking issue, not a code change in the same review pass.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

geotiff: first-class backend parity contract with a golden test corpus #1930

Reason or Problem

Proposal

Drawbacks

Alternatives

Stakeholders and Impacts

Additional Notes or Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

geotiff: first-class backend parity contract with a golden test corpus #1930

Description

Reason or Problem

Proposal

Drawbacks

Alternatives

Stakeholders and Impacts

Additional Notes or Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions