Skip to content

geotiff: first-class backend parity contract with a golden test corpus #1930

@brendancol

Description

@brendancol

Reason or Problem

The geotiff module spans many roles: TIFF parser, COG range client, VRT engine, CRS normaliser, CPU writer, streaming writer, GPU reader and writer, and a security boundary. The test count is high, which is good, but the number of backend-specific parity patches in recent PRs (and the per-finding comments referencing them) is a signal that behaviour drift between backends is a persistent risk.

The current safety net is implicit. Cross-backend parity is asserted only where a specific bug surfaced. There is no single contract that says "eager, dask, GPU, VRT, HTTP, and fsspec must all agree on pixels, transform, CRS, nodata, dtype, attrs."

Proposal

Make backend parity a first-class contract with a small golden corpus.

Design:
Curate a corpus of small TIFFs that cover the dimensions where drift has shown up: tiled vs stripped, planar vs chunky, byte order, dtype range (int/uint/float, native and big-endian, predictor=2/3), compression (none, deflate, lerc, jpeg, lzw), nodata sentinels (int + NaN + miniswhite), overviews, GDAL_METADATA, and CRS variants (EPSG, WKT, citation-only).

For each file, run it through every backend that supports it (eager, dask, GPU, VRT, HTTP, fsspec where applicable) and assert exact match against a rasterio / GDAL baseline on pixels, transform, CRS, nodata, dtype, and attrs.

Usage:
The corpus runs as part of the normal test suite. New backends or kwargs that change pixel-level output bake in a new baseline only when the change is intentional.

Value:
Today each parity bug surfaces from a different angle and is fixed in isolation. A golden corpus catches drift in a single place and makes accepting a new pixel-level change an explicit baseline update rather than an implicit one.

Drawbacks

  • Maintenance cost: baseline updates need review.
  • Corpus growth: keep it small (under 50 files) and put large fixtures behind a marker.
  • Requires rasterio / GDAL in CI for the comparison run.

Alternatives

  • Continue with per-bug parity tests. Lower up-front cost, higher long-tail cost.
  • Property-based / hypothesis-style parity. Useful but harder to debug when it fails.

Stakeholders and Impacts

Affects every backend. Owner is whoever lands the corpus. Impact is on CI run time (modest), and on the workflow for any future change that produces different pixels (must explicitly bless the baseline).

Additional Notes or Context

Surfaced by an external review of the geotiff module. Filed as a tracking issue, not a code change in the same review pass.

Metadata

Metadata

Assignees

No one assigned

    Labels

    QA/QCbackend-coverageAdding missing dask/cupy/dask+cupy backend supportenhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions