Stain normalization: foundations (PR 1 of 7) #1178
Open
timtreis wants to merge 1 commit into
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1178 +/- ##
==========================================
+ Coverage 73.82% 74.18% +0.36%
==========================================
Files 45 48 +3
Lines 7013 7128 +115
Branches 1188 1202 +14
==========================================
+ Hits 5177 5288 +111
- Misses 1349 1351 +2
- Partials 487 489 +2
Substrate for a new histopathology stain-normalization module: canonical Ruifrok H/E/DAB vectors, RGB <-> SDA and RGB <-> Ruderman Lab conversions on xr.DataArray, and a minimal StainReference dataclass holding either a 3x3 stain matrix or Reinhard channel statistics.

Lives under squidpy.experimental.im._stain with no public re-export and no SpatialData wiring; fit, apply, persistence, cohort, and augmentation land in follow-up PRs alongside their first consumers.

Color conversions stay lazy on dask-backed inputs and compile to a single fused apply_ufunc per chunk, so the same primitives serve test patches and whole-slide H&E without a rewrite.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
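For reviewers who want a concrete picture of the "single fused apply_ufunc per chunk" pattern, here is a minimal sketch. It is not the code in this PR: the names rgb_to_sda_sketch and _rgb_to_sda_block, the channel dimension name "c", and the HistomicsTK-style scale 255 / log(I_0 + 1) are all assumptions made for illustration. Only the (rgb + 1) / (I_0 + 1) ratio and the required shape-(3,) background_intensity come from the PR description.

```python
import numpy as np
import xarray as xr


def _rgb_to_sda_block(rgb: np.ndarray, background_intensity: np.ndarray) -> np.ndarray:
    """Per-chunk kernel. apply_ufunc moves the channel (core) dim to the last axis."""
    i0 = background_intensity.astype(np.float64) + 1.0
    # Offset ratio from the PR description: pixels at the white point give ratio == 1.
    ratio = (rgb.astype(np.float64) + 1.0) / i0
    # ASSUMPTION: HistomicsTK-style scale; the constant used in the PR may differ.
    return -np.log(ratio) * (255.0 / np.log(i0))


def rgb_to_sda_sketch(
    image: xr.DataArray,
    background_intensity: np.ndarray,
    channel_dim: str = "c",  # hypothetical dimension name
) -> xr.DataArray:
    """Hypothetical RGB -> SDA conversion that stays lazy on dask-backed input."""
    background_intensity = np.asarray(background_intensity, dtype=np.float64)
    if background_intensity.shape != (3,):
        raise ValueError("background_intensity must have shape (3,)")
    out = xr.apply_ufunc(
        _rgb_to_sda_block,
        image,
        kwargs={"background_intensity": background_intensity},
        input_core_dims=[[channel_dim]],
        output_core_dims=[[channel_dim]],
        # Only takes effect for dask-backed input; the channel dim must then
        # sit in a single chunk. Numpy-backed input runs the kernel eagerly.
        dask="parallelized",
        output_dtypes=[np.float64],
    )
    # Core dims come back last; restore the caller's dimension order.
    return out.transpose(*image.dims)
```

Calling rgb_to_sda_sketch(img, np.array([240.0, 235.0, 245.0])) on a (c, y, x) image returns an array with the same dims; a dask-backed img yields a dask-backed result that is only computed on request.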
Summary
Substrate for a new histopathology stain-normalization module. Targets Squidpy 2.0. No public API surface in this PR: every new symbol lives under the private squidpy.experimental.im._stain namespace and there are no re-exports.

What ships:
- _constants.py: Ruifrok H/E/DAB canonical stain vectors (derived from skimage.color.rgb_from_hed), Ruderman RGB <-> LMS <-> Lab matrices, HistomicsTK-compatible SDA scale. Nothing else; no shared "default background", no luminosity threshold, no schema version.
- _conversion.py: rgb_to_sda / sda_to_rgb and rgb_to_lab_ruderman / lab_ruderman_to_rgb on xr.DataArray. Each public function compiles to a single fused apply_ufunc per chunk; numpy- and dask-backed inputs share the same code path, and dask-backed inputs stay lazy end-to-end. background_intensity is a required np.ndarray of shape (3,); there is no library-wide default because no scanner produces a pure-white background.
- _reference.py: minimal frozen StainReference dataclass holding either a (3, 3) stain matrix (Macenko/Vahadane, ships in PR 3) or mu/sigma Ruderman Lab channel statistics (Reinhard, ships in PR 2). background_intensity is required for decomposition methods and forbidden for Reinhard (Reinhard's color transfer is in Ruderman Lab and doesn't model absorbance). Cross-field validation only; no persistence, no cohort fields, no provenance metadata.

Deliberately deferred to the PRs that actually consume them:
- save/load, schema versioning, fit_metadata, cohort_members, per_image_stats, max_concentrations: none have a producer in PR 1. Adding them now would version artifacts we have not shipped.
- _validation.py (canonical reorder, third-column completion, StainFittingError, stain-matrix angle/rank checks): purely supports the Macenko/Vahadane fit path, lands with it in PR 3.
- DEFAULT_BACKGROUND_INTENSITY constant or default kwarg: pure-white [255, 255, 255] is wrong for every real scanner. PR 3 ships estimate_background_intensity; until then callers must pass an estimate or, knowingly, an explicit np.array([255., 255., 255.]).

Design decisions:
- Lives under experimental/im/ alongside the existing SpatialData-native modules (_detect_tissue, _qc_image, _make_tiles, _feature). The eventual experimental/im -> im promotion is a separate v2.0 effort.
- Conversions are xr.DataArray-native from the start, not numpy-only, so PR 2's lazy apply path reuses them unchanged.
- SDA uses the offset ratio (rgb + 1) / (I_0 + 1) so that pixels at the supplied white point map exactly to zero. Documented in the rgb_to_sda docstring.
- Input images come from sdata.images[key] (starting in PR 2 via the existing experimental/im/_utils.py::get_element_data helper); this module never adds a loader.
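The cross-field rule for StainReference described under "What ships" (stain matrix plus background_intensity for decomposition methods, mu/sigma and no background_intensity for Reinhard) could look roughly like the sketch below. This is an illustration, not the dataclass in the diff: the class name StainReferenceSketch, the method field and its string values, and the error messages are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass(frozen=True)
class StainReferenceSketch:
    """Hypothetical stand-in for StainReference; field names are assumptions."""

    method: str  # assumed values: "macenko", "vahadane", "reinhard"
    stain_matrix: Optional[np.ndarray] = None
    mu: Optional[np.ndarray] = None
    sigma: Optional[np.ndarray] = None
    background_intensity: Optional[np.ndarray] = None

    def __post_init__(self) -> None:
        if self.method in ("macenko", "vahadane"):
            # Decomposition methods: (3, 3) matrix and a (3,) white point, no Lab stats.
            if self.stain_matrix is None or np.shape(self.stain_matrix) != (3, 3):
                raise ValueError("decomposition methods need a (3, 3) stain_matrix")
            if self.background_intensity is None or np.shape(self.background_intensity) != (3,):
                raise ValueError("decomposition methods need background_intensity of shape (3,)")
            if self.mu is not None or self.sigma is not None:
                raise ValueError("mu/sigma are only valid for Reinhard")
        elif self.method == "reinhard":
            # Reinhard works in Ruderman Lab and does not model absorbance.
            if self.mu is None or self.sigma is None:
                raise ValueError("Reinhard needs both mu and sigma")
            if np.shape(self.mu) != (3,) or np.shape(self.sigma) != (3,):
                raise ValueError("mu and sigma must have shape (3,)")
            if self.stain_matrix is not None:
                raise ValueError("stain_matrix is not valid for Reinhard")
            if self.background_intensity is not None:
                raise ValueError("background_intensity is forbidden for Reinhard")
        else:
            raise ValueError(f"unknown method: {self.method!r}")
```

With this shape, StainReferenceSketch(method="reinhard", mu=np.zeros(3), sigma=np.ones(3)) constructs cleanly, while adding background_intensity to the same call raises ValueError. Validation in __post_init__ keeps the frozen instance consistent from construction onward, which is one way to get "cross-field validation only" without any persistence logic.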