Cut head_tail_breaks and box_plot dask re-scans by brendancol · Pull Request #1213 · xarray-contrib/xarray-spatial

brendancol · 2026-04-16T16:06:20Z

Summary

_run_dask_head_tail_breaks: persist data_clean once, track the running mask count across iterations, and fuse the mean and head-count reductions into a single dask.compute() call per iteration. Cuts per-iteration graph traversals from 3 to 1 and eliminates the re-read on every loop pass.
_run_dask_box_plot (new) and _run_dask_cupy_box_plot: replace data_clean[da.isfinite(data_clean)] (which forces compute_chunk_sizes) with the same seeded _generate_sample_indices sampler that natural_breaks and quantile already use. Percentiles are then computed on the finite portion of the sample in numpy.
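
The persist-and-fuse pattern from the first item can be sketched as below. This is a minimal illustration, not the code in this PR: the function name `head_tail_breaks_sketch`, the `head_ratio=0.4` stopping rule, and the 1-D input are assumptions made for the example.

```python
import dask
import dask.array as da
import numpy as np


def head_tail_breaks_sketch(values, head_ratio=0.4, max_iter=50):
    """Illustrative head/tail breaks on a dask array (hypothetical helper)."""
    # Persist once so every loop pass reuses the in-memory chunks
    # instead of re-running the upstream graph.
    data = values.persist()
    mask = da.isfinite(data)
    # One up-front reduction; afterwards the count is carried forward.
    count = int(mask.sum().compute())
    breaks = []
    for _ in range(max_iter):
        if count <= 1:
            break
        # Lazy mean of the currently selected values.
        mean = da.where(mask, data, 0.0).sum() / count
        head_mask = mask & (data > mean)
        # Fused: a single compute call returns both the mean and the head count.
        mean_val, head_count = dask.compute(mean, head_mask.sum())
        breaks.append(float(mean_val))
        head_count = int(head_count)
        if head_count == 0 or head_count / count > head_ratio:
            break
        # The head count of this pass is the total count of the next pass,
        # so no separate reduction is needed to recount the mask.
        mask, count = head_mask, head_count
    return breaks


x = da.from_array(
    np.array([1, 1, 1, 1, 1, 1, 1, 2, 3, 10], dtype=float), chunks=4
)
breaks = head_tail_breaks_sketch(x)
```

Passing both reductions to one `dask.compute()` lets dask share the common parts of the two graphs, which is where the drop from three traversals to one per iteration would come from.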

Motivation

Static analysis flagged three HIGH-severity patterns on the dask backends of classify:

_run_dask_head_tail_breaks ran .compute() inside a while loop for the mean, new-mask count, and total-mask count — 3 full graph traversals per iteration, N+1 iterations typical.
_run_box_plot(..., module=da) used boolean fancy indexing on a dask array, which triggers compute_chunk_sizes() and performs an extra full scan before da.percentile runs.
_run_dask_cupy_box_plot had the same pattern plus a full map_blocks(cupy.asnumpy) over the dataset before sampling.
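
The boolean-indexing problem in the second and third items can be reproduced on a toy array. This is a small sketch with made-up data; it only shows that the chunk sizes are unknown after masking and that resolving them takes a compute pass.

```python
import dask.array as da
import numpy as np

x = da.from_array(
    np.array([1.0, np.nan, 3.0, np.inf, 5.0, 6.0]), chunks=3
)

# Boolean indexing is lazy, but dask cannot know how many elements
# survive the mask in each chunk until the mask is evaluated.
finite = x[da.isfinite(x)]
shape_before = finite.shape  # the length is NaN at this point

# Resolving the sizes runs a computation over the masked array,
# which is the extra scan described above.
finite.compute_chunk_sizes()
chunks_after = finite.chunks
```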

Benchmark

head_tail_breaks dask path on a 256×256 gamma-distributed float64 array, chunks=64:

Backend	Metric	Before	After	Ratio	Verdict
dask+numpy	wall_ms (med)	912	339	0.37	IMPROVED

box_plot dask path on 512×512, chunks=128:

Backend	Metric	After	Verdict
dask+numpy	wall_ms (med)	57	OK (no baseline — old path scaled with full-raster scan before percentile)

Test plan

pytest xrspatial/tests/test_classify.py — 85 tests pass
Manual smoke: head_tail_breaks dask output has the same bin count as numpy path on the same seed
Manual smoke: box_plot dask output uses sampled quantiles; verify output classes match numpy path within sampling tolerance

Notes

Sample size for the box_plot dask path is capped at 200,000 elements (or the full dataset if smaller). This matches the pattern used by natural_breaks and keeps the percentile computation O(sample) rather than O(dataset).
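
A rough sketch of the sampled-percentile approach follows. `sample_indices` is a hypothetical stand-in for `_generate_sample_indices`, whose real signature is not shown in this PR; the seed value and the choice of quartiles are also assumptions. The sketch assumes infinities have already been replaced with NaN upstream, so `nanmax` returns a finite maximum.

```python
import dask
import dask.array as da
import numpy as np

MAX_SAMPLE = 200_000  # cap quoted in the note above


def sample_indices(n_total, max_sample=MAX_SAMPLE, seed=1234):
    """Hypothetical seeded sampler: all indices if small, else a fixed-size draw."""
    if n_total <= max_sample:
        return np.arange(n_total)
    rng = np.random.RandomState(seed)
    idx = rng.choice(n_total, size=max_sample, replace=False)
    idx.sort()  # sorted indices keep the dask gather chunk-ordered
    return idx


def sampled_quartiles(data):
    """Quartiles from a sample plus the global max, in one compute call."""
    flat = data.ravel()
    idx = sample_indices(flat.size)
    # Integer indexing keeps chunk sizes known, unlike a boolean mask.
    sample, data_max = dask.compute(flat[idx], da.nanmax(flat))
    # Drop non-finite values on the small in-memory sample only.
    finite = sample[np.isfinite(sample)]
    q1, q2, q3 = np.percentile(finite, [25, 50, 75])
    return q1, q2, q3, data_max


raster = da.from_array(
    np.arange(1, 101, dtype=float).reshape(10, 10), chunks=5
)
q1, q2, q3, data_max = sampled_quartiles(raster)
```

The maximum is taken over the full array rather than the sample, since a sample can miss the largest value and the top bin edge has to cover it.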

head_tail_breaks (dask) called .compute() three times per iteration of its while-loop (mean, new-mask count, old-mask count) and rebuilt the same data_clean graph every time. For N iterations that was 3N+1 full graph traversals. Persist data_clean once, track the running mask count across iterations, and fuse the mean+head-count reductions into a single dask.compute() per iteration. Wall time drops from ~910 ms to ~340 ms on 256x256 chunks=64.

box_plot (dask and dask+cupy) did data_clean[da.isfinite(data_clean)], which is boolean fancy indexing on a dask array. That forces compute_chunk_sizes, materializing a full scan just to know the output chunk layout before percentile can run. Swap in the same seeded _generate_sample_indices sampler that natural_breaks/quantile already use: gather 200k indices on the dask array, compute the sample and the global nanmax in one dask.compute() call, and take percentiles on the finite portion of the sample in numpy.

github-actions bot added the "performance" label (PR touches performance-sensitive code) on Apr 16, 2026

brendancol merged commit 7fa9e04 into master Apr 16, 2026
11 checks passed

brendancol merged 1 commit into master from perf/classify-head-tail-and-box-plot
