Fix bump OOM: int32 coords, count cap, per-chunk dask partitioning (#1206) #1208

Merged
brendancol merged 1 commit into master from issue-1206 on Apr 16, 2026

Conversation

@brendancol
Contributor

Summary

Fixes #1206. Three issues in bump() that cause wrong results and OOM on large rasters:

  • uint16 coordinate overflow: locs used dtype=np.uint16, so coordinates > 65,535 wrapped silently. Now uses int32.
  • Unbounded default count: count defaulted to w * h // 10 with no ceiling. Now capped at 10M bumps, plus a memory guard that raises MemoryError when the arrays would exceed 50% of available RAM.
  • Closure serialization in dask paths: _bump_dask_numpy and _bump_dask_cupy captured the full locs and heights arrays in every chunk closure. For a 2048x2048 raster at the default count with 128x128 chunks, that was about 5 MB per chunk across 256 chunks, or 1.29 GB of graph payload. The dask paths now pre-partition bumps by chunk using dask.delayed + da.block, so each task serializes only its own subset. Graph serialization is reduced by ~250x.
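The numbers in the three bullets above can be reproduced with a short sketch. The names `default_count` and `MAX_DEFAULT_COUNT` are hypothetical, and the 12 bytes per bump (two uint16 coordinates plus one float64 height) is an assumption that happens to match the quoted 5 MB and 1.29 GB figures; the actual dtypes in bump() may differ.

```python
import numpy as np

# Hypothetical names; the w * h // 10 default and the 10M cap come from the PR text.
MAX_DEFAULT_COUNT = 10_000_000


def default_count(width, height):
    """Default bump count: one bump per ten cells, capped at 10M."""
    return min(width * height // 10, MAX_DEFAULT_COUNT)


# 1. Coordinate wrap: casting to uint16 keeps only the low 16 bits.
coords = np.array([100, 65_535, 70_000], dtype=np.int64)
wrapped = coords.astype(np.uint16)  # 70_000 becomes 70_000 - 65_536 = 4_464
safe = coords.astype(np.int32)      # all three values are preserved

# 2. Default count with and without the cap taking effect.
small = default_count(2048, 2048)      # below the cap
large = default_count(20_000, 20_000)  # 40M uncapped, so the cap applies

# 3. Old closure payload: every chunk carried every bump.
BYTES_PER_BUMP = 2 * 2 + 8  # assumed: two uint16 coords + one float64 height
n_chunks = (2048 // 128) ** 2
per_chunk_bytes = small * BYTES_PER_BUMP
total_bytes = per_chunk_bytes * n_chunks

print(wrapped.tolist(), safe.tolist())
print(small, large)
print(n_chunks, per_chunk_bytes / 1e6, total_bytes / 1e9)
```

With per-chunk partitioning, each bump is serialized once rather than once per chunk, which is where a reduction on the order of the chunk count (256 here) would come from.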

Test plan

  • Existing bump tests pass (all 10)
  • New test: int32 coordinates work at 70,000 (no uint16 wrap)
  • New test: default count capped for 20,000 x 20,000 raster
  • New test: MemoryError raised for huge explicit count
  • New test: dask graph size stays proportional to total bumps, not bumps * chunks
  • New test: single-chunk dask matches numpy bitwise
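As an illustration of the MemoryError case in the test plan, here is a minimal sketch of a 50% memory guard and a check that exercises it. `check_bump_memory` is a hypothetical helper, not the function in the PR; available RAM is passed in as a parameter because the PR text does not say how it is measured, and the 16 bytes per bump (two int32 coordinates plus one float64 height) is an assumption.

```python
# Assumed layout after the fix: two int32 coords + one float64 height per bump.
BYTES_PER_BUMP = 2 * 4 + 8


def check_bump_memory(count, available_bytes, max_fraction=0.5):
    """Raise MemoryError if the bump arrays would exceed the allowed share of RAM.

    Hypothetical helper: returns the estimated allocation in bytes when it fits.
    """
    required = count * BYTES_PER_BUMP
    if required > max_fraction * available_bytes:
        raise MemoryError(
            f"bump arrays need {required} bytes, more than "
            f"{max_fraction:.0%} of {available_bytes} available bytes"
        )
    return required


def huge_count_raises():
    """Mirror of the 'huge explicit count' test, with 8 GiB of pretend RAM."""
    available = 8 * 1024**3
    try:
        check_bump_memory(10**12, available)
    except MemoryError:
        return True
    return False


print(check_bump_memory(1000, 8 * 1024**3))
print(huge_count_raises())
```

Taking the available memory as an argument keeps the check deterministic in tests, regardless of the machine running them.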

…oning (#1206)

- Change locs dtype from uint16 to int32 to fix silent coordinate
  wrap-around for rasters with dimensions > 65535
- Cap default count (w*h//10) at 10M bumps to prevent unbounded
  eager allocation on large rasters
- Add memory guard that raises MemoryError when bump arrays would
  exceed 50% of available RAM
- Replace closure-based dask paths with per-chunk partitioning via
  dask.delayed + da.block, so each task only serializes its own
  bump subset instead of the entire locs/heights arrays
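The partitioning idea in the last item can be sketched in plain NumPy. This is not the PR's implementation: `apply_bumps` is a toy kernel that adds each height at a single cell (so it ignores any spread across chunk boundaries), `partition_bumps` is a hypothetical helper, and `np.block` stands in for `da.block`. In the dask path described above, each `(shape, locs, heights)` triple would instead be handed to its own `dask.delayed` task.

```python
import numpy as np


def apply_bumps(shape, locs, heights):
    """Toy stand-in for the real kernel: add each height at its (row, col)."""
    out = np.zeros(shape, dtype=np.float64)
    if len(locs):
        np.add.at(out, (locs[:, 0], locs[:, 1]), heights)
    return out


def partition_bumps(shape, chunks, locs, heights):
    """Split bumps by chunk; return a 2D grid of (chunk_shape, locs, heights)."""
    rows, cols = shape
    ch, cw = chunks
    grid = []
    for r0 in range(0, rows, ch):
        grid_row = []
        for c0 in range(0, cols, cw):
            h, w = min(ch, rows - r0), min(cw, cols - c0)
            inside = (
                (locs[:, 0] >= r0) & (locs[:, 0] < r0 + h)
                & (locs[:, 1] >= c0) & (locs[:, 1] < c0 + w)
            )
            # Shift coordinates so they are relative to the chunk origin.
            offset = np.array([r0, c0], dtype=locs.dtype)
            grid_row.append(((h, w), locs[inside] - offset, heights[inside]))
        grid.append(grid_row)
    return grid


rng = np.random.default_rng(0)
shape, chunks, count = (256, 256), (128, 128), 1000
locs = rng.integers(0, 256, size=(count, 2)).astype(np.int32)
heights = rng.random(count)

grid = partition_bumps(shape, chunks, locs, heights)
# Each task sees only its own subset; np.block reassembles the chunk grid.
assembled = np.block([[apply_bumps(*task) for task in row] for row in grid])
direct = apply_bumps(shape, locs, heights)
total_serialized = sum(len(task[1]) for row in grid for task in row)

print(np.allclose(assembled, direct), total_serialized)
```

Because every bump lands in exactly one chunk, the total number of bumps shipped across all tasks equals `count`, rather than `count` times the number of chunks.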
@github-actions github-actions bot added the "performance" label (PR touches performance-sensitive code) on Apr 16, 2026
@brendancol brendancol merged commit c4bb608 into master Apr 16, 2026
11 checks passed

Development

Successfully merging this pull request may close these issues.

bump: uint16 overflow, unbounded default count, and closure serialization OOM