@brokkoli71 brokkoli71 commented Jan 30, 2026

Changes

API Changes

  • Changed the Array.read() shape parameter from int[] to long[] for consistency with the offset parameter and to support arrays with dimensions larger than Integer.MAX_VALUE (2^31 - 1) elements (fixes #29, "ArrayMetadata.shape: long vs Array.read: int")
  • Updated ArrayAccessor.withShape() and IndexingUtils methods to use long[] throughout the indexing pipeline
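
For callers that still hold an int[] shape, a lossless widening step is enough to call the new long[] signature. The helper below is a minimal sketch; ShapeWidening and toLongShape are hypothetical names used for illustration and are not part of this PR.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the library: widens a legacy int[] shape.
final class ShapeWidening {
    /** Every int fits in a long, so this conversion never loses information. */
    static long[] toLongShape(int[] shape) {
        return Arrays.stream(shape).asLongStream().toArray();
    }
}
```

The result can then be passed wherever a long[] shape is expected, for example as the shape argument of Array.read().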

Critical Bug Fixes

  • Fixed integer overflow in computeChunkCoords() by changing numChunks to long with validation, preventing allocation failures
  • Added overflow validation in computeProjection() for all long-to-int casts with descriptive error messages to prevent silent data truncation
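
The two fixes above follow a common pattern: do the arithmetic in long, and narrow to int only through a checked cast. The sketch below illustrates that pattern under assumptions; ChunkMath, numChunks, and checkedInt are hypothetical names and do not reproduce the actual computeChunkCoords() or computeProjection() code.

```java
// Hypothetical illustration of overflow-safe chunk arithmetic.
final class ChunkMath {
    /** Total number of chunks covering the array, computed in long with overflow detection. */
    static long numChunks(long[] shape, int[] chunkShape) {
        long total = 1;
        for (int d = 0; d < shape.length; d++) {
            // Ceiling division written so that no intermediate sum can overflow.
            long perDim = shape[d] / chunkShape[d] + (shape[d] % chunkShape[d] == 0 ? 0 : 1);
            // multiplyExact throws ArithmeticException instead of silently wrapping.
            total = Math.multiplyExact(total, perDim);
        }
        return total;
    }

    /** Narrows a long to int, failing loudly with a descriptive message if it does not fit. */
    static int checkedInt(long value, String what) {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            throw new ArithmeticException(what + " does not fit in an int: " + value);
        }
        return (int) value;
    }
}
```

A plain `(int)` cast of 3_000_000_000L would wrap to a negative number; the checked variant turns that silent truncation into an exception that names the offending quantity.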

HTTP/S3 Store Fixes

  • Fixed HTTP Range header capitalization from "Bytes=" to RFC 7233 compliant "bytes=" for compatibility with strict servers
  • Added HTTP response status validation to prevent serving error pages (404/500) as valid data
  • Optimized S3Store.set() to use RequestBody.fromBytes() for proper content-length specification
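
A minimal sketch of the Range header and status checks described above, written as pure functions so it runs without a network. HttpRangeSketch, rangeHeader, and hasBody are hypothetical names, and the choice to treat 200/206 as success and 404 as "missing" is an assumption based on the bullets, not a copy of the HttpStore code.

```java
// Hypothetical helpers illustrating the Range header format and status validation.
final class HttpRangeSketch {
    /** Builds a byte-range header value; the end position is inclusive. */
    static String rangeHeader(long offset, long length) {
        if (offset < 0 || length <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and length > 0");
        }
        return "bytes=" + offset + "-" + (offset + length - 1);
    }

    /**
     * Returns true when the response carries usable data, false when the key is
     * missing, and throws for any other status so error pages are never read as data.
     */
    static boolean hasBody(int status, String url) {
        if (status == 200 || status == 206) {
            return true;
        }
        if (status == 404) {
            return false;
        }
        throw new IllegalStateException("Unexpected HTTP status " + status + " for " + url);
    }
}
```

For example, requesting the first 100 bytes uses the header value `bytes=0-99`.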

Security Fix

  • Fixed path traversal vulnerability in FilesystemStore by validating normalized absolute paths to ensure they remain within the store root.
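
The containment check can be sketched as follows: resolve the key against the root, normalize, and reject anything that no longer starts with the root. SafePathSketch and resolveKey are hypothetical names; the actual FilesystemStore code may differ. Note that normalize() is purely lexical and does not follow symbolic links.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of a path traversal guard.
final class SafePathSketch {
    /** Resolves a store key to a path, rejecting keys that escape the store root. */
    static Path resolveKey(Path root, String key) {
        Path normalizedRoot = root.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments lexically.
        Path resolved = normalizedRoot.resolve(key).normalize();
        if (!resolved.startsWith(normalizedRoot)) {
            throw new IllegalArgumentException("Key escapes store root: " + key);
        }
        return resolved;
    }

    public static void main(String[] args) {
        Path root = Paths.get("store-root");
        System.out.println(resolveKey(root, "group/array/c/0"));
    }
}
```

A key such as `../secret` normalizes to a sibling of the root and is rejected, while `a/../b` stays inside the root and is accepted.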

Performance Optimizations

  • Eliminated redundant I/O in Array.read() by removing the exists() check before reading chunks. Array.read() now relies on the store returning null for missing chunks (see "Enforced strict exception handling across all stores" below). This halves the number of store requests per chunk read (one request instead of two), which especially benefits network stores (HTTP/S3) by removing the redundant HEAD requests.
  • Improved sparse shard handling in ShardingIndexedCodec by efficiently returning fill values for missing shards without unnecessary allocations or exceptions.
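
To illustrate the request savings, the sketch below compares a read that checks exists() first with one that relies on a null return. CountingStore and ChunkReadSketch are hypothetical, in-memory stand-ins written for this example; they are not the library's Store interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store that counts how many requests it receives.
final class CountingStore {
    private final Map<String, byte[]> data = new HashMap<>();
    int requests = 0;

    void put(String key, byte[] value) { data.put(key, value); }

    boolean exists(String key) { requests++; return data.containsKey(key); }

    /** Returns null when the key is missing. */
    byte[] get(String key) { requests++; return data.get(key); }
}

final class ChunkReadSketch {
    /** Old pattern: two requests for every chunk that exists. */
    static byte[] readWithExistsCheck(CountingStore store, String key, byte[] fill) {
        if (!store.exists(key)) {
            return fill;
        }
        return store.get(key);
    }

    /** New pattern: one request; a null result means "use the fill value". */
    static byte[] readDirect(CountingStore store, String key, byte[] fill) {
        byte[] bytes = store.get(key);
        return bytes != null ? bytes : fill;
    }
}
```

For a chunk that exists, the first variant issues two requests and the second issues one; for a missing chunk both issue one request and return the fill value.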

Exception Handling Improvements

  • Enforced strict exception handling across all stores (updated FilesystemStore, HttpStore, ReadOnlyZipStore), ensuring null is returned only for missing files/keys (NoSuchFileException/404) while all other I/O errors are propagated as StoreException.
  • Created StoreException class with factory methods providing rich context (store path, key, underlying cause)
  • Enhanced all store implementations with detailed error messages that include paths, URLs, and AWS status codes
  • Improved Array read/write exceptions to report specific chunk coordinates during parallel operations
  • Replaced generic RuntimeException with IllegalStateException for programming errors and ZarrException for user errors
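
The "null only for missing keys" contract can be sketched for a filesystem-backed read as below. SketchStoreException and StrictReadSketch are hypothetical stand-ins written for this example; the real StoreException class, its factory method names, and whether it is checked or unchecked are not shown in this description, so an unchecked exception is assumed here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical stand-in for the PR's StoreException; unchecked by assumption.
final class SketchStoreException extends RuntimeException {
    private SketchStoreException(String message, Throwable cause) {
        super(message, cause);
    }

    /** Factory method that attaches the store location, key, and underlying cause. */
    static SketchStoreException readFailed(String store, String key, Throwable cause) {
        return new SketchStoreException(
                "Failed to read key '" + key + "' from store '" + store + "'", cause);
    }
}

final class StrictReadSketch {
    /** Returns null only when the file is missing; any other I/O error is propagated. */
    static byte[] readOrNull(Path root, String key) {
        try {
            return Files.readAllBytes(root.resolve(key));
        } catch (NoSuchFileException e) {
            return null; // a missing key is an expected outcome, not an error
        } catch (IOException e) {
            throw SketchStoreException.readFailed(root.toString(), key, e);
        }
    }
}
```

Keeping "missing" and "failed" distinct is what allows Array.read() to substitute fill values for absent chunks without masking permission errors or network failures.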

@brokkoli71 brokkoli71 requested a review from normanrz January 30, 2026 19:51
@brokkoli71 brokkoli71 self-assigned this Jan 30, 2026
@brokkoli71 brokkoli71 marked this pull request as ready for review January 30, 2026 20:24
@brokkoli71 brokkoli71 mentioned this pull request Feb 2, 2026
@brokkoli71 brokkoli71 merged commit 4e4eea9 into main Feb 2, 2026
6 of 9 checks passed
