Clearer error when shape dimension overflows int32 by serenposh · Pull Request #3425 · ml-explore/mlx

serenposh · 2026-04-18T23:42:56Z

Summary

mx.zeros(2**31) (and ones / full) previously raised a generic nanobind error that gave the user no hint of the real problem:

TypeError: zeros(): incompatible function arguments. The following argument types are supported:
    1. zeros(shape: Union[int, Sequence[int]], dtype: Optional[Dtype] = float32, ...) -> array
Invoked with types: int

The underlying cause is that mx::ShapeElem is int32_t, so any dimension >= 2**31 can't be converted via the int / mx::Shape variant that nanobind sees — but nothing in the error points at the shape or the 32-bit limit.

After this PR:

ValueError: Shape dimension 2147483648 is outside the supported range [-2147483648, 2147483647]. MLX currently uses 32-bit integers for shape dimensions.

Closes #2681.

Changes

python/src/convert.{h,cpp}: check_shape_dim now reports the offending value and the valid range, and catches negative overflow too. It's exposed in the header so other bindings can reuse it.
python/src/ops.cpp: full, zeros, and ones accept variant<int64_t, vector<int64_t>> and route through a new to_shape helper that validates each dim via check_shape_dim.
python/tests/test_ops.py: adds test_shape_overflow_error covering the scalar and sequence paths for all three constructors.
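
To make the validation concrete, here is a minimal pure-Python model of the check described above. It is only a sketch: the real `check_shape_dim` and `to_shape` live in the C++ bindings, and the Python signatures below are assumptions made for illustration. The error text is copied from the message quoted in the Summary.

```python
# Illustrative Python model of the C++ helpers named in this PR.
# The real implementations are in python/src/convert.cpp and ops.cpp.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def check_shape_dim(dim: int) -> int:
    """Return dim unchanged if it fits in int32, otherwise raise ValueError."""
    if not INT32_MIN <= dim <= INT32_MAX:
        raise ValueError(
            f"Shape dimension {dim} is outside the supported range "
            f"[{INT32_MIN}, {INT32_MAX}]. MLX currently uses 32-bit "
            "integers for shape dimensions."
        )
    return dim


def to_shape(shape) -> tuple:
    """Accept a single int or a sequence of ints and validate every dimension."""
    if isinstance(shape, int):
        return (check_shape_dim(shape),)
    return tuple(check_shape_dim(d) for d in shape)
```

Both the scalar path (`to_shape(2**31)`) and the sequence path (`to_shape([2**31])`) raise the same `ValueError`, and a value below `-2**31` is rejected as well, which mirrors the "negative overflow" case mentioned above.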

Scope

This PR does not raise the underlying int32 shape limit — the tracking issue calls out that mx::ShapeElem → int64_t would be a much larger migration. It only improves the diagnostic so users hitting the limit understand what they hit.

Test plan

python -m unittest python.tests.test_ops.TestOps — 139 tests pass locally (CPU build, macOS arm64).
New test test_shape_overflow_error verifies both the scalar (mx.zeros(2**31)) and sequence (mx.zeros([2**31])) paths for zeros, ones, and full.
Existing shapes (small ints, tuples, lists) still work unchanged.

🤖 Generated with Claude Code

Previously `mx.zeros(2**31)` (and `ones`/`full`) raised a generic nanobind error:

    TypeError: zeros(): incompatible function arguments. ...
    Invoked with types: int

The underlying cause is that `mx::ShapeElem` is `int32_t`, so values >= 2**31 can't be converted via the `int`/`mx::Shape` variant that nanobind sees, but the user gets no hint of this.

Widen the Python-side shape acceptance for `full`, `zeros`, and `ones` to `int64_t` / `vector<int64_t>` and validate each dimension through `check_shape_dim`, which now reports the offending value and the supported range:

    ValueError: Shape dimension 2147483648 is outside the supported range [-2147483648, 2147483647]. MLX currently uses 32-bit integers for shape dimensions.

This does not raise the underlying int32 shape limit; it only improves the diagnostic when users hit it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

zcbenz

Thanks for trying to fix this. Checking the lower limit feels like the correct fix, but this PR only covers a few ops, while we would need to fix all the ops that take shapes. I think a better approach is to check for overflow in python/src/small_vector.h.

Per review feedback on ml-explore#2681, move the int32 overflow check into the SmallVector type caster (python/src/small_vector.h) so it applies to every op that takes an mx::Shape, not just the three creation ops.

For narrow integer element types (int32, int16, ...) the caster now widens each element through `long long`, validates against the element type's range, and throws `nanobind::value_error` on overflow. nanobind then surfaces a clean Python ValueError that names the offending value and the valid range:

    mx.reshape(a, [2**31])
    mx.broadcast_to(a, [2**31, 1])
    mx.zeros([2**31])
    # -> ValueError: Shape dimension 2147483648 is outside the
    #    supported range [-2147483648, 2147483647]. ...

Because the SmallVector caster throws, it can't live inside a `std::variant`: nanobind's variant caster is marked noexcept and would call std::terminate on any escaping exception. So `zeros`, `ones` and `full` are split into two nb::def overloads each (scalar int64_t + mx::Shape) instead of using `variant<int, mx::Shape>`. The scalar overload still routes through `check_shape_dim` for the same clean error on `mx.zeros(2**31)`.

Broaden the Python test to exercise reshape / broadcast_to / negative overflow in addition to the three creation ops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

serenposh · 2026-04-20T19:05:42Z

Thanks for the review! Pushed a follow-up (70509dd) that moves the check to python/src/small_vector.h as you suggested.

The caster now widens each narrow-integer shape element through long long, validates against the element type's range, and throws nb::value_error on overflow — so every op that takes an mx::Shape surfaces the clean error, not just the three creation ops:

>>> mx.reshape(a, [2**31])
ValueError: Shape dimension 2147483648 is outside the supported range [-2147483648, 2147483647]. ...
>>> mx.broadcast_to(a, [2**31, 1])
ValueError: Shape dimension 2147483648 is outside the supported range ...

One wrinkle — because the caster now throws, it can't live inside a std::variant: nanobind's variant caster is noexcept and std::terminate's on any escaping exception (verified locally). So I split zeros/ones/full into two nb::def overloads each (scalar int64_t + mx::Shape) instead of variant<int, mx::Shape>. The scalar overload still throws via check_shape_dim for mx.zeros(2**31).
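
The difference between "the caster declines to match" and "the caster raises" can be sketched with a toy dispatcher in Python. This is not nanobind's implementation; every name below (`dispatch`, `lenient_*`, `strict_*`) is hypothetical and exists only to show why the old code ended in a generic `TypeError` while a throwing caster yields a `ValueError`.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def in_int32(value):
    return INT32_MIN <= value <= INT32_MAX


def range_error(value):
    return ValueError(
        f"Shape dimension {value} is outside the supported range "
        f"[{INT32_MIN}, {INT32_MAX}]."
    )


# Each caster returns (matched, shape). "Lenient" casters model the old
# behaviour: an out-of-range value just means "this overload does not match".
def lenient_scalar(arg):
    if isinstance(arg, int) and in_int32(arg):
        return True, (arg,)
    return False, None


def lenient_shape(arg):
    if isinstance(arg, (list, tuple)) and all(
        isinstance(d, int) and in_int32(d) for d in arg
    ):
        return True, tuple(arg)
    return False, None


# "Strict" casters model the new behaviour: the type still decides whether
# the overload matches, but an out-of-range value raises immediately.
def strict_scalar(arg):
    if not isinstance(arg, int):
        return False, None
    if not in_int32(arg):
        raise range_error(arg)
    return True, (arg,)


def strict_shape(arg):
    if not isinstance(arg, (list, tuple)):
        return False, None
    if not all(isinstance(d, int) for d in arg):
        return False, None
    for d in arg:
        if not in_int32(d):
            raise range_error(d)
    return True, tuple(arg)


def dispatch(name, casters, arg):
    """Try each overload's caster in order; fall back to a generic TypeError."""
    for caster in casters:
        matched, shape = caster(arg)
        if matched:
            return shape
    raise TypeError(f"{name}(): incompatible function arguments.")
```

With the lenient pair, `dispatch("zeros", [lenient_scalar, lenient_shape], 2**31)` exhausts both overloads and raises `TypeError`; with the strict pair the same call raises `ValueError` from the scalar overload, and a genuinely wrong type such as a string still falls through to `TypeError`.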

Test coverage broadened to reshape / broadcast_to / negative overflow. Full test_ops.TestOps (139 tests) still passes locally.

zcbenz

Very nice fix, thanks!

zcbenz · 2026-04-23T23:48:42Z

Can you fix the lint error?

serenposh · 2026-04-23T23:57:22Z

@zcbenz fixed in b3d7605. The failure was just clang-format rewrapping in python/src/convert.cpp and python/src/small_vector.h; I pushed the formatting-only fix and the checks should rerun now.

serenposh · 2026-04-24T00:49:19Z

I tracked the failing CPU/Windows jobs to mean() reducing half-precision inputs in half precision. The latest commit, e9fcdaf, promotes float16/bfloat16 reductions to float32 inside mean(), and the previously failing local CPU random tests (test random uniform and test random normal) now pass again. If you get a chance, could you please take another look and re-approve if everything looks good on your side?
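
For context on why accumulating in half precision can skew a mean, here is a small standalone illustration using only Python's `struct` module. It models a naive running sum and is not a description of MLX's reduction kernels; the helper names are made up for this example, and bfloat16 is not covered because `struct` has no format for it.

```python
import struct


def round_to(fmt, x):
    """Round a Python float to the nearest value representable in `fmt`."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def mean_with_accumulator(values, fmt):
    """Mean of `values`, rounding the running sum to `fmt` after every add."""
    total = 0.0
    for v in values:
        total = round_to(fmt, total + v)
    return total / len(values)


ones = [1.0] * 4096
half_mean = mean_with_accumulator(ones, "<e")    # float16 running sum -> 0.5
single_mean = mean_with_accumulator(ones, "<f")  # float32 running sum -> 1.0
```

The float16 sum stalls at 2048 because 2049 is not representable in float16 and rounds back to 2048, so the computed mean of 4096 ones is 0.5; the float32 accumulator reaches 4096 exactly.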

zcbenz · 2026-04-24T02:32:17Z

Which failing test do you mean? I only saw this failing test in CI:

  ======================================================================
  ERROR: test_array_np_shape_dim_check (test_array.TestArray.test_array_np_shape_dim_check)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File "D:\a\mlx\mlx\python\tests\test_array.py", line 771, in test_array_np_shape_dim_check
      mx.array(a_npy)
      ~~~~~~~~^^^^^^^
  OverflowError: Shape dimension 2147483648 is outside the supported range [-2147483648, 2147483647]. MLX currently uses 32-bit integers for shape dimensions.
  
  ----------------------------------------------------------------------

zcbenz requested changes Apr 19, 2026

zcbenz reviewed Apr 20, 2026

Address overflow review comments

58edd29

zcbenz approved these changes Apr 23, 2026

zcbenz changed the title from "Clearer error when shape dimension overflows int32 (#2681)" to "Clearer error when shape dimension overflows int32" Apr 23, 2026

Fix clang-format lint failure

b3d7605

Fix half-precision mean reduction

e9fcdaf

serenposh requested a review from zcbenz April 24, 2026 00:55

serenposh wants to merge 5 commits into ml-explore:main from serenposh:claude/amazing-haslett-83c10f
