
Matmul: Switch to trailing batch dims and allow mat-vec, vec-mat#132

Merged
maleadt merged 3 commits into JuliaGPU:main from AntonOresten:matmul-shapes
Mar 25, 2026

Conversation

@AntonOresten
Contributor

Closes #115

When operands have different numbers of batch dimensions (e.g.
(M, K, 4) * (K, N, 2, 4)), _matmul pads the shorter batch tuple
with ones to align them before computing the output shape and
creating the zero accumulator. _muladd does the same padding to
reshape operands before broadcasting.

These two functions disagreed on *where* to pad: _matmul inserted
leading ones ((1, 4) for a 1-batch operand against a 2-batch one)
while _muladd appended trailing ones ((4, 1)). This meant the acc
shape from _matmul wouldn't match what _muladd expected, causing
a reshape element-count mismatch at the Tile IR level.

Fix _matmul to use trailing ones, consistent with _muladd.
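The disagreement can be shown in a few lines. This is a plain-Python sketch of the two padding conventions described above, not the actual package code; the helper names pad_leading and pad_trailing are illustrative only.

```python
def pad_leading(batch, n):
    """Old _matmul behavior: prepend ones to align batch dims."""
    return (1,) * (n - len(batch)) + batch

def pad_trailing(batch, n):
    """_muladd behavior, now shared by _matmul: append ones."""
    return batch + (1,) * (n - len(batch))

# Batch dims from the example above: (4,) for A, (2, 4) for B.
a_batch, b_batch = (4,), (2, 4)
n = max(len(a_batch), len(b_batch))

print(pad_leading(a_batch, n))   # (1, 4) -- what _matmul produced
print(pad_trailing(a_batch, n))  # (4, 1) -- what _muladd expected
```

Since the two helpers disagree on the padded tuple, the accumulator shape computed by one does not line up with the reshape performed by the other, which is exactly the mismatch this PR fixes by standardizing on trailing ones.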

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Member

@maleadt left a comment


Let's hope the additional permutes here are free?

@maleadt merged commit caf0027 into JuliaGPU:main Mar 25, 2026
9 checks passed
@AntonOresten
Contributor Author

AntonOresten commented Mar 25, 2026

Hopefully 🤞
A month or two ago I defined a batched mul helper for tiles, explicitly writing out the broadcasting and contractions with the old ct.reduce_sum. Surprisingly, 8x8 performed far better than 4x4; I assume the compiler noticed it could start using tensor cores. It's hard to imagine that not being robust to a few extra permutations, since we're already permuting in e.g. reshape.



Development

Successfully merging this pull request may close these issues.

Matmul broadcasting
