[ET-VK] Add conv1d shaders and dispatch for 1D convolution #18060

Open
SS-JIA wants to merge 1 commit into gh/SS-JIA/477/base from gh/SS-JIA/477/head

Conversation


SS-JIA (Contributor) commented Mar 10, 2026

Stack from ghstack (oldest at bottom):

Add dedicated GLSL shaders and C++ dispatch for 1D convolution operations.
The new implementation introduces three execution paths:

  1. Buffer path (general and depthwise): conv1d.glsl and conv1d_dw.glsl operate
    on width-packed buffer tensors, with one shader invocation per output element
    at (n, out_c, out_l). Uses sizes_ubo/strides_ubo and tidx_to_bufi() for
    layout-agnostic index computation.

  2. Buffer path (pointwise): conv1d_pw.glsl specializes the kernel_size=1 case
    to skip the spatial loop for efficiency.

  3. Texture path (pointwise): conv1d_pw_texture.glsl handles the pointwise case
    for width-packed texture3d tensors, computing one output texel (4 values)
    per invocation using a scalar weight from a buffer.
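To make the dispatch granularity of the general/depthwise buffer path concrete, here is a minimal Python sketch of the arithmetic a single shader invocation performs for the output element at (n, out_c, out_l). This is an illustration only, not the GLSL source; parameter names follow standard conv1d conventions rather than anything in this PR.

```python
def conv1d_output_element(x, w, b, n, out_c, out_l,
                          stride=1, padding=0, dilation=1, groups=1):
    """Compute one conv1d output element, mirroring one shader invocation.

    x: nested lists [N][C_in][L_in], w: [C_out][C_in // groups][K], b: [C_out].
    """
    c_in, l_in = len(x[0]), len(x[0][0])
    c_out, k_size = len(w), len(w[0][0])
    in_c_per_group = c_in // groups
    group = out_c // (c_out // groups)  # which input-channel group feeds out_c
    acc = b[out_c]
    for c in range(in_c_per_group):
        in_c = group * in_c_per_group + c
        for k in range(k_size):
            in_l = out_l * stride - padding + k * dilation
            if 0 <= in_l < l_in:  # skip padded positions
                acc += x[n][in_c][in_l] * w[out_c][c][k]
    return acc
```

The pointwise specialization (conv1d_pw.glsl) corresponds to the kernel_size=1 case, where the inner loop over k collapses to a single multiply per input channel.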

The legacy conv1d_texture.glsl (renamed from conv1d.glsl) preserves the
original channels-packed texture path for backward compatibility.

Convolution.cpp is updated to route 1D convolutions to the appropriate
specialized dispatch (add_conv1d_buf_node or add_conv1d_pw_texture_node)
based on the input storage type and packed dim, falling back to the legacy
texture path for channels-packed inputs.
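The routing decision described above can be sketched as follows. This is a hypothetical Python rendering of the C++ logic, not the actual code in Convolution.cpp; the first two return values are the node names from the description, the fallback label and the enum types are illustrative assumptions.

```python
from enum import Enum, auto

class Storage(Enum):
    BUFFER = auto()
    TEXTURE_3D = auto()

class PackedDim(Enum):
    WIDTH = auto()
    CHANNELS = auto()

def route_conv1d(storage, packed_dim, kernel_size):
    # Buffer tensors use the dedicated buffer dispatch, which internally
    # selects the general, depthwise, or pointwise shader.
    if storage is Storage.BUFFER:
        return "add_conv1d_buf_node"
    # Width-packed pointwise textures get the specialized texture path.
    if packed_dim is PackedDim.WIDTH and kernel_size == 1:
        return "add_conv1d_pw_texture_node"
    # Channels-packed textures fall back to the legacy conv1d_texture.glsl path.
    return "legacy_texture_path"
```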

op_registry.py gains a pick_conv_storage function that selects:

  • WIDTH_PACKED_TEXTURE for pointwise 1D conv (kernel_size=1)
  • CONTIGUOUS_BUFFER for non-pointwise 1D conv
  • CHANNELS_PACKED_TEXTURE for 2D conv (unchanged)
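A sketch of that selection rule, assuming a signature taking the convolution's spatial dimensionality and kernel size (the real pick_conv_storage in op_registry.py may take different arguments; the constant names mirror the description):

```python
WIDTH_PACKED_TEXTURE = "width_packed_texture"
CONTIGUOUS_BUFFER = "contiguous_buffer"
CHANNELS_PACKED_TEXTURE = "channels_packed_texture"

def pick_conv_storage(spatial_dims, kernel_size):
    if spatial_dims == 1:
        # Pointwise 1D conv stays on texture; all other 1D convs use buffers.
        return WIDTH_PACKED_TEXTURE if kernel_size == 1 else CONTIGUOUS_BUFFER
    return CHANNELS_PACKED_TEXTURE  # 2D conv path is unchanged
```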

Differential Revision: [D95970166](https://our.internmc.facebook.com/intern/diff/D95970166/)

pytorch-bot commented Mar 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18060

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures

As of commit c8d484d with merge base f09bd55:

NEW FAILURES — the following jobs have failed (see the HUD link above for the job list):

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions commented
This PR needs a "release notes:" label.

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Labels: CLA Signed, fb-exported, meta-exported