
Conversation

@rupeng-liu
Contributor

Thanks to @bythew3i (Jevin) for providing the insight behind this optimization: skipping the size computation during DMA wait.

I have run both kernels' tests and brought up an e2e vLLM server to verify; I will follow up with performance-improvement numbers.

Signed-off-by: Rupeng Liu rupengliu@meta.com

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

  • why is this change being made,
  • the problem being solved and any relevant context,
  • why this is a good solution,
  • some information about the specific implementation,
  • shortcomings of the solution and possible future improvements.

If the change fixes a bug or a GitHub issue, please include a link, e.g.,:
FIXES: b/123456
FIXES: #123456

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

@rupengliu-meta
Contributor

around 5-10% throughput improvement

```python
sem,
wait,
src=vmem_ref,
dst=vmem_ref.at[pl.ds(0, offset + bkv_sz_frm_new)],
```
Collaborator

`wait` only cares about size.

Please refer to https://github.com/vllm-project/tpu-inference/pull/718/files
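The reviewer's point is that a DMA wait only depends on the transfer size, not on the slice offsets, so the offset arithmetic (`offset + bkv_sz_frm_new` above) does not need to be recomputed on the wait path. A toy Python model of that idea follows; `ToyDMA` and its methods are purely illustrative assumptions, not the Pallas API:

```python
# Toy model (hypothetical class, NOT the Pallas API) of why a DMA wait
# only needs the transfer size, never the slice offsets.

class ToyDMA:
    """Models an async copy: start() records the transfer and signals a
    semaphore; wait() only decrements the semaphore by `size` elements."""

    def __init__(self):
        self.sem = 0  # semaphore counter, incremented per element copied

    def start(self, src, dst_offset, size):
        # The offset/size arithmetic belongs here, on the start path.
        # Pretend the hardware completes the copy and signals the semaphore.
        self.sem += size

    def wait(self, size):
        # The wait path only cares about size: decrement the semaphore by
        # `size` elements; offsets play no role here, so computing them
        # again at wait time is wasted work.
        assert self.sem >= size, "copy not complete"
        self.sem -= size
        return True

dma = ToyDMA()
dma.start(src=[1, 2, 3, 4], dst_offset=7, size=4)
assert dma.wait(size=4)  # no offset computation needed on the wait path
```

In the actual kernel the analogous saving is that the destination slice (and hence the per-step size expression) is only evaluated when the copy is started, not again when it is awaited.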

Contributor

Thanks Jevin, please check #1126

@rupengliu-meta
Contributor

rupengliu-meta commented Nov 19, 2025

Moved to a new GitHub account; closing this PR for now.

@rupeng-liu rupeng-liu closed this Nov 19, 2025
