Refactor pagedAttention transpose #33102
base: master
Conversation
zhangYiIntel left a comment
LGTM
Pull request overview
This PR refactors transpose functions from `executor_pa.cpp` to a shared `transpose.hpp` header for reuse across multiple components. The changes enable better code organization and add new parameters to the transpose function signature to support quantization features.

- Moved three `transpose_16NxK` template overloads from `executor_pa.cpp` to `transpose.hpp`
- Updated function signatures to include `tmp`, `group_size`, and `quant_key_bychannel` parameters (see the sketch after this list)
- Modified all call sites to pass the additional parameters (including `nullptr` for the unused `tmp` parameter)
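For orientation, here is a minimal sketch of what the relocated declaration might look like after this change. The template parameters follow the call sites quoted further down; the parameter names, types, and order are assumptions inferred from the summary above, not copied from `transpose.hpp`.

```cpp
// Hypothetical sketch only: parameter names and order are inferred from the
// PR description, not taken from the actual transpose.hpp.
#include <cstddef>
#include "openvino/core/type/element_type.hpp"

template <typename TDST, ov::element::Type_t SRC_PREC>
void transpose_16NxK(TDST* dst,
                     void* src,
                     TDST* tmp,                  // new: scratch buffer, nullptr when unused
                     size_t N,
                     size_t K,
                     size_t block_size,
                     size_t dst_stride,
                     size_t src_stride,
                     size_t group_size,          // new: quantization group size
                     bool quant_key_bychannel);  // new: by-channel key quantization flag
```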
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/plugins/intel_cpu/src/nodes/kernels/scaled_attn/transpose.hpp | Added three transpose_16NxK template overloads moved from executor_pa.cpp, including support for quantized types (i8, u8, u4) |
| src/plugins/intel_cpu/src/nodes/kernels/scaled_attn/xattention.hpp | Updated calls to transpose_16NxK to include new parameters (nullptr for tmp, 0 for group_size, false for quant_key_bychannel) |
| src/plugins/intel_cpu/src/nodes/kernels/scaled_attn/executor_pa.cpp | Removed transpose_16NxK function definitions that were moved to transpose.hpp, added include for transpose.hpp |
Diff excerpt (truncated):

```cpp
transpose_16NxK<uint32_t, ov::element::u32>(d, s, N, K >> 1, block_size, dst_stride, src_stride >> 1);
transpose_16NxK<uint32_t, ov::element::u32>(d,
                                            s,
                                            reinterpret_cast<uint32_t*>(0),
```
Copilot AI · Dec 11, 2025
Using `reinterpret_cast<uint32_t*>(0)` to represent `nullptr` is non-idiomatic and less clear. Replace with `nullptr` or `static_cast<uint32_t*>(nullptr)` for better readability.
Suggested change:

```diff
-reinterpret_cast<uint32_t*>(0),
+nullptr,
```
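For context, the truncated call above might continue roughly as follows once the suggestion is applied. The variable names come from the diff excerpt; the trailing `0` and `false` follow the parameter summary in the table above (`group_size` and `quant_key_bychannel`) and are assumptions about the exact values, not lines from the patch.

```cpp
// Assumed continuation of the xattention.hpp call site with the suggestion applied.
transpose_16NxK<uint32_t, ov::element::u32>(d,
                                            s,
                                            nullptr,         // tmp: unused at this call site
                                            N,
                                            K >> 1,
                                            block_size,
                                            dst_stride,
                                            src_stride >> 1,
                                            0,               // group_size: no grouped quantization here
                                            false);          // quant_key_bychannel
```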
This is a valid comment. @mangguo321, could you please explicitly give your opinion on this?
> This is a valid comment. @mangguo321, could you please explicitly give your opinion on this?
Hi @maxnick, this code was originally implemented in executor_pa.cpp and moved here without modification. The original intent is unclear, but I think we can update it to use nullptr for now.
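One speculative explanation for the original typed cast, not taken from the PR: if the template arguments were ever deduced from the call rather than spelled out, a bare `nullptr` would not deduce the pointer type, while a cast to a concrete pointer type would. With the explicit template argument lists used at these call sites, `nullptr` works fine, e.g.:

```cpp
// Standalone illustration (take_buffer is hypothetical, not from the PR).
template <typename T>
void take_buffer(T* /*tmp*/) {}

int main() {
    take_buffer(static_cast<float*>(nullptr)); // OK: T deduced as float from the cast
    // take_buffer(nullptr);                   // error: T cannot be deduced from std::nullptr_t
    take_buffer<float>(nullptr);               // OK: T given explicitly, as in the PR's call sites
    return 0;
}
```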
Diff excerpt (truncated):

```cpp
}
transpose_16NxK<TDST, precision_of<TDST>::value>(dst,
                                                 tmp,
                                                 reinterpret_cast<TDST*>(0),
```
Copilot AI · Dec 11, 2025
Using `reinterpret_cast<TDST*>(0)` to represent `nullptr` is non-idiomatic and less clear. Replace with `nullptr` for better readability.
Suggested change:

```diff
-reinterpret_cast<TDST*>(0),
+nullptr,
```
Replaced with nullptr.
Diff excerpt (truncated):

```cpp
attn_dequant_by_channel_kernel<TDST,
                               SRC_PREC>(s, t, N, K, K / sub_byte_multiplier, src_stride, p_scales, p_zps);
} else {
    static_assert(SRC_PREC == ov::element::i8, "i8 doesn't support by-channel quantization");
```
The assertion fails for types other than i8, but the error message suggests that i8 is the unsupported one. Should the condition be `SRC_PREC != ov::element::i8`?
Yes, you are right. I've fixed it in the latest commit, thanks!
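To make the agreed fix concrete, here is a minimal before/after of the assertion; only the flipped comparison comes from the discussion above, and the surrounding branch is assumed unchanged.

```cpp
// Before: the assertion fires for every precision except i8, contradicting its message.
static_assert(SRC_PREC == ov::element::i8, "i8 doesn't support by-channel quantization");

// After (as agreed above): the assertion fires exactly when SRC_PREC is i8.
static_assert(SRC_PREC != ov::element::i8, "i8 doesn't support by-channel quantization");
```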

Details:
Tickets: