@mcfi commented Nov 14, 2025

Summary:
This change adds a vectorized requantize_ for Arm64 using NEON intrinsics:

  1. The newly added NEON intrinsics follow the approach of the existing AVX2 code (a rough sketch of the pattern follows this summary).
  2. The scalar loop was moved into a new function, requantize_i8dw_ref_, to make the code more readable and testable.
  3. New tests verify that requantize_ and requantize_i8dw_ref_ produce identical results.

Differential Revision: D86216347
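
The summary above is terse, so below is a minimal, self-contained sketch of the pattern it describes: a scalar reference requantization loop plus a NEON-vectorized loop that converts int32 accumulators to uint8 and falls back to the scalar path for the tail, followed by a small check that the two agree. Everything in the sketch (function names, signatures, the scale/zero-point rounding scheme) is an illustrative assumption, not FBGEMM's actual requantize_ / requantize_i8dw_ref_ implementation.

```cpp
// Hypothetical sketch only: names, signatures, and the rounding/clamping
// scheme are illustrative, not FBGEMM's actual requantize_ /
// requantize_i8dw_ref_ API.
#include <arm_neon.h>

#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <random>
#include <vector>

// Scalar reference loop, in the spirit of requantize_i8dw_ref_:
// out[i] = clamp(round(acc[i] * scale) + zero_point, 0, 255).
void requantize_ref(const int32_t* acc, uint8_t* out, int n,
                    float scale, int32_t zero_point) {
  for (int i = 0; i < n; ++i) {
    int32_t r =
        static_cast<int32_t>(std::nearbyintf(acc[i] * scale)) + zero_point;
    out[i] = static_cast<uint8_t>(std::min(255, std::max(0, r)));
  }
}

// Vectorized variant: 4 accumulators per iteration with NEON, scalar tail.
void requantize_neon(const int32_t* acc, uint8_t* out, int n,
                     float scale, int32_t zero_point) {
  const float32x4_t vscale = vdupq_n_f32(scale);
  const int32x4_t vzp = vdupq_n_s32(zero_point);
  int i = 0;
  for (; i + 4 <= n; i += 4) {
    int32x4_t a = vld1q_s32(acc + i);
    // Multiply by scale in float, round to nearest even, add the zero point.
    float32x4_t f = vmulq_f32(vcvtq_f32_s32(a), vscale);
    int32x4_t r = vaddq_s32(vcvtnq_s32_f32(f), vzp);
    // Saturating narrow: int32 -> int16 -> uint8 (clamps to [0, 255]).
    int16x4_t r16 = vqmovn_s32(r);
    uint8x8_t r8 = vqmovun_s16(vcombine_s16(r16, r16));
    uint8_t tmp[8];
    vst1_u8(tmp, r8);
    std::memcpy(out + i, tmp, 4);  // keep only the 4 valid lanes
  }
  requantize_ref(acc + i, out + i, n - i, scale, zero_point);  // leftover tail
}

// A check along the lines of item 3: both paths must agree byte for byte.
int main() {
  std::mt19937 rng(42);
  std::uniform_int_distribution<int32_t> dist(-100000, 100000);
  for (int n : {1, 7, 16, 1000}) {
    std::vector<int32_t> acc(n);
    for (auto& v : acc) v = dist(rng);
    std::vector<uint8_t> out_ref(n), out_neon(n);
    requantize_ref(acc.data(), out_ref.data(), n, 0.0015f, 3);
    requantize_neon(acc.data(), out_neon.data(), n, 0.0015f, 3);
    assert(out_ref == out_neon);
  }
  return 0;
}
```

The actual kernel presumably mirrors the existing AVX2 code more closely (per-channel scales, wider unrolling, and so on); the sketch only illustrates the vectorize-then-saturating-narrow structure and the vectorized/reference equivalence test.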

meta-codesync bot commented Nov 14, 2025

@mcfi has exported this pull request. If you are a Meta employee, you can view the originating Diff in D86216347.

mcfi added a commit to mcfi/FBGEMM that referenced this pull request Nov 14, 2025
Summary:
Pull Request resolved: pytorch#5130

X-link: https://github.com/facebookresearch/FBGEMM/pull/2132

This change adds a vectorized requantize_ for Arm64 using NEON intrinsics:
1. The newly added NEON intrinsics follow the approach of the existing AVX2 code.
2. The scalar loop was moved into a new function, requantize_i8dw_ref_, to make the code more readable and testable.
3. New tests verify that requantize_ and requantize_i8dw_ref_ produce identical results.

Reviewed By: Nicoshev

Differential Revision: D86216347
meta-codesync bot commented Nov 19, 2025

This pull request has been merged in 643894e.
