Commit 643894e
Vectorize requantize_ for Arm64 with NEON intrinsics (#5130)
Summary:
Pull Request resolved: #5130
X-link: https://github.com/facebookresearch/FBGEMM/pull/2132
This change added a vectorized requantize_ for Arm64 with NEON intrinsics:
1. The newly added NEON intrinsics follows what the existing AVX2 code does.
2. The scalar loop was moved to a new function requantize_i8dw_ref_ to make the code more readable and testable.
3. Added new tests to make sure requantize_ and requantize_i8dw_ref_ produce identical results.
Reviewed By: Nicoshev
Differential Revision: D86216347
fbshipit-source-id: 1e61ec1bd4ea6ff571de96c59b0191e0822588781 parent 47e79da commit 643894e
File tree
6 files changed
+763
-54
lines changed- src
- test
6 files changed
+763
-54
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
11 | | - | |
| 11 | + | |
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
15 | | - | |
| 15 | + | |
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
11 | | - | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
12 | 14 | | |
13 | 15 | | |
14 | 16 | | |
15 | 17 | | |
16 | 18 | | |
17 | | - | |
18 | | - | |
19 | 19 | | |
20 | | - | |
21 | | - | |
22 | 20 | | |
23 | 21 | | |
24 | 22 | | |
25 | 23 | | |
26 | 24 | | |
27 | 25 | | |
28 | | - | |
29 | | - | |
30 | 26 | | |
31 | 27 | | |
32 | 28 | | |
| |||
47 | 43 | | |
48 | 44 | | |
49 | 45 | | |
| 46 | + | |
| 47 | + | |
50 | 48 | | |
51 | 49 | | |
52 | 50 | | |
| |||
73 | 71 | | |
74 | 72 | | |
75 | 73 | | |
76 | | - | |
77 | 74 | | |
78 | 75 | | |
79 | 76 | | |
| |||
502 | 499 | | |
503 | 500 | | |
504 | 501 | | |
| 502 | + | |
505 | 503 | | |
506 | | - | |
507 | | - | |
508 | | - | |
509 | | - | |
510 | | - | |
511 | | - | |
512 | | - | |
513 | | - | |
514 | | - | |
515 | | - | |
516 | | - | |
517 | | - | |
518 | | - | |
519 | | - | |
520 | | - | |
521 | | - | |
522 | | - | |
523 | | - | |
524 | | - | |
525 | | - | |
526 | | - | |
527 | | - | |
528 | | - | |
529 | | - | |
530 | | - | |
531 | | - | |
532 | | - | |
533 | | - | |
534 | | - | |
535 | | - | |
536 | | - | |
537 | | - | |
538 | | - | |
539 | | - | |
540 | | - | |
541 | | - | |
542 | | - | |
543 | | - | |
544 | | - | |
545 | | - | |
546 | | - | |
547 | | - | |
548 | | - | |
549 | | - | |
| 504 | + | |
| 505 | + | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
| 514 | + | |
| 515 | + | |
| 516 | + | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
550 | 524 | | |
551 | 525 | | |
552 | 526 | | |
| 527 | + | |
| 528 | + | |
0 commit comments