@geesun (Contributor) commented Nov 7, 2025

Add an article: "How to Benchmark a Single KleidiAI Micro-kernel in ExecuTorch".

The article covers the following:

  • Cross-compile ExecuTorch for the ARM64 platform, enabling XNNPACK and KleidiAI with SME2 support.

  • Create ExecuTorch models that can be accelerated by SME2 through KleidiAI.

  • Use the executor_runner tool to generate ETDump profiling data.

  • Analyze the contents of ETRecord and ETDump using the ExecuTorch Inspector API.

  • I have reviewed Create a Learning Path

Please do not include any confidential information in your contribution. This includes confidential microarchitecture details and unannounced product information.

  • I have checked my contribution for confidential information

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the Creative Commons Attribution 4.0 International License.

@geesun force-pushed the kai-performance branch 2 times, most recently from 19acc01 to 674d462 (Nov 10, 2025 02:36)
@geesun changed the title from "Add How to Measure Kleidai Kernel Performance in ExecuTorch" to "Add How to Benchmark a Single KleidiAI Micro-kernel in ExecuTorch" (Nov 10, 2025)
@geesun force-pushed the kai-performance branch 4 times, most recently from 0080950 to 9e4902d (Nov 12, 2025 02:15)
@pareenaverma (Contributor) commented

Merging into main for tech review.

@pareenaverma merged commit 764cba4 into ArmDeveloperEcosystem:main on Nov 18, 2025
3 checks passed