
[QNN] Deployment issues with Gemma 3 4B on QCS8550 via executorch-examples #18113

@EdisonPeee


Environment
SoC: Qualcomm QCS8550

Device OS: Android 13

QNN SDK: 2.42

Host OS: Ubuntu 22.04

Source: Latest executorch and executorch-examples repositories.

Problem Summary
I am currently using the executorch-examples (LlamaDemo) to evaluate LLM performance on the QCS8550 platform. While I can successfully run Llama 3.2 1B on the QNN backend, I am facing significant issues when attempting to deploy Gemma 3 4B.

Model Export Issue: When using the provided export tools to convert Gemma 3 4B for the QNN backend, the process fails with a TypeError: 'NoneType' object is not callable during the recipe application stage. It seems the QNN-specific export path for Gemma 3 is not yet fully integrated or requires specific configurations not documented in the examples.
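For reference, one common way this exact TypeError arises in Python is a registry lookup that silently returns None for an unregistered model, followed by a call on the result. The sketch below only illustrates that failure mode and a guard that reports it more clearly. It is not confirmed to be the cause here, and every name in it (RECIPES, apply_recipe, the model keys) is hypothetical and not taken from the ExecuTorch code base.

```python
from typing import Callable, Dict

# Hypothetical recipe registry: maps a model key to a function that
# "applies" an export recipe. Only one model is registered on purpose.
Recipe = Callable[[str], str]
RECIPES: Dict[str, Recipe] = {
    "llama3_2-1b": lambda model: f"{model}: recipe applied",
}


def apply_recipe_unguarded(model_name: str) -> str:
    # dict.get returns None for an unknown key, so calling the result
    # raises: TypeError: 'NoneType' object is not callable
    return RECIPES.get(model_name)(model_name)


def apply_recipe(model_name: str) -> str:
    # Guarded variant: fail early with a message that names the model
    # and lists what is actually registered.
    recipe = RECIPES.get(model_name)
    if recipe is None:
        known = ", ".join(sorted(RECIPES))
        raise KeyError(f"no recipe registered for {model_name!r}; known: {known}")
    return recipe(model_name)
```

If the export path has a similar lookup, checking what it returns for Gemma 3 just before the failing call should show whether a recipe is missing for this model.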

Runtime Crash (Method Mismatch): I tried to use a pre-exported Gemma 3 4B .pte model on the Android LlamaDemo app. However, the app crashes during model loading with the following error:
ExecuTorch E No method named 'kv_forward' in program
It appears the Llama-based runner is hardcoded to look for kv_forward, while Gemma 3/VLM models might be using a different entry point (e.g., forward).
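To make the suspected mismatch concrete, here is a minimal sketch of the fallback I would expect a runner to apply: try kv_forward first, then fall back to forward if the program exposes it. This is plain Python for illustration only, not the actual runner code. The function name pick_entry_method and the preference order are my assumptions. The list of available names would have to come from the .pte file itself; I believe the ExecuTorch Python runtime can list them, but I have not verified that against the QNN-exported file.

```python
from typing import Iterable, Sequence


def pick_entry_method(
    available: Iterable[str],
    preferred: Sequence[str] = ("kv_forward", "forward"),
) -> str:
    """Return the first preferred method name that the program exposes."""
    names = set(available)
    for name in preferred:
        if name in names:
            return name
    # Nothing matched: report both sides so the mismatch is obvious.
    raise LookupError(
        f"none of {list(preferred)} found in program; available: {sorted(names)}"
    )


# A Llama-style export that exposes kv_forward keeps working.
print(pick_entry_method(["kv_forward", "prefill_forward"]))
# A hypothetical Gemma 3 export that only exposes forward falls back.
print(pick_entry_method(["forward"]))
```

With a fallback like this, a program that lacks both names would fail with an error listing the methods it does contain, instead of only reporting the missing kv_forward.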

Questions
Does the current QNN delegate and the Android LlamaDemo example officially support Gemma 3 4B, or is the QNN support currently limited to the Llama/Qwen families?

For multimodal models like Gemma 3, is there a recommended way to handle the method name mismatch in the Android runner when using the QNN backend?

Are there specific export flags or recipes required to successfully target the HTP (NPU) on QCS8550 for Gemma 3 models?

I would appreciate any insights or recommended commits that stabilize Gemma 3 support for Qualcomm QNN.

cc @cccclai @winskuo-quic @shewu-quic @haowhsu-quic @DannyYuyang-quic @cbilgin

Labels: module: qnn, partner: qualcomm
