[Relax][ONNX] Add ONNX Backend Tests for systematic frontend coverage #19515
Aharrypotter wants to merge 2 commits into apache:main
Conversation
…frontend coverage (apache#19505) Add a test harness that wraps the official ONNX Backend Test Suite (Node Tests) around the Relax ONNX importer. This gives systematic, spec-aligned coverage of 116 operators with 533 passing tests, replacing hand-written edge-case models with standardized protobuf test data. The runner follows the standard `onnx.backend.base.Backend` interface, using `from_onnx()` → `DecomposeOpsForInference()` → `LegalizeOps()` → `tvm.compile()` → `VirtualMachine` to execute each test case. Known failures are tracked via `xfail` by category (trig precision, quantization edge cases, dynamic split, etc.).
Code Review
This pull request adds a systematic verification suite for the Relax ONNX importer using the official ONNX Backend Test Suite, including a new pytest marker and a backend adapter. Review feedback identifies a bug where tvm.runtime.Tensor is used instead of tvm.runtime.NDArray and a logic error in input mapping where initializers should be filtered out from the graph inputs to align with the ONNX test runner's behavior.
tests/python/relax/test_frontend_onnx_backend.py (92)
tvm.runtime.Tensor is not a valid class in the TVM Python API. It should be tvm.runtime.NDArray (or tvm.nd.NDArray). Using the incorrect class name will result in an AttributeError when the test runner attempts to verify outputs.
if isinstance(output, (tvm.runtime.NDArray, np.ndarray)):
tests/python/relax/test_frontend_onnx_backend.py (123)
In ONNX, model.graph.input includes both model inputs and initializers (which serve as default values). The ONNX backend test runner typically only provides positional values in the inputs list for elements that are not initializers. Mapping positional inputs to graph.input directly in run() will lead to an incorrect mapping if initializers are interspersed in the input list. Filtering graph_input_names here to exclude initializers ensures that the positional mapping in TVMRelaxBackendRep.run aligns with the test runner's behavior.
initializer_names = {t.name for t in model.graph.initializer}
graph_input_names = [inp.name for inp in model.graph.input if inp.name not in initializer_names]
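To see why the suggested filter matters, here is a dependency-free illustration of the misalignment; the stand-in objects and value names are invented for demonstration and do not come from the PR.

```python
from types import SimpleNamespace

# Stand-in for an ONNX graph: "weight" appears in graph.input AND as an
# initializer (i.e. it has a default value baked into the model).
graph = SimpleNamespace(
    input=[SimpleNamespace(name="x"),
           SimpleNamespace(name="weight"),
           SimpleNamespace(name="y")],
    initializer=[SimpleNamespace(name="weight")],
)

# The backend test runner only supplies values for non-initializer inputs.
positional_inputs = ["x_value", "y_value"]

# Naive mapping: "weight" wrongly consumes y's value, and "y" gets nothing.
naive = dict(zip([i.name for i in graph.input], positional_inputs))
assert naive == {"x": "x_value", "weight": "y_value"}

# Filtered mapping (the review's suggestion): skip initializer names first.
init_names = {t.name for t in graph.initializer}
real_inputs = [i.name for i in graph.input if i.name not in init_names]
fixed = dict(zip(real_inputs, positional_inputs))
assert fixed == {"x": "x_value", "y": "y_value"}
```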
Thanks for the review. Addressing both points:

1. `tvm.runtime.Tensor` is valid in current TVM; it is backed by `tvm_ffi`:

>>> import tvm
>>> hasattr(tvm.runtime, "Tensor")
True
>>> tvm.runtime.Tensor.__mro__
(<class 'tvm.runtime._tensor.Tensor'>, <class 'tvm_ffi.core.Tensor'>, ...)

2. Excluding initializers from `graph.input`: the ONNX Backend Test data does not use initializers; all inputs (including weights) are provided as positional test inputs, so the current positional mapping logic is correct for this test data format. That said, adding the initializer filter is a good defensive measure for robustness: if the test data format ever changes or a third-party test suite uses initializers, the filter would prevent silent misalignment.
Let's try to run the new tests in CI and see if they increase CI pressure. @Aharrypotter

@tvm-bot run slow tests
This PR introduces a test runner that reuses the official ONNX Backend Test suite to systematically cover relax.frontend.onnx.
- Node-level test filtering via BackendTest._test_items
- ONNX backend pytest marker
- SKIP_SLOW_TESTS support
- Documented xfails for known importer gaps
Force-pushed from 9b3785e to a7b6c01.
Just curious, can we completely move to the new backend tests, or do we still need to maintain the old ones?
I think it's fine to move to new backend tests @mshr-h |
I checked Sequence, Attention, and Quantization locally. Quantization has a few passing cases, but enabling it cleanly would require very specific per-test include patterns. Attention also needs separate work: the ONNX backend tests use the standard Q/K/V Attention form, while the current Relax converter seems to support the older Microsoft-style packed-QKV path. Sequence has similar issues, mostly around runtime sequence inputs and dynamic positions. Given that, I would keep this PR focused on the initial stable subset and track these categories as follow-up items.

I think we can move in that direction, but probably not completely in one step. The ONNX Backend Tests are very useful for standard operator semantic coverage, and they can replace some duplicated hand-written semantic tests over time. However, I agree that having two ONNX frontend test files could be confusing unless the boundary is clear. My intended split is that the backend tests cover standard ONNX operator semantics, while test_frontend_onnx.py keeps the TVM-specific importer tests. For this PR, I would keep the scope to landing the backend-test runner and the initial stable subset. Then, in a follow-up, we can audit test_frontend_onnx.py for duplicated semantic coverage.
Summary
Introduce a test runner that reuses the official ONNX Backend Test Suite to systematically verify the Relax ONNX importer. This complements the existing hand-written tests in test_frontend_onnx.py by providing spec-aligned coverage of standard ONNX operator semantics.

Towards #19505
Motivation
The existing test_frontend_onnx.py has 187 hand-written tests that validate TVM-specific importer behavior (parameter handling, name sanitization, dynamic shapes, Relax IR structure). However, it relies on ONNX Runtime as the reference and cannot systematically cover all edge cases defined in the ONNX specification.

The ONNX Backend Test Suite provides 1653+ node-level tests with protobuf reference inputs/outputs. It is the industry standard for validating ONNX importers/exporters (used by ONNX Runtime, TensorFlow, PyTorch). Reusing it gives Relax a living, upstream-aligned correctness baseline.
What this PR adds
tests/python/relax/test_frontend_onnx_backend.py: a backend adapter (TVMRelaxBackend) that implements the onnx.backend.base.Backend interface, wiring from_onnx() → DecomposeOpsForInference() → LegalizeOps() → tvm.compile() → VirtualMachine.

Coverage
72 operators with 388 test cases, all passing. Only operators where every ONNX node test passes are included — no xfail markers.
Operators not yet covered include: cast (exotic dtypes), reduce ops (edge cases), reshape/resize/attention (complex behavior), quantization, and several others with known importer gaps. These can be added incrementally as the importer improves.
Test results
388 passed, 3216 skipped (CUDA variants + operators not yet in allowlist), 0 failed, 0 xfailed
CI impact
Design decisions
- test_frontend_onnx.py remains unchanged. Backend tests cover standard ONNX semantics; hand-written tests continue to cover TVM-specific behavior (dynamic shapes, Relax IR structure, importer options).
- Test selection uses backend_test.include() with ^-anchored regex patterns. No access to private ONNX APIs.
- include() patterns use ^test_{op}(?:_.*)?(?:_cpu|_cuda)$, which can cause false matches when a short op name is a prefix of a longer one (e.g. log vs log_softmax). Affected ops (log, max, relu) are excluded until a more precise matching strategy is adopted.
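The prefix-collision issue described above can be reproduced directly with Python's re module; the helper name below is invented for illustration, but the pattern shape is the one quoted in this PR.

```python
import re

def include_pattern(op: str) -> str:
    # Pattern shape used by the allowlist, per the PR description.
    return rf"^test_{op}(?:_.*)?(?:_cpu|_cuda)$"

log_pat = re.compile(include_pattern("log"))

assert log_pat.match("test_log_cpu")          # intended: the bare op test
assert log_pat.match("test_log_example_cpu")  # intended: an op variant
# False positive: "log" is a prefix of "log_softmax", so the optional
# (?:_.*)? group happily swallows "_softmax" and log_softmax tests match too.
assert log_pat.match("test_log_softmax_cpu")
```

This is why ops like log, max, and relu are excluded for now: any tighter match would need either an explicit list of each op's known variants or a negative lookahead over colliding op names.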