Skip to content

feat(perf): add --runtime [ort|openvino] to compare ORT vs OpenVINO #960

Open
xieofxie wants to merge 3 commits into main from hualxie/run_ov

@xieofxie

Contributor

What

Adds a --runtime [ort|openvino] flag to winml perf so the same ONNX file can be benchmarked on ONNX Runtime vs OpenVINO Runtime for a side-by-side comparison.

winml perf -m model.onnx --runtime openvino --device gpu
winml perf -m model.onnx --runtime ort --ep cpu   # ORT-native baseline
  • Default is ort — existing behavior is unchanged.
  • --runtime openvino reads the raw ONNX file directly through OpenVINO Runtime, with no quantize/optimize/compile build step, so both runtimes are measured on the same graph. Only ONNX input is supported.

How

  • OpenVINOSession (session/openvino/openvino_session.py) mirrors the subset of WinMLSession the perf engine uses — compile() / run() / perf() plus io_config / device / ep_name / running_model_path. It reuses get_io_config, load_onnx, PerfStats, and WinMLSession._get_precision, so I/O metadata and reports match the ORT path. No model-specific logic.
  • _OpenVINOModel adapter in perf.py exposes the _single surface the benchmark engine reads, so _run_single / _run_benchmark* / reporting are untouched. PerfBenchmark._load_model() branches to it and skips WinMLAutoModel + ORT EP resolution entirely (OpenVINO is independent of ORT's EPs).
  • --device maps cpu/gpu/npu/auto → OpenVINO CPU/GPU/NPU/AUTO. compile() fails fast against Core().available_devices with a readable message instead of a raw backend stack trace.
  • RuntimeName Literal + RUNTIME_NAMES in constants.py (mirrors CompilerName) — the CLI choice list and the typed config field derive from one source.
  • CLI guards: --runtime openvino requires a .onnx input and rejects --module.
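The single-source runtime list and the CLI guards described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: `RuntimeName` and `RUNTIME_NAMES` are named in the description, but deriving the tuple with `typing.get_args` is an assumption, and `validate_runtime_args` with its parameters and messages is a hypothetical helper.

```python
from pathlib import Path
from typing import Literal, Optional, Tuple, get_args

# Names from the PR description; deriving the tuple via get_args is an assumption.
RuntimeName = Literal["ort", "openvino"]
RUNTIME_NAMES: Tuple[str, ...] = get_args(RuntimeName)


def validate_runtime_args(runtime: str, model: str, module: Optional[str] = None) -> None:
    """Hypothetical guard: reject inputs that --runtime openvino cannot honor."""
    if runtime not in RUNTIME_NAMES:
        raise ValueError(f"--runtime must be one of {list(RUNTIME_NAMES)}, got {runtime!r}")
    if runtime != "openvino":
        return  # the ORT path keeps its existing behavior
    if Path(model).suffix.lower() != ".onnx":
        raise ValueError("--runtime openvino requires a .onnx input")
    if module is not None:
        raise ValueError("--runtime openvino does not support --module")
```

Because the choice list comes from the `Literal`, adding a runtime in one place would update both the type and the accepted CLI values.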

Verified locally

  • --runtime openvino runs on CPU and GPU end-to-end; latency/throughput populated.
  • --monitor works on CPU and GPU (HW utilization via PDH; falls back to NullEPMonitor like most EPs — no OV-specific ep_proof telemetry yet).
  • Absent device (NPU here) → friendly error: OpenVINO device 'NPU' ... is not available. OpenVINO sees: ['CPU', 'GPU'].
  • New unit tests (tests/unit/session/test_openvino_session.py, gated on importorskip("openvino")) + CLI guard tests; all existing perf tests pass; ruff clean.

Notes / follow-ups

  • --ep and quant/optimize flags are intentional no-ops under --runtime openvino (raw ONNX) — documented in the flag help.
  • On machines where the WinML registry installs the OpenVINO EP, --runtime ort --device cpu already routes ORT→OpenVINO EP; use --runtime ort --ep cpu for a true ORT-native baseline.
  • Possible follow-up: surface EXECUTION_DEVICES in the report so AUTO-mode fallbacks are visible.

Closes #948

🤖 Generated with Claude Code

xieofxie added 2 commits June 24, 2026 16:04
- Add RuntimeName Literal + RUNTIME_NAMES to constants (mirrors CompilerName),
  thread it through BenchmarkConfig and the perf CLI instead of bare str.
- Fail fast in OpenVINOSession.compile() when the requested device is absent
  from Core().available_devices, with a readable message instead of a raw
  backend stack trace. AUTO is exempt; matches plain (GPU) and indexed (GPU.0)
  device names.
- Add a hardware-independent unit test for the unavailable-device path.
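A minimal sketch of the fail-fast device check follows, written so that it needs no OpenVINO install. The `cpu/gpu/npu/auto` mapping is from the PR description. `check_device`, its signature, and the exact error wording are hypothetical. In real use the `available` argument would presumably be `openvino.Core().available_devices`.

```python
from typing import Dict, Iterable

# Mapping stated in the PR description: CLI --device value -> OpenVINO device name.
DEVICE_MAP: Dict[str, str] = {"cpu": "CPU", "gpu": "GPU", "npu": "NPU", "auto": "AUTO"}


def check_device(requested: str, available: Iterable[str]) -> str:
    """Hypothetical helper: return the OpenVINO device name or raise a readable error."""
    device = DEVICE_MAP[requested.lower()]
    if device == "AUTO":
        return device  # AUTO lets OpenVINO pick a device, so it is exempt
    seen = list(available)
    # Accept a plain name ("GPU") or an indexed one ("GPU.0", "GPU.1").
    if any(name.split(".", 1)[0] == device for name in seen):
        return device
    raise RuntimeError(
        f"OpenVINO device {device!r} is not available. OpenVINO sees: {seen}"
    )
```

Taking the device list as a parameter is what keeps such a check testable on machines without the hardware, in the spirit of the hardware-independent unit test mentioned above.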
xieofxie requested a review from a team as a code owner June 24, 2026 08:25
…ssing

Wrap the openvino import in OpenVINOSession.compile() so an absent package
raises a clear install hint (pip install winml-cli[openvino]) instead of a
bare ModuleNotFoundError. Add a unit test that simulates the missing module.
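The guarded import could be sketched as below. `import_openvino` is a hypothetical helper name, and the raised exception type and message text are assumptions. Only the install hint `pip install winml-cli[openvino]` comes from the commit message.

```python
import importlib


def import_openvino():
    """Hypothetical helper: import openvino lazily, or explain how to install it."""
    try:
        return importlib.import_module("openvino")
    except ModuleNotFoundError as exc:
        # Chain the original error so the root cause stays visible in tracebacks.
        raise ImportError(
            "OpenVINO Runtime is not installed. "
            "Install it with: pip install winml-cli[openvino]"
        ) from exc
```

A test can simulate the missing package by setting `sys.modules["openvino"] = None`, which makes the import machinery raise `ModuleNotFoundError` without uninstalling anything.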

Development

Successfully merging this pull request may close these issues.

feat: implement openvino run in perf to compare ?
