feat(perf): add --runtime [ort|openvino] to compare ORT vs OpenVINO #960
Open
xieofxie wants to merge 3 commits into
- Add RuntimeName Literal + RUNTIME_NAMES to constants (mirrors CompilerName), and thread it through BenchmarkConfig and the perf CLI instead of a bare str.
- Fail fast in OpenVINOSession.compile() when the requested device is absent from Core().available_devices, with a readable message instead of a raw backend stack trace. AUTO is exempt; the check matches both plain (GPU) and indexed (GPU.0) device names.
- Add a hardware-independent unit test for the unavailable-device path.
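For reviewers, here is a minimal sketch of what such a fail-fast device check could look like. It is not the code from this PR: the helper name `check_device_available`, the `RuntimeError` type, the exact message wording, and the matching rule (exact match, or family match when either side is a plain name) are all assumptions made for illustration. The list of devices is passed in as a parameter so the sketch runs without OpenVINO installed.

```python
from typing import Iterable


def _family(name: str) -> str:
    """Return the device family, e.g. 'GPU' for both 'GPU' and 'GPU.0'."""
    return name.split(".", 1)[0].upper()


def check_device_available(requested: str, available: Iterable[str]) -> None:
    """Raise a readable error if `requested` is not in `available`.

    Hypothetical helper; `available` stands in for Core().available_devices.
    """
    available = list(available)
    wanted = requested.upper()

    # AUTO is exempt from the check, as the commit message states.
    if wanted == "AUTO":
        return

    for dev in available:
        have = dev.upper()
        if have == wanted:
            return
        # Assumed rule: a plain name (GPU) matches any index of the same
        # family (GPU.0), and vice versa; two different indices never match.
        either_plain = "." not in have or "." not in wanted
        if either_plain and _family(have) == _family(wanted):
            return

    # Illustrative wording only, not the PR's exact message.
    raise RuntimeError(
        f"OpenVINO device {requested!r} is not available. "
        f"OpenVINO sees: {available}."
    )
```

In a real session the second argument would presumably be Core().available_devices, which is why a unit test for the unavailable-device path can stay hardware-independent: it only needs to supply a fake device list.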
…ssing

Wrap the openvino import in OpenVINOSession.compile() so an absent package raises a clear install hint (pip install winml-cli[openvino]) instead of a bare ModuleNotFoundError. Add a unit test that simulates the missing module.
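The import guard described in this commit could be sketched roughly as follows. The function name `load_openvino`, the choice to re-raise as `ImportError`, and the message text are assumptions for illustration; only the install hint itself (pip install winml-cli[openvino]) comes from the commit message.

```python
import importlib
from types import ModuleType


def load_openvino() -> ModuleType:
    """Import openvino lazily, turning a missing package into an install hint.

    Hypothetical helper; in the PR the guard lives inside
    OpenVINOSession.compile().
    """
    try:
        # ModuleNotFoundError is a subclass of ImportError, so this also
        # covers the "package not installed" case.
        return importlib.import_module("openvino")
    except ImportError as exc:
        raise ImportError(
            "OpenVINO Runtime is not installed. "
            "Install it with: pip install winml-cli[openvino]"
        ) from exc
```

One way to simulate the missing module in a test, without uninstalling anything, is to set `sys.modules["openvino"] = None` before the call, which makes the import machinery raise ModuleNotFoundError for that name.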
What
Adds a --runtime [ort|openvino] flag to winml perf so the same ONNX file can be benchmarked on ONNX Runtime vs OpenVINO Runtime for a side-by-side comparison.

    winml perf -m model.onnx --runtime openvino --device gpu
    winml perf -m model.onnx --runtime ort --ep cpu   # ORT-native baseline

- ort: existing behavior is unchanged.
- --runtime openvino reads the raw ONNX directly via OpenVINO Runtime (no quantize/optimize/compile build), which is the fair, simple comparison on the same graph. ONNX input only.

How
- OpenVINOSession (session/openvino/openvino_session.py) mirrors the subset of WinMLSession the perf engine uses: compile() / run() / perf() plus io_config / device / ep_name / running_model_path. It reuses get_io_config, load_onnx, PerfStats, and WinMLSession._get_precision, so I/O metadata and reports match the ORT path. No model-specific logic.
- _OpenVINOModel adapter in perf.py exposes the _single surface the benchmark engine reads, so _run_single / _run_benchmark* / reporting are untouched. PerfBenchmark._load_model() branches to it and skips WinMLAutoModel + ORT EP resolution entirely (OpenVINO is independent of ORT's EPs).
- --device maps cpu / gpu / npu / auto → OpenVINO CPU / GPU / NPU / AUTO. compile() fails fast against Core().available_devices with a readable message instead of a raw backend stack trace.
- RuntimeName Literal + RUNTIME_NAMES in constants.py (mirrors CompilerName): the CLI choice list and the typed config field derive from one source.
- --runtime openvino requires a .onnx input and rejects --module.

Verified locally
- --runtime openvino runs on CPU and GPU end-to-end; latency/throughput populated.
- --monitor works on CPU and GPU (HW utilization via PDH; falls back to NullEPMonitor like most EPs; no OV-specific ep_proof telemetry yet).
- Unavailable device fails fast with: OpenVINO device 'NPU' ... is not available. OpenVINO sees: ['CPU', 'GPU'].
- Unit tests (tests/unit/session/test_openvino_session.py, gated on importorskip("openvino")) + CLI guard tests; all existing perf tests pass; ruff clean.

Notes / follow-ups
- --ep and quant/optimize flags are intentional no-ops under --runtime openvino (raw ONNX), documented in the flag help.
- --runtime ort --device cpu already routes ORT→OpenVINO EP; use --runtime ort --ep cpu for a true ORT-native baseline.
- Follow-up: EXECUTION_DEVICES in the report so AUTO-mode fallbacks are visible.

Closes #948
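For readers skimming the How section, the single-source runtime list, the --device mapping, and the .onnx / --module guard could be sketched together like this. Only RuntimeName, RUNTIME_NAMES, and the cpu/gpu/npu/auto → CPU/GPU/NPU/AUTO mapping come from the description; `OPENVINO_DEVICES`, `to_openvino_device`, `validate_openvino_inputs`, the `ValueError` type, and the message wording are hypothetical names and choices, not the PR's actual code.

```python
from pathlib import Path
from typing import Literal, Optional, Tuple, get_args

# One source of truth: the CLI choice list is derived from the Literal,
# so the typed config field and the flag cannot drift apart.
RuntimeName = Literal["ort", "openvino"]
RUNTIME_NAMES: Tuple[str, ...] = get_args(RuntimeName)

# Hypothetical table for the --device mapping described above.
OPENVINO_DEVICES = {"cpu": "CPU", "gpu": "GPU", "npu": "NPU", "auto": "AUTO"}


def to_openvino_device(device: str) -> str:
    """Map a CLI --device value to an OpenVINO device name."""
    try:
        return OPENVINO_DEVICES[device.lower()]
    except KeyError:
        raise ValueError(
            f"Unsupported device {device!r}; "
            f"expected one of {sorted(OPENVINO_DEVICES)}"
        ) from None


def validate_openvino_inputs(model: Optional[str], module: Optional[str]) -> None:
    """Guard for --runtime openvino: .onnx input only, no --module."""
    if module is not None:
        raise ValueError("--runtime openvino does not support --module")
    if model is None or Path(model).suffix.lower() != ".onnx":
        raise ValueError("--runtime openvino requires a .onnx input")
```

Deriving RUNTIME_NAMES with `get_args` means adding a third runtime later is a one-line change to the Literal.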
🤖 Generated with Claude Code