[5951713] Fix benchmark allocation failure by willg-nv · Pull Request #978 · NVIDIA/Model-Optimizer

willg-nv · 2026-03-05T02:58:22Z

What does this PR do?

[modelopt][onnx] - ERROR - Benchmark failed: Converting dtype('float16') to a ctypes type
Traceback (most recent call last):
...
    raise NotImplementedError(
NotImplementedError: Converting dtype('float16') to a ctypes type
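
The failure can be reproduced with NumPy alone. The following is a minimal sketch, assuming the allocation helper resolves its element type through `np.ctypeslib.as_ctypes_type`; the traceback above is elided, so that call site is an assumption rather than something the log confirms.

```python
import ctypes
import numpy as np

# float32 has a ctypes counterpart (c_float), so the lookup succeeds.
float32_ctype = np.ctypeslib.as_ctypes_type(np.float32)
float32_size = ctypes.sizeof(float32_ctype)

# float16 has none: ctypes defines no half-precision type,
# so NumPy raises NotImplementedError instead of returning a type.
try:
    np.ctypeslib.as_ctypes_type(np.float16)
    error_message = None
except NotImplementedError as exc:
    error_message = str(exc)

print(error_message)
```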

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, using torch.load(..., weights_only=True), avoiding pickle, etc.).

Is this change backward compatible?: ✅ / ❌ / N/A
If you copied code from any other source, did you follow IP policy in CONTRIBUTING.md?: ✅ / ❌ / N/A
Did you write any new necessary tests?: ✅ / ❌ / N/A
Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

Release Notes

Bug Fixes
- Improved dtype handling robustness in host memory allocation to avoid failures for uncommon numeric types.
- Added fallback support for 2-byte floating-point formats (float16, bfloat16); clearer errors now raised when a dtype is unsupported.

Signed-off-by: Will Guo <willg@nvidia.com>

copy-pr-bot · 2026-03-05T02:58:26Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-03-05T02:58:36Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3287b5fa-98e6-430e-a6d8-68244fbe7a7d

📥 Commits

Reviewing files that changed from the base of the PR and between 06ce30b and 7345b99.

📒 Files selected for processing (1)

modelopt/onnx/quantization/autotune/benchmark.py

🚧 Files skipped from review as they are similar to previous changes (1)

modelopt/onnx/quantization/autotune/benchmark.py

📝 Walkthrough

Walkthrough

The change normalizes dtype to np.dtype in _alloc_pinned_host and then attempts a ctypes-based view of the allocated host memory. When no ctypes mapping exists for the dtype, it falls back to allocating as uint16 and viewing the buffer as the requested 2-byte dtype. If neither approach is possible, it raises TypeError. The return path is unchanged on success.

Changes

Cohort / File(s)	Summary
Dtype / Memory-allocation logic `modelopt/onnx/quantization/autotune/benchmark.py`	Explicitly convert `dtype` to `np.dtype` before computing `itemsize`; try to create a ctypes-based view of allocated host memory; if no ctypes mapping for 2-byte float types, allocate as `uint16` and view as requested dtype; raise `TypeError` when unsupported. Highlights: ctypes fallback for 2-byte types, unchanged success/error return behavior.
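
The fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the code merged in `benchmark.py`: `view_host_memory` is a hypothetical helper, and an ordinary ctypes array stands in for the pinned host allocation so the example runs without CUDA.

```python
import ctypes
import numpy as np


def view_host_memory(address, size, dtype):
    """Return a NumPy view of `size` elements of `dtype` at `address`.

    Hypothetical helper for illustration; the caller must keep the
    underlying memory alive for as long as the view is used.
    """
    dtype = np.dtype(dtype)  # accept np.float16, "float16", np.dtype(...)
    try:
        ctype = np.ctypeslib.as_ctypes_type(dtype)
    except NotImplementedError:
        # No ctypes mapping (e.g. float16). A 2-byte dtype can borrow
        # uint16 storage and be reinterpreted without copying.
        if dtype.itemsize != 2:
            raise TypeError(f"Unsupported dtype for host allocation: {dtype}") from None
        raw = (ctypes.c_uint16 * size).from_address(address)
        return np.ctypeslib.as_array(raw).view(dtype)
    return np.ctypeslib.as_array((ctype * size).from_address(address))


# Stand-in for pinned host memory: four 2-byte elements.
backing = (ctypes.c_uint16 * 4)()
host = view_host_memory(ctypes.addressof(backing), 4, np.float16)
host[:] = [1.0, 2.0, 0.5, -1.0]
```

Because the view aliases the original buffer, writes through `host` land directly in `backing`, which is the property a pinned staging buffer needs. Dtypes that have neither a ctypes mapping nor a 2-byte item size (for example `complex64`) raise `TypeError` in this sketch.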

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly identifies the specific issue being fixed (benchmark allocation failure) and matches the core change in the changeset (fixing NotImplementedError in memory allocation).
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Security Anti-Patterns	✅ Passed	The pull request introduces no new security anti-patterns. Modifications to benchmark.py are limited to the _alloc_pinned_host function, adding dtype normalization and ctypes-based fallback for 2-byte float types, with no involvement of critical security patterns.


willg-nv · 2026-03-05T02:58:41Z

@gcunhase please help review this PR.

modelopt/onnx/quantization/autotune/benchmark.py

gcunhase · 2026-03-05T14:58:06Z

/ok to test 06ce30b

codecov · 2026-03-05T17:35:45Z

Codecov Report

❌ Patch coverage is 0% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.08%. Comparing base (a34d613) to head (7345b99).
⚠️ Report is 9 commits behind head on main.

Files with missing lines	Patch %	Lines
modelopt/onnx/quantization/autotune/benchmark.py	0.00%	10 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #978      +/-   ##
==========================================
- Coverage   72.10%   72.08%   -0.02%     
==========================================
  Files         209      209              
  Lines       23628    23638      +10     
==========================================
+ Hits        17036    17040       +4     
- Misses       6592     6598       +6


Signed-off-by: Will Guo <willg@nvidia.com>

gcunhase · 2026-03-06T19:40:12Z

/ok to test 7345b99

Fix benchmark allocation failure

06ce30b

Signed-off-by: Will Guo <willg@nvidia.com>

willg-nv requested a review from a team as a code owner March 5, 2026 02:58

willg-nv requested a review from cjluo-nv March 5, 2026 02:58

gcunhase changed the title ~~Fix benchmark allocation failure~~ [5951713] Fix benchmark allocation failure Mar 5, 2026

gcunhase reviewed Mar 5, 2026

modelopt/onnx/quantization/autotune/benchmark.py (outdated, resolved)

gcunhase mentioned this pull request Mar 5, 2026

[OMNIML-3252][ONNX] Add real Q/DQ scales in Autotune #951

Open

resolve comments

7345b99

Signed-off-by: Will Guo <willg@nvidia.com>

gcunhase approved these changes Mar 6, 2026


gcunhase enabled auto-merge (squash) March 6, 2026 19:40

gcunhase merged commit be6dfad into NVIDIA:main Mar 6, 2026
40 checks passed

gcunhase mentioned this pull request Mar 6, 2026

[ONNX][Autotune] Replace CUDA memory management from CUDART to PyTorch #998

Open

gcunhase merged 2 commits into NVIDIA:main from willg-nv:dev-willg-fix-benchmark-bug
