Add Automated QDQ placement example - Part 4.1#841
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis pull request adds a timestamp to the log formatter in the ONNX logging configuration and introduces a comprehensive README for the ONNX autoqdq example, documenting quantization optimization with TensorRT. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
1b8d896 to
f36aa99
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@examples/qdq_placement/README.md`:
- Around line 191-192: Fix the typo in the README sentence about TensorRT remote
autotuning: change "autouning" to "autotuning" in the line that reads "TensorRT
10.16 support remote autotuning, pass remoteAutoTuningConfig to trtexec to
benchmark with remote autouning." to correctly spell "autotuning" and ensure the
sentence still reads clearly (e.g., "TensorRT 10.16 supports remote autotuning;
pass remoteAutoTuningConfig to trtexec to benchmark with remote autotuning.").
- Around line 128-129: The downloaded filename in the curl command is
misleading: the URL fetches resnet101-v2-7.onnx but the saved name and
subsequent commands use resnet101_Opset17.onnx; update the saved filename and
downstream usage to a consistent, accurate name (e.g., use resnet101-v2-7.onnx
in the curl -o and in the python3 set_batch_size.py command) or add a one-line
clarifying comment above the commands explaining that resnet101_Opset17.onnx is
an alias for resnet101-v2-7.onnx so readers know which model variant is being
used.
🧹 Nitpick comments (3)
examples/qdq_placement/set_batch_size.py (3)
46-48: Consider validating that the model has inputs.If the model has no graph inputs, accessing
graph.input[0]will raise anIndexError. While unlikely for typical models, adding a guard improves robustness.🛡️ Proposed defensive check
# Get the input tensor graph = model.graph + if not graph.input: + raise ValueError(f"Model {model_path} has no graph inputs") input_tensor = graph.input[0]
60-64: Output batch dimension assumption may not hold for all models.This code assumes the first dimension of every output is the batch dimension. While true for ResNet50 and most classification models, some models may have scalar outputs or outputs where batch isn't the first dimension. Consider adding a note in the docstring about this assumption, or making output modification opt-in.
78-84: Use the repository's utility functions for saving and checking the model to handle large files consistently.The codebase provides
save_onnx()andcheck_model()utilities inmodelopt/onnx/utils.pythat handle models larger than 2GB by using external data. Replace the standardonnx.save()(line 80) andonnx.checker.check_model()(line 84) with calls tomodelopt.onnx.utils.save_onnx()andmodelopt.onnx.utils.check_model(). While ResNet50 won't encounter this limitation, using the existing utilities ensures consistency across the codebase and prevents issues when the script is applied to larger models.
f36aa99 to
7257a12
Compare
Code Review: Automated QDQ Placement Example - Part 4.1Thanks for this well-structured documentation PR. I have reviewed it in the context of the larger Automated QDQ feature series (Parts 1-4). Here are my findings: 📋 Context: PR Dependency ChainThis PR is Part 4.1 of a multi-part feature. The dependency chain:
✅ Positive Aspects
|
|
Should we rename the examples folder to |
|
Can we also add an example on how to use region inspect for debugging? Please also include any other features that might help the user with debugging. Thanks. |
There was a problem hiding this comment.
Pull request overview
This PR adds documentation and helper utilities for the QDQ (Quantize/Dequantize) placement optimization feature as part of a larger feature rollout (Part 4.1). The changes prepare the example directory and improve logging capabilities for upcoming autotune functionality.
Changes:
- Enhanced ONNX logging with timestamp support for better traceability
- Added a utility script to convert ONNX models from dynamic to fixed batch size
- Provided comprehensive README with usage examples for the QDQ placement optimization feature
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| modelopt/onnx/logging_config.py | Added timestamp to log formatter for improved traceability |
| examples/qdq_placement/set_batch_size.py | New utility script to set fixed batch size on ONNX models for TensorRT benchmarking |
| examples/qdq_placement/README.md | Comprehensive documentation with prerequisites, usage examples, and advanced features for QDQ placement optimization |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
64dfea7 to
71ad7bb
Compare
71ad7bb to
983dc57
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #841 +/- ##
==========================================
+ Coverage 72.10% 72.12% +0.02%
==========================================
Files 209 209
Lines 23628 23628
==========================================
+ Hits 17036 17042 +6
+ Misses 6592 6586 -6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/ok to test 983dc57 |
|
/ok to test 6057717 |
Signed-off-by: Will Guo <willg@nvidia.com>
Signed-off-by: Will Guo <willg@nvidia.com>
Signed-off-by: Will Guo <willg@nvidia.com>
Signed-off-by: Will Guo <willg@nvidia.com>
Signed-off-by: Will Guo <willg@nvidia.com>
Head branch was pushed to by a user without write access
6057717 to
ee48afc
Compare
|
The failed test is workflow test. This test is unstable and should be waived. |
|
/ok to test ee48afc |
What does this PR do?
Type of change: ?
This change implement a simple example to illustrate the usage of Automated QDQ placement tool.
Overview: ?
Usage
Testing
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit
New Features
Documentation