Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
📝 Walkthrough

Two functions adjusted for improved checkpoint handling.

Changes

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 4 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (4 passed)
🧹 Nitpick comments (2)
modelopt/torch/quantization/model_calib.py (2)
1658-1669: `cache.reset()` may be redundant after forcing `past_key_values=None`; add justification/defensive handling.

In the replay loop you:

- copy `kwargs_input`
- call `cache.reset()` (if present)
- then set `kwargs_input["past_key_values"] = None`

Since the cache object is not passed into the layer after setting to `None`, `reset()` is only useful if you're relying on it for buffer cleanup/releasing resources before dropping the reference. If that's the intent, a brief comment would help future readers.

If `reset()` can throw for some cache implementations, consider guarding it (minor hardening) so calibration doesn't fail unexpectedly on a cache type that exposes `reset` but can't safely run it:

Optional hardening diff

```diff
     cache = kwargs_input["past_key_values"]
-    if cache is not None and hasattr(cache, "reset"):
-        cache.reset()
+    if cache is not None and hasattr(cache, "reset"):
+        try:
+            cache.reset()
+        except Exception:
+            # Best-effort cleanup; we will force past_key_values=None anyway.
+            pass
     kwargs_input["past_key_values"] = None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/quantization/model_calib.py` around lines 1658 - 1669, The loop clears past_key_values by setting kwargs_input["past_key_values"]=None but also calls cache.reset() which is either redundant or risky; either remove the reset() call and add a short comment noting we drop the cache by setting past_key_values to None, or harden it by wrapping cache.reset() in a try/except (logging or ignoring exceptions) and add a comment explaining it's defensive cleanup before dropping the reference; update the replay loop around _inputs, kwargs_input, cache, and the call site m(*args, **kwargs_input) accordingly so behavior is explicit and safe.
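The guarded-reset pattern suggested above can be sketched as a standalone snippet. Note the assumptions: `DummyCache` is a hypothetical stand-in for a real KV-cache implementation, and `clear_past_key_values` is an illustrative helper, not modelopt API; only the `hasattr`/`try`/`except` structure mirrors the review's proposal:

```python
# Sketch of the defensive cache-reset pattern from the review.
# DummyCache is a hypothetical stand-in for a real KV-cache class.

class DummyCache:
    def __init__(self):
        self.entries = ["k0", "v0"]

    def reset(self):
        # A real cache might free buffers here; some implementations may raise.
        self.entries.clear()


def clear_past_key_values(kwargs_input):
    """Best-effort cache cleanup before dropping the reference."""
    cache = kwargs_input.get("past_key_values")
    if cache is not None and hasattr(cache, "reset"):
        try:
            cache.reset()
        except Exception:
            # Best-effort cleanup; past_key_values is forced to None anyway.
            pass
    kwargs_input["past_key_values"] = None
    return kwargs_input


kwargs = clear_past_key_values({"past_key_values": DummyCache()})
print(kwargs["past_key_values"])  # prints: None
```

Either way, the layer call that follows never sees the cache object, so the only observable effect of `reset()` is resource cleanup.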
1646-1653: OK to skip bootstrapping when no layers remain, but consider an early-exit optimization.

With `start_layer >= num_layers`, you set `layer_inputs = None` and the per-layer loop becomes a no-op. That's logically consistent.

Optional: you could short-circuit earlier (after patching/unpatching strategy review) to avoid unnecessary `input_getter._patch_all_layers(...)` work when there's nothing to replay/capture. Not required for correctness, but it can reduce overhead for fully-calibrated restarts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/quantization/model_calib.py` around lines 1646 - 1653, The per-layer loop is a no-op when start_layer >= num_layers but you still call input_getter._patch_all_layers(...) and do bootstrapping work; add an early-exit to skip bootstrapping and patch/unpatch steps when start_layer >= num_layers to avoid unnecessary overhead—check the start_layer >= num_layers condition before calling input_getter._patch_all_layers (and before any bootstrapping/patching logic) and return or bypass the rest of the replay/capture flow so layer_inputs/resumed_inputs/forward_loop work is skipped for fully-calibrated restarts.
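The suggested early exit can be illustrated with a minimal sketch. The function name `replay_layers` and its callback parameters are hypothetical simplifications of the calibration flow described in the review, not the actual modelopt code:

```python
# Sketch of the early-exit guard for fully-calibrated restarts:
# when start_layer >= num_layers, skip patching and replay entirely.

def replay_layers(start_layer, num_layers, patch_all_layers, replay_one):
    if start_layer >= num_layers:
        # Nothing left to replay/capture: avoid the patching overhead.
        return 0
    patch_all_layers()
    replayed = 0
    for idx in range(start_layer, num_layers):
        replay_one(idx)
        replayed += 1
    return replayed


calls = []
done = replay_layers(5, 5, lambda: calls.append("patch"), lambda i: calls.append(i))
assert done == 0 and calls == []  # patching was skipped entirely
```

The guard costs one comparison and saves the patch/unpatch round-trip whenever a checkpoint restart resumes past the last layer.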
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f912c3b9-748a-4790-92cf-c62b7dc38086
📒 Files selected for processing (2)
- examples/llm_ptq/example_utils.py
- modelopt/torch/quantization/model_calib.py
Codecov Report

❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1344 +/- ##
==========================================
- Coverage 75.40% 73.36% -2.04%
==========================================
Files 464 485 +21
Lines 50036 53567 +3531
==========================================
+ Hits 37729 39301 +1572
- Misses 12307 14266 +1959
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
What does this PR do?
Type of change: ?
Usage
# Add a code snippet demonstrating how to use this

Testing
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information
Summary by CodeRabbit
Release Notes