
[Minor] Improve local hessian and mse calibration#976

Open
Fridah-nv wants to merge 2 commits into main from fridah/improve-local-hessian

Conversation


@Fridah-nv Fridah-nv commented Mar 4, 2026

What does this PR do?

Type of change: Bug fix

  • For local-Hessian calibration, use the QDQed activation when collecting Hessian information.
  • Replace MseCalibrator with MaxCalibrator at the end of MSE and local-Hessian calibration so that they can be chained with GPTQ.
  • Update docstrings to reflect that MSE calibration only runs on weight quantizers.
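The local-Hessian change above can be illustrated with a small sketch: a GPTQ-style local Hessian, H += 2 * X^T X, is accumulated from the fake-quantized (QDQ) activation rather than the raw one. The function names and the symmetric 8-bit QDQ here are illustrative, not ModelOpt's actual API.

```python
import torch


def fake_quantize(x: torch.Tensor, amax: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Symmetric quantize-dequantize (QDQ): round to the integer grid, then rescale."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = amax / qmax
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale


def accumulate_hessian(H: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Accumulate the GPTQ-style local Hessian H += 2 * X^T X from activations.

    x has shape (..., in_features); leading dims are flattened into samples.
    """
    x2d = x.reshape(-1, x.shape[-1])
    return H + 2.0 * x2d.t() @ x2d


torch.manual_seed(0)
in_features = 8
H = torch.zeros(in_features, in_features)
x = torch.randn(16, in_features)
# Collect the Hessian from the QDQed activation, as this PR does,
# so the Hessian matches what the quantized layer actually sees.
x_qdq = fake_quantize(x, amax=x.abs().amax())
H = accumulate_hessian(H, x_qdq)
```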

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, using torch.load(..., weights_only=True), avoiding pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other source, did you follow IP policy in CONTRIBUTING.md?: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

  • Refactor

    • Restructured quantizer calibration to apply MSE-based methods exclusively to weights, with activation quantizers consistently using max-calibration values.
  • Documentation

    • Updated configuration documentation to clarify how different calibration methods affect specific quantizer types.

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
@Fridah-nv Fridah-nv requested review from realAsma and sugunav14 March 4, 2026 22:43
@Fridah-nv Fridah-nv self-assigned this Mar 4, 2026
@Fridah-nv Fridah-nv requested a review from a team as a code owner March 4, 2026 22:43

coderabbitai bot commented Mar 4, 2026

📝 Walkthrough

The changes update quantization calibration behavior to apply MSE calibration exclusively to weight quantizers while activation quantizers retain max-calibration amax values. A new utility function replaces MSE calibrators with max calibrators after calibration steps, and docstrings clarify this weight-only calibration scope across MSE and Hessian calibration configurations.

Changes

  • Configuration Documentation (modelopt/torch/quantization/config.py): updated docstrings for MseCalibConfig and LocalHessianCalibConfig to specify that calibration applies only to weight quantizers, with activation quantizers retaining max-calibration amax values. Clarified reconstruction error formulas and algorithm scope.
  • Calibration Logic & Utilities (modelopt/torch/quantization/model_calib.py): added a MaxCalibrator import, introduced a replace_mse_calibrators_with_max() utility that traverses the model and replaces MseCalibrator/NVFP4MSECalibrator instances with MaxCalibrator, and modified mse_calibrate() and local_hessian_calibrate() to invoke the replacement after their calibration steps, ensuring weight-only MSE optimization with preserved activation max values.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

  • Security Anti-Patterns (❌ Error): the code calls torch.load() without explicitly setting weights_only=True or including a justifying comment for weights_only=False, violating SECURITY.md requirements. Resolution: change to torch.load(path, map_location="cpu", weights_only=True), since the hessian state contains only tensor data.
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title accurately describes the main changes in the pull request: improvements to local Hessian and MSE calibration logic, including better docstrings and calibrator replacement behavior.
  • Docstring Coverage (✅ Passed): docstring coverage is 80.00%, which meets the required threshold of 80.00%.
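The resolution for the failed security check amounts to the following pattern; the file name and the saved state here are hypothetical examples, not the PR's actual code.

```python
import os
import tempfile

import torch

# Save a tensors-only state dict (e.g. a cached Hessian), then reload it safely.
state = {"hessian": torch.eye(4)}
path = os.path.join(tempfile.mkdtemp(), "hessian_state.pt")
torch.save(state, path)

# weights_only=True restricts unpickling to tensors and plain containers,
# which is safe here because the saved state contains only tensor data.
loaded = torch.load(path, map_location="cpu", weights_only=True)
```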



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/torch/quantization/model_calib.py`:
- Around line 580-615: The captured_quantized[0] may still be None if
capture_forward is not invoked during _forward_no_local_hessian; add a defensive
guard before calling self.hessian_helper.accumulate_hessian: check
captured_quantized[0] is not None and only then call
accumulate_hessian(captured_quantized[0]); if it is None fall back to the raw
input path (use input.to_local() if available) and then restore
self.input_quantizer.forward to original_forward if set; update the block that
follows out = self._forward_no_local_hessian(...) to perform this None-check and
fallback while preserving restoration of the original_forward.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c0dbb264-f19a-4f5d-a597-2d4860ebdde3

📥 Commits

Reviewing files that changed from the base of the PR and between e8f9687 and 9a51aa7.

📒 Files selected for processing (2)
  • modelopt/torch/quantization/config.py
  • modelopt/torch/quantization/model_calib.py

Comment on lines +580 to +615
# Forward without weight quantization during caching
if LocalHessianHelper.cache_mode:
    self.weight_quantizer.disable()

# Capture quantized input from the forward pass for Hessian collection
captured_quantized = [None]
original_forward = None
if (
    self.hessian_helper.is_enabled
    and hasattr(self, "input_quantizer")
    and self.input_quantizer.is_enabled
):
    original_forward = self.input_quantizer.forward

    def capture_forward(input_tensor):
        quantized = original_forward(input_tensor)
        captured_quantized[0] = (
            quantized.to_local() if hasattr(quantized, "to_local") else quantized
        )
        return quantized

    self.input_quantizer.forward = capture_forward

out = self._forward_no_local_hessian(input, *args, **kwargs)

# Collect Hessian from the quantized input that was used in forward pass
if self.hessian_helper.is_enabled:
    if hasattr(self, "input_quantizer") and self.input_quantizer.is_enabled:
        self.hessian_helper.accumulate_hessian(captured_quantized[0])
        if original_forward is not None:
            self.input_quantizer.forward = original_forward
    else:
        # No input_quantizer, use raw input
        input_local = input.to_local() if hasattr(input, "to_local") else input
        self.hessian_helper.accumulate_hessian(input_local)


⚠️ Potential issue | 🟡 Minor

Consider adding a guard for captured_quantized[0] being None.

The monkey-patching approach to capture quantized inputs is sound and correctly restores the original forward method. However, if capture_forward is never invoked during _forward_no_local_hessian (an unlikely edge case), captured_quantized[0] would remain None, causing accumulate_hessian to fail on input_tensor.reshape().

Consider adding a defensive check:

🛡️ Proposed defensive guard
         # Collect Hessian from the quantized input that was used in forward pass
         if self.hessian_helper.is_enabled:
             if hasattr(self, "input_quantizer") and self.input_quantizer.is_enabled:
-                self.hessian_helper.accumulate_hessian(captured_quantized[0])
+                if captured_quantized[0] is not None:
+                    self.hessian_helper.accumulate_hessian(captured_quantized[0])
                 if original_forward is not None:
                     self.input_quantizer.forward = original_forward
             else:


codecov bot commented Mar 4, 2026

Codecov Report

❌ Patch coverage is 39.28571% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.09%. Comparing base (e8f9687) to head (9a51aa7).

Files with missing lines:
  • modelopt/torch/quantization/model_calib.py (patch 39.28%, 17 lines missing ⚠️)
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #976      +/-   ##
==========================================
- Coverage   72.12%   72.09%   -0.04%     
==========================================
  Files         209      209              
  Lines       23628    23652      +24     
==========================================
+ Hits        17042    17052      +10     
- Misses       6586     6600      +14     

☔ View full report in Codecov by Sentry.


