
[Minor] Improve local hessian and mse calibration#976

Open
Fridah-nv wants to merge 2 commits into main from fridah/improve-local-hessian

Conversation


@Fridah-nv Fridah-nv commented Mar 4, 2026

What does this PR do?

Type of change: Bug fix

  • For local-Hessian calibration, use the QDQed activation when collecting Hessian information.
  • Replace MseCalibrator with MaxCalibrator at the end of MSE and local-Hessian calibration so that they can be chained with GPTQ.
  • Update docstrings to reflect that MSE calibration only runs on weight quantizers.
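The local-Hessian change above can be illustrated with a small sketch: a GPTQ-style local Hessian, H += 2 * X^T X, is accumulated from the fake-quantized (QDQ) activation rather than the raw one. The function names and the symmetric 8-bit QDQ here are illustrative, not ModelOpt's actual API.

```python
import torch


def fake_quantize(x: torch.Tensor, amax: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Symmetric quantize-dequantize (QDQ): round to the integer grid, then rescale."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = amax / qmax
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale


def accumulate_hessian(H: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Accumulate the GPTQ-style local Hessian H += 2 * X^T X from activations.

    x has shape (..., in_features); leading dims are flattened into samples.
    """
    x2d = x.reshape(-1, x.shape[-1])
    return H + 2.0 * x2d.t() @ x2d


torch.manual_seed(0)
in_features = 8
H = torch.zeros(in_features, in_features)
x = torch.randn(16, in_features)
# Collect the Hessian from the QDQed activation, as this PR does,
# so the Hessian matches what the quantized layer actually sees.
x_qdq = fake_quantize(x, amax=x.abs().amax())
H = accumulate_hessian(H, x_qdq)
```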

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, using torch.load(..., weights_only=True), avoiding pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other source, did you follow IP policy in CONTRIBUTING.md?: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

  • Refactor

    • Restructured quantizer calibration to apply MSE-based methods exclusively to weights, with activation quantizers consistently using max-calibration values.
  • Documentation

    • Updated configuration documentation to clarify how different calibration methods affect specific quantizer types.

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
@Fridah-nv Fridah-nv requested review from realAsma and sugunav14 March 4, 2026 22:43
@Fridah-nv Fridah-nv self-assigned this Mar 4, 2026
@Fridah-nv Fridah-nv requested a review from a team as a code owner March 4, 2026 22:43

coderabbitai bot commented Mar 4, 2026

📝 Walkthrough

The changes update quantization calibration behavior to apply MSE calibration exclusively to weight quantizers while activation quantizers retain max-calibration amax values. A new utility function replaces MSE calibrators with max calibrators after calibration steps, and docstrings clarify this weight-only calibration scope across MSE and Hessian calibration configurations.

Changes

  • Configuration Documentation (modelopt/torch/quantization/config.py): updated docstrings for MseCalibConfig and LocalHessianCalibConfig to specify that calibration applies only to weight quantizers, with activation quantizers retaining max-calibration amax values. Clarified reconstruction error formulas and algorithm scope.
  • Calibration Logic & Utilities (modelopt/torch/quantization/model_calib.py): added a MaxCalibrator import, introduced a replace_mse_calibrators_with_max() utility that traverses the model and replaces MseCalibrator/NVFP4MSECalibrator instances with MaxCalibrator, and modified mse_calibrate() and local_hessian_calibrate() to invoke the replacement after their calibration steps, ensuring weight-only MSE optimization with preserved activation max values.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

  • Security Anti-Patterns (❌ Error): the code calls torch.load() without explicitly setting weights_only=True or including a justifying comment for weights_only=False, violating SECURITY.md requirements. Resolution: change to torch.load(path, map_location="cpu", weights_only=True), since the hessian state contains only tensor data.
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title accurately describes the main changes in the pull request: improvements to local Hessian and MSE calibration logic, including better docstrings and calibrator replacement behavior.
  • Docstring Coverage (✅ Passed): docstring coverage is 80.00%, which meets the required threshold of 80.00%.
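The resolution for the failed security check amounts to the following pattern; the file name and the saved state here are hypothetical examples, not the PR's actual code.

```python
import os
import tempfile

import torch

# Save a tensors-only state dict (e.g. a cached Hessian), then reload it safely.
state = {"hessian": torch.eye(4)}
path = os.path.join(tempfile.mkdtemp(), "hessian_state.pt")
torch.save(state, path)

# weights_only=True restricts unpickling to tensors and plain containers,
# which is safe here because the saved state contains only tensor data.
loaded = torch.load(path, map_location="cpu", weights_only=True)
```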



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/torch/quantization/model_calib.py`:
- Around line 580-615: The captured_quantized[0] may still be None if
capture_forward is not invoked during _forward_no_local_hessian; add a defensive
guard before calling self.hessian_helper.accumulate_hessian: check
captured_quantized[0] is not None and only then call
accumulate_hessian(captured_quantized[0]); if it is None fall back to the raw
input path (use input.to_local() if available) and then restore
self.input_quantizer.forward to original_forward if set; update the block that
follows out = self._forward_no_local_hessian(...) to perform this None-check and
fallback while preserving restoration of the original_forward.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c0dbb264-f19a-4f5d-a597-2d4860ebdde3

📥 Commits

Reviewing files that changed from the base of the PR and between e8f9687 and 9a51aa7.

📒 Files selected for processing (2)
  • modelopt/torch/quantization/config.py
  • modelopt/torch/quantization/model_calib.py

Comment on lines +580 to +615
# Forward without weight quantization during caching
if LocalHessianHelper.cache_mode:
    self.weight_quantizer.disable()

# Capture quantized input from the forward pass for Hessian collection
captured_quantized = [None]
original_forward = None
if (
    self.hessian_helper.is_enabled
    and hasattr(self, "input_quantizer")
    and self.input_quantizer.is_enabled
):
    original_forward = self.input_quantizer.forward

    def capture_forward(input_tensor):
        quantized = original_forward(input_tensor)
        captured_quantized[0] = (
            quantized.to_local() if hasattr(quantized, "to_local") else quantized
        )
        return quantized

    self.input_quantizer.forward = capture_forward

out = self._forward_no_local_hessian(input, *args, **kwargs)

# Collect Hessian from the quantized input that was used in forward pass
if self.hessian_helper.is_enabled:
    if hasattr(self, "input_quantizer") and self.input_quantizer.is_enabled:
        self.hessian_helper.accumulate_hessian(captured_quantized[0])
        if original_forward is not None:
            self.input_quantizer.forward = original_forward
    else:
        # No input_quantizer, use raw input
        input_local = input.to_local() if hasattr(input, "to_local") else input
        self.hessian_helper.accumulate_hessian(input_local)


⚠️ Potential issue | 🟡 Minor

Consider adding a guard for captured_quantized[0] being None.

The monkey-patching approach to capture quantized inputs is sound and correctly restores the original forward method. However, if capture_forward is never invoked during _forward_no_local_hessian (an unlikely edge case), captured_quantized[0] would remain None, causing accumulate_hessian to fail on input_tensor.reshape().

Consider adding a defensive check:

🛡️ Proposed defensive guard
         # Collect Hessian from the quantized input that was used in forward pass
         if self.hessian_helper.is_enabled:
             if hasattr(self, "input_quantizer") and self.input_quantizer.is_enabled:
-                self.hessian_helper.accumulate_hessian(captured_quantized[0])
+                if captured_quantized[0] is not None:
+                    self.hessian_helper.accumulate_hessian(captured_quantized[0])
                 if original_forward is not None:
                     self.input_quantizer.forward = original_forward
             else:


codecov bot commented Mar 4, 2026

Codecov Report

❌ Patch coverage is 39.28571% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.09%. Comparing base (e8f9687) to head (9a51aa7).

Files with missing lines:
  • modelopt/torch/quantization/model_calib.py (patch 39.28%, 17 lines missing ⚠️)
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #976      +/-   ##
==========================================
- Coverage   72.12%   72.09%   -0.04%     
==========================================
  Files         209      209              
  Lines       23628    23652      +24     
==========================================
+ Hits        17042    17052      +10     
- Misses       6586     6600      +14     

☔ View full report in Codecov by Sentry.


