Sequential calibrate refactor by sugunav14 · Pull Request #982 · NVIDIA/Model-Optimizer

sugunav14 · 2026-03-05T19:08:05Z

What does this PR do?

Type of change: New feature

The current sequential calibration support has O(N^2) complexity for collecting the updated activations of a decoder layer. To solve this, we adopted a modular/plugin-based approach that uses hooks to capture the updated activations by running forward on the previous decoder layer with that layer's cached activations. This causes a problem with nested modules: logic in the parent module may need to be replicated in the lower-level modules to ensure equivalence. For example, in the Nemotron model, the parent module NemotronHModel contains logic to create and select the appropriate mask based on the decoder layer type (Mamba vs. attention).
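To make the cost difference concrete, here is a rough count of decoder-layer forwards per calibration batch. It assumes the quadratic cost comes from replaying every preceding layer to reach layer k, which the description does not spell out; the function names are illustrative only and not part of the PR.

```python
def layer_forwards_full_replay(num_layers: int) -> int:
    """Layer forwards if reaching layer k means recomputing layers 0..k-1 (assumed)."""
    # Layer k costs k forwards, so the total is 0 + 1 + ... + (N - 1) = N(N - 1) / 2.
    return sum(range(num_layers))


def layer_forwards_cached(num_layers: int) -> int:
    """Layer forwards if only layer k-1 is recomputed from its cached inputs."""
    # Every layer except the first needs exactly one forward of its predecessor.
    return max(num_layers - 1, 0)


if __name__ == "__main__":
    for n in (8, 32, 64):
        print(n, layer_forwards_full_replay(n), layer_forwards_cached(n))
```

Under these assumptions a 32-layer model needs 496 layer forwards per batch with full replay and 31 with cached inputs.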

This PR implements a more generic solution for sequential calibration: activations are collected through the model forward, so all of the parent module logic is preserved. We use a "state" attribute on the modules to indicate whether a layer should be recomputed or skipped while running the module forward. This avoids redundant computation when collecting updated activations.

The overall flow is as follows:

The user must register a get_decoder_layers() function that returns a list of layers to be calibrated sequentially.
LayerActivationCollector goes through the list of layers and patches each module's forward with a "state-aware" module forward.
When model.forward() is called, all the parent logic is recomputed as expected (embeddings, residual connections, attention mask generation, etc.).
Let's say we are currently calibrating layer N and want its updated input activations: we set layer N to capture and layer N-1 to run (because that layer was processed previously and its updated activations need to be generated). Layers processed before that are set to skip. When model.forward() is called, the computations of those earlier decoder layers are skipped. Layer N-1 uses its cached inputs to generate new activations. Layer N's inputs are captured using the same logic as before and cached so that they can be used to get the updated activations for layer N+1.
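The flow above can be sketched with a toy model in plain Python. This is a minimal illustration under stated assumptions, not the PR's implementation: the `State` enum, the `StopForward` early exit, the pass-through output of skipped layers, and the single cached batch are all hypothetical choices made for the example.

```python
from enum import Enum


class State(Enum):
    SKIP = "skip"        # no real computation for this layer
    RUN = "run"          # recompute the output from cached inputs
    CAPTURE = "capture"  # record the incoming input, then stop


class StopForward(Exception):
    """Hypothetical early exit raised once the target layer's input is captured."""


class Layer:
    """Toy decoder layer that adds a weight to its input."""

    def __init__(self, weight):
        self.weight = weight
        self.state = State.SKIP
        self.cached_input = None
        self.calls = 0  # number of real computations, for illustration

    def compute(self, x):
        self.calls += 1
        return x + self.weight

    def forward(self, x):
        if self.state is State.SKIP:
            return x  # placeholder output (assumption); the next RUN layer ignores it
        if self.state is State.RUN:
            return self.compute(self.cached_input)  # use the cache, not x
        self.cached_input = x  # CAPTURE
        raise StopForward


class Model:
    def __init__(self, layers):
        self.layers = layers

    def forward(self, x):
        x = x * 2  # stands in for parent logic (embeddings, masks, ...)
        for layer in self.layers:
            x = layer.forward(x)
        return x


def capture_inputs(model, n, batch):
    """Return the input of layer n, recomputing only layer n-1."""
    for i, layer in enumerate(model.layers):
        if i == n:
            layer.state = State.CAPTURE
        elif i == n - 1:
            layer.state = State.RUN
        else:
            layer.state = State.SKIP
    try:
        model.forward(batch)
    except StopForward:
        pass
    return model.layers[n].cached_input


def demo():
    model = Model([Layer(1), Layer(10), Layer(100)])
    captured = []
    for n, layer in enumerate(model.layers):
        captured.append(capture_inputs(model, n, batch=3))
        layer.weight *= 2  # stands in for calibrating layer n
    return captured, [layer.calls for layer in model.layers]
```

Calling `demo()` returns the captured inputs `[6, 8, 28]` and the per-layer computation counts `[1, 1, 0]`: each layer input reflects the updated weights of its predecessor, and each processed layer is recomputed only once.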

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, using torch.load(..., weights_only=True), avoiding pickle, etc.).

Is this change backward compatible?: ✅ / ❌ / N/A
If you copied code from any other source, did you follow IP policy in CONTRIBUTING.md?: ✅ / ❌ / N/A
Did you write any new necessary tests?: ✅ / ❌ / N/A
Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

copy-pr-bot · 2026-03-05T19:08:09Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-03-05T19:09:20Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8a34c9cc-e806-47b8-b270-847ea081a63c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

sugunav14 and others added 6 commits March 5, 2026 21:20

sequential flow

fbbf71f

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>

clean up

92d227e

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>

Modular/Plugin based sequential calib

f1cd26f

Signed-off-by: realAsma <akuriparambi@nvidia.com>

update

4e4d7c3

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>

initial e2e tested sequential calibrate refactor

d822d92

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>

added meta data caching

7f72422

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>

sugunav14 force-pushed the svelury/sequential-calibrate-refactor branch from 13b9033 to 7f72422 Compare March 5, 2026 21:30

sugunav14 added 2 commits March 5, 2026 22:57

added logging and unit tests

2f72d6d

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>

removed stray changes

96ccdad

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>

sugunav14 wants to merge 8 commits into main from svelury/sequential-calibrate-refactor

2 participants
