fix: layerwise calibration backward-compat, recipe split, batch-size guard #1310
Conversation
📝 Walkthrough: adds a Pydantic validation alias so legacy `use_sequential` configs still load.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ❌ 1 failed (warning) | ✅ 5 passed
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #1310      +/-   ##
==========================================
- Coverage   75.69%   75.58%    -0.11%
==========================================
  Files         467      471        +4
  Lines       50334    51052      +718
==========================================
+ Hits        38099    38590      +491
- Misses      12235    12462      +227
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/torch/quantization/config.py`:
- Around line 1246-1250: The migration method _migrate_use_sequential currently
treats explicit layerwise=False as unset by checking "not self.layerwise", which
causes use_sequential to override an explicitly provided False; change the check
to detect whether the field was explicitly set by using '"layerwise" not in
self.model_fields_set' so you only copy use_sequential into layerwise when
layerwise was not provided by the caller, preserving explicit layerwise=False
values.
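A minimal sketch of the check this prompt describes, using a hypothetical stand-in model (the real `_migrate_use_sequential` lives in `QuantizeAlgorithmConfig` in modelopt and differs in detail):

```python
from pydantic import BaseModel, model_validator


class CalibAlgoConfig(BaseModel):
    """Hypothetical stand-in for the calibration-algorithm config."""

    layerwise: bool = False
    use_sequential: bool | None = None  # legacy field name kept for migration

    @model_validator(mode="after")
    def _migrate_use_sequential(self):
        # `model_fields_set` records which fields the caller passed explicitly,
        # so an explicit layerwise=False is preserved instead of being
        # clobbered by a stale use_sequential value.
        if self.use_sequential is not None and "layerwise" not in self.model_fields_set:
            self.layerwise = self.use_sequential
        return self


# Explicit layerwise=False wins over the legacy key...
assert CalibAlgoConfig(layerwise=False, use_sequential=True).layerwise is False
# ...while legacy-only input still migrates.
assert CalibAlgoConfig(use_sequential=True).layerwise is True
```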
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 976f04a8-a415-4ce0-9f53-7ef5c5686d8a
📒 Files selected for processing (2)
- modelopt/torch/quantization/config.py
- tests/unit/torch/quantization/test_config_validation.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/torch/quantization/test_config_validation.py
Force-pushed from 3e9db5c to ad91b29.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/llm_ptq/hf_ptq.py`:
- Around line 967-968: The current is_layerwise check only considers dict
recipes and misses algorithms expressed as lists (QuantizeAlgoCfgType); update
the detection for recipe_algorithm (from
recipe.quantize.model_dump().get("algorithm")) so it treats both dicts and
lists: if it's a dict keep the existing .get("layerwise", False) check, and if
it's a list, scan the list for any dict item where item.get("layerwise", False)
is True (set is_layerwise True if any match). Ensure you reference
recipe_algorithm and is_layerwise when making this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f0df803d-ca1c-4640-90ed-c3c5230f9faf
📒 Files selected for processing (5)
- examples/llm_ptq/hf_ptq.py
- modelopt/torch/quantization/config.py
- modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv.yaml
- modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv_layerwise.yaml
- tests/unit/torch/quantization/test_config_validation.py
✅ Files skipped from review due to trivial changes (1)
- tests/unit/torch/quantization/test_config_validation.py
```python
recipe_algorithm = recipe.quantize.model_dump().get("algorithm") if recipe else None
is_layerwise = isinstance(recipe_algorithm, dict) and recipe_algorithm.get("layerwise", False)
```
you can check whether it is layerwise before converting the recipe to a dict, while it is still a pydantic config, via recipe.algorithm.layerwise
Agreed, we do not need to do model_dump unless we have to.
We have places that do model_dump just to adapt to the existing codebase.
Addressed in f78ac50: the helper now reads recipe.quantize.algorithm directly as a pydantic config (via a ModelOptPTQRecipe branch that recurses into it), with no more .model_dump() round-trip. See examples/llm_ptq/hf_ptq.py:967-974.
cjluo-nv left a comment:
Overall a clean, well-motivated PR with good tests. Two issues worth addressing:
1. Copyright year: The new `nvfp4_experts_only-fp8_kv_layerwise.yaml` uses `Copyright (c) 2024` but the project's `LICENSE_HEADER` says `2026`. Should use the current year.
2. Bare `assert` for runtime validation: The recipe type check in `hf_ptq.py` uses `assert isinstance(recipe, ModelOptPTQRecipe)`. Since `assert` is stripped under `-O`, this should be a proper `ValueError`/`TypeError` raise (see the sketch after this list).
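A minimal sketch of the suggested assert-to-raise pattern; `check_recipe` and the stand-in class are hypothetical, only the shape of the change matches the review comment:

```python
class ModelOptPTQRecipe:  # stand-in for the real recipe class
    pass


def check_recipe(recipe: object) -> ModelOptPTQRecipe:
    # `assert isinstance(...)` is stripped when Python runs with -O/-OO,
    # so an explicit raise is the reliable way to validate at runtime.
    if not isinstance(recipe, ModelOptPTQRecipe):
        raise TypeError(f"Expected a ModelOptPTQRecipe, got {type(recipe).__name__}")
    return recipe
```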
```diff
@@ -0,0 +1,94 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
```
The project's LICENSE_HEADER specifies Copyright (c) 2026. This new file has Copyright (c) 2024. Please update to match the canonical header.
Addressed in f78ac50: bumped the header year to 2026 to match LICENSE_HEADER.
…guard

- config: accept legacy `use_sequential` via AliasChoices on `layerwise` so pre-#1251 PTQ checkpoints load; still serializes as `layerwise`
- recipes: split nvfp4_experts_only-fp8_kv into default (no layerwise) and _layerwise variants
- hf_ptq: auto batch-size detection not supported with layerwise; default to batch_size=1 in that case
- tests: cover alias accept, current-name accept, dump under current name, and extra='forbid' still rejecting unknowns

Signed-off-by: realAsma <akuriparambi@nvidia.com>
…validation

- examples/llm_ptq/hf_ptq.py: replace dict-inspection layerwise detection with a small recursive helper accepting ModelOptPTQRecipe directly, handling list-form QuantizeAlgoCfgType (per coderabbitai, jenchen13).
- examples/llm_ptq/hf_ptq.py: convert recipe-type assert to explicit if/raise TypeError so validation is not stripped under python -O (per cjluo-nv).
- modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv_layerwise.yaml: bump new-file copyright header to 2026 per LICENSE_HEADER (per cjluo-nv).

Signed-off-by: realAsma <akuriparambi@nvidia.com>
Force-pushed from ad91b29 to f78ac50.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/llm_ptq/hf_ptq.py`:
- Around line 968-973: The helper _is_layerwise currently treats dict-form
algorithms as non-layerwise because getattr(obj, "layerwise", False) returns
False for dicts; update _is_layerwise to explicitly handle dicts by checking if
obj is a dict and returning True when obj.get("layerwise") is truthy or when any
of its values (or nested algorithm entries) are layerwise (i.e., recurse into
dict values similar to list handling). Keep the existing branches for
ModelOptPTQRecipe and list, and ensure the final fallback checks dicts before
using getattr to avoid misclassifying dict algorithms and bypassing the
layerwise guard.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7a4ac195-3ed7-4078-ba4d-fadce20d6a46
📒 Files selected for processing (5)
- examples/llm_ptq/hf_ptq.py
- modelopt/torch/quantization/config.py
- modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv.yaml
- modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv_layerwise.yaml
- tests/unit/torch/quantization/test_config_validation.py
🚧 Files skipped from review as they are similar to previous changes (2)
- modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv.yaml
- modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv_layerwise.yaml
```python
def _is_layerwise(obj):
    if isinstance(obj, ModelOptPTQRecipe):
        return _is_layerwise(obj.quantize.algorithm)
    if isinstance(obj, list):
        return any(_is_layerwise(a) for a in obj)
    return bool(getattr(obj, "layerwise", False))
```
Handle dict-form algorithms in _is_layerwise.
At Line 973, getattr(obj, "layerwise", False) makes dict algorithms evaluate as non-layerwise. That can bypass the Line 990-994 guard and fall back to full-model batch probing.
Suggested fix

```diff
 def _is_layerwise(obj):
     if isinstance(obj, ModelOptPTQRecipe):
         return _is_layerwise(obj.quantize.algorithm)
+    if isinstance(obj, dict):
+        if "layerwise" in obj:
+            return bool(obj["layerwise"])
+        if "algorithm" in obj:
+            return _is_layerwise(obj["algorithm"])
+        return False
     if isinstance(obj, list):
         return any(_is_layerwise(a) for a in obj)
     return bool(getattr(obj, "layerwise", False))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_ptq/hf_ptq.py` around lines 968 - 973, The helper _is_layerwise
currently treats dict-form algorithms as non-layerwise because getattr(obj,
"layerwise", False) returns False for dicts; update _is_layerwise to explicitly
handle dicts by checking if obj is a dict and returning True when
obj.get("layerwise") is truthy or when any of its values (or nested algorithm
entries) are layerwise (i.e., recurse into dict values similar to list
handling). Keep the existing branches for ModelOptPTQRecipe and list, and ensure
the final fallback checks dicts before using getattr to avoid misclassifying
dict algorithms and bypassing the layerwise guard.
```diff
@@ -0,0 +1,94 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
```
Okay for now. Can you compose this yaml based on the modelopt_recipes/general/ptq/nvfp4_experts_only-fp8_kv.yaml once the composable recipes PR is merged?
Summary

Follow-up to #1251 (which renamed `use_sequential` → `layerwise`). Three related fixes bundled:

**Backward-compatible config loading.** PTQ checkpoints saved before #1251 (Add layerwise calibration for large models) store the legacy `use_sequential` key in the calibration-algorithm config, so loading them now raises `ValidationError: Extra inputs are not permitted (use_sequential)` because `QuantizeAlgorithmConfig` uses `extra='forbid'`. Accept `use_sequential` as an alias for `layerwise` via `AliasChoices`. The field still serializes as `layerwise`, so round-trips through the current schema are clean.
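A minimal sketch of the alias pattern described above, using a simplified stand-in for `QuantizeAlgorithmConfig` (the class name and surrounding fields here are hypothetical; only the `AliasChoices` mechanism is from the PR):

```python
from pydantic import AliasChoices, BaseModel, ConfigDict, Field


class QuantizeAlgoConfig(BaseModel):
    """Simplified stand-in for the calibration-algorithm config."""

    model_config = ConfigDict(extra="forbid", populate_by_name=True)

    # Accept both the current name and the legacy `use_sequential` key,
    # but always serialize under the current name.
    layerwise: bool = Field(
        default=False,
        validation_alias=AliasChoices("layerwise", "use_sequential"),
    )


# A pre-#1251 checkpoint config loads instead of raising ValidationError.
cfg = QuantizeAlgoConfig.model_validate({"use_sequential": True})
assert cfg.layerwise is True
# Round-trips serialize under the current name only.
assert cfg.model_dump() == {"layerwise": True}
```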
**Recipe split.** `nvfp4_experts_only-fp8_kv` previously enabled layerwise calibration by default, which changes the calibration flow materially. Split into two recipes:
- `nvfp4_experts_only-fp8_kv.yaml` — default (no layerwise)
- `nvfp4_experts_only-fp8_kv_layerwise.yaml` — layerwise variant

**`hf_ptq` batch-size guard.** Auto batch-size detection is not supported together with layerwise calibration. Default to `batch_size=1` when layerwise is enabled and the user hasn't set a batch size explicitly. A sketch of this guard follows below.
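A minimal sketch of the guard logic; `resolve_calib_batch_size` and the "None means auto" convention are hypothetical, and the real flag handling in `hf_ptq.py` may differ:

```python
def resolve_calib_batch_size(requested: int | None, layerwise: bool) -> int:
    """Pick the calibration batch size (hypothetical helper)."""
    if requested is not None:
        return requested  # an explicit user choice always wins
    if layerwise:
        # Auto batch-size detection is not supported with layerwise
        # calibration, so fall back to a safe default of 1.
        return 1
    return _detect_max_batch_size()


def _detect_max_batch_size() -> int:
    # Stand-in for the real auto-detection (e.g. memory-based probing).
    return 8
```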
Originally reported by Jenny Chen while resuming a PTQ checkpoint via `restore_sharded_modelopt_state`.

Test plan

- `tests/unit/torch/quantization/test_config_validation.py` — legacy alias accepted, current name accepted, dump serializes under current name, `extra='forbid'` still rejects unknown keys (see the sketch after this list).
- `pre-commit run` — clean.
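A minimal sketch of tests along these lines, reusing the simplified `QuantizeAlgoConfig` stand-in from the alias sketch above (the real suite targets `QuantizeAlgorithmConfig`):

```python
import pytest
from pydantic import AliasChoices, BaseModel, ConfigDict, Field, ValidationError


class QuantizeAlgoConfig(BaseModel):  # same stand-in as in the alias sketch
    model_config = ConfigDict(extra="forbid")
    layerwise: bool = Field(
        default=False, validation_alias=AliasChoices("layerwise", "use_sequential")
    )


def test_legacy_alias_accepted():
    assert QuantizeAlgoConfig.model_validate({"use_sequential": True}).layerwise is True


def test_current_name_accepted():
    assert QuantizeAlgoConfig.model_validate({"layerwise": True}).layerwise is True


def test_dump_serializes_under_current_name():
    dumped = QuantizeAlgoConfig.model_validate({"use_sequential": True}).model_dump()
    assert dumped == {"layerwise": True}


def test_unknown_keys_still_rejected():
    with pytest.raises(ValidationError):
        QuantizeAlgoConfig.model_validate({"use_sequential": True, "bogus": 1})
```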
Summary by CodeRabbit

- New Features
  - New layerwise variant of the `nvfp4_experts_only-fp8_kv` PTQ recipe.
- Improvements
  - Backward-compatible handling of the legacy field name (`use_sequential`) in quantization configurations.