
Glm5 support #985

Draft

Aphoh wants to merge 3 commits into NVIDIA:main from Aphoh:glm5-support

Conversation


@Aphoh Aphoh commented Mar 5, 2026

What does this PR do?

Type of change: ?

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, using torch.load(..., weights_only=True), avoiding pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other source, did you follow IP policy in CONTRIBUTING.md?: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Aphoh and others added 3 commits February 11, 2026 15:22
- Make DeepSeek-specific imports lazy (only loaded for --model_type deepseek)
- Add load_hf_model() using device_map="auto" for single-node multi-GPU
- Add dynamic MoE class discovery and registration for calibration
- Add HF layer name patterns for quant config (q_a_proj, kv_a_proj, etc.)
- Disable GLM-5 indexer/MTP layers from quantization
- Make dist calls conditional for non-distributed HF path
- Add --model_type flag to ptq.py, quantize_to_nvfp4.py, and shell script
- Skip key remapping in quantize_to_nvfp4.py for HF models
- Guard quantization_config removal for bf16 checkpoints
- Add run_glm5_ptq.sh launch script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
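
The "dynamic MoE class discovery" bullet above could look roughly like the following sketch. This is a minimal illustration, not the PR's actual code: the class names (`GlmMoE`, `Model`) and the helper `discover_moe_classes` are hypothetical stand-ins for walking a model tree and collecting classes whose name matches an MoE pattern so they can be registered for calibration.

```python
# Hypothetical sketch of MoE class discovery by class-name pattern.
# In the real PR this would walk an nn.Module tree; plain objects are
# used here so the idea stands alone.

class Expert:
    pass

class GlmMoE:                      # name matches the "MoE" pattern
    def __init__(self):
        self.experts = [Expert(), Expert()]

class Model:
    def __init__(self):
        self.layers = [GlmMoE()]

def discover_moe_classes(root, pattern="MoE"):
    """Walk an object tree and collect classes whose name contains `pattern`."""
    found, stack, seen = {}, [root], set()
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if pattern in type(obj).__name__:
            found[type(obj).__name__] = type(obj)
        for value in getattr(obj, "__dict__", {}).values():
            if isinstance(value, list):
                stack.extend(value)
            else:
                stack.append(value)
    return found

print(sorted(discover_moe_classes(Model())))  # ['GlmMoE']
```

Discovering classes by name pattern rather than importing them directly is what lets the DeepSeek-specific imports stay lazy: nothing model-specific needs to be imported until the matching class is actually found in the loaded checkpoint.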
Fix _remap_key to use component-level matching; add kernel.py stubs, an
MTP head extraction script, and GLM-5 documentation.
- Fix TokenizersBackend compatibility in dataset_utils.py: transformers 5.x
  TokenizersBackend lacks batch_encode_plus, so add a fallback that calls
  _encode_plus per sample with manual padding support (left/right)
- Fix quantize_to_nvfp4.py: skip quantization for layers not listed in
  per_layer_quant_config when using MIXED_PRECISION mode
- Add GLM MLA and DSA Indexer exclusions to hf_ptq build_quant_cfg
- Improve run_glm5_ptq.sh CLI: add --amax-path and --mla-quant flags
- Add glm5 dequant_nvfp4.py utility
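
The manual-padding fallback in the first bullet could be sketched as below. This is an assumption-laden illustration, not the PR's dataset_utils.py code: the helper name `pad_batch` is hypothetical, and it only shows the padding step that would follow per-sample `_encode_plus` calls.

```python
# Hypothetical sketch: pad per-sample token-id lists to a common length,
# supporting both left and right padding sides, as a fallback when a
# tokenizer backend has no batch_encode_plus.

def pad_batch(encoded, pad_id=0, side="right"):
    """Pad variable-length token-id lists and build attention masks."""
    max_len = max(len(ids) for ids in encoded)
    padded, masks = [], []
    for ids in encoded:
        pad = [pad_id] * (max_len - len(ids))
        mask = [1] * len(ids)
        if side == "left":
            padded.append(pad + ids)
            masks.append([0] * len(pad) + mask)
        else:
            padded.append(ids + pad)
            masks.append(mask + [0] * len(pad))
    return {"input_ids": padded, "attention_mask": masks}

batch = pad_batch([[1, 2, 3], [4]], side="left")
print(batch["input_ids"])       # [[1, 2, 3], [0, 0, 4]]
print(batch["attention_mask"])  # [[1, 1, 1], [0, 0, 1]]
```

Left padding matters for decoder-only calibration because generation reads from the end of the sequence; right padding is the usual default for classification-style batching.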

copy-pr-bot bot commented Mar 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Mar 5, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 00656bee-d24b-462b-a5cb-59366bfb35ad

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

