fix: proper XGrammar integration for guided decoding by windreamer · Pull Request #4726 · InternLM/lmdeploy

windreamer · 2026-07-02T07:09:50Z

Thanks for your contribution; we appreciate it a lot. Following the instructions below will make your pull request healthier and easier to review. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

Motivation

After the xgrammar upgrade, lmdeploy's unit tests are failing. This PR fixes the XGrammar integration so that guided decoding is compatible with the latest xgrammar version.

Modification

This PR makes the following modifications to properly integrate with XGrammar:

1. vocab_size expansion

Expand vocab_size to len(tokenizer) to include all token IDs (EOS, special tokens)
Some models have vocab_size < len(tokenizer), causing EOS tokens to be out of bitmask range
Added logic to detect and expand vocab_size when necessary
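The sizing rule above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names (`effective_vocab_size`, `bitmask_words`, `token_in_range`) and the example numbers are hypothetical, and the 32-tokens-per-word packing assumes XGrammar's int32 token bitmask layout.

```python
import math

BITS_PER_WORD = 32  # assumption: the token mask is packed into int32 words


def effective_vocab_size(model_vocab_size: int, tokenizer_len: int) -> int:
    """Return a vocab size that covers every token ID the tokenizer can emit."""
    # Expand only when the tokenizer knows more IDs than the model config reports.
    return max(model_vocab_size, tokenizer_len)


def bitmask_words(vocab_size: int) -> int:
    """Number of 32-bit words needed for one row of the token bitmask."""
    return math.ceil(vocab_size / BITS_PER_WORD)


def token_in_range(token_id: int, vocab_size: int) -> bool:
    """True if the token ID has a corresponding bit in the mask."""
    return 0 <= token_id < vocab_size


# Hypothetical model: config reports 32000 entries, tokenizer adds 3 special tokens.
model_vocab_size = 32000
tokenizer_len = 32003
eos_token_id = 32002

expanded = effective_vocab_size(model_vocab_size, tokenizer_len)
print(token_in_range(eos_token_id, model_vocab_size))  # EOS not covered before expansion
print(token_in_range(eos_token_id, expanded))          # EOS covered after expansion
print(bitmask_words(model_vocab_size), bitmask_words(expanded))
```

With the unexpanded size, the EOS token has no bit in the mask, so the grammar can never allow it; taking the larger of the two sizes adds one extra word per row and brings it into range.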

2. Remove terminate_without_stop_token parameter

Removed terminate_without_stop_token=True parameter from GrammarMatcher initialization
XGrammar now automatically detects stop tokens from the tokenizer

3. Add is_terminated checks

Added is_terminated() method to GuidedDecodingManager
Added is_terminated() check before fill_bitmap() to prevent operations on terminated processors
Added is_terminated() check before accept_token() to safely handle terminated processors
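The guard pattern described in these items might look like the sketch below. `StubMatcher` is a hypothetical stand-in, not the real xgrammar matcher and not lmdeploy's `GuidedDecodingManager`; it only borrows the method names `is_terminated`, `accept_token`, and `fill_next_token_bitmask`. Its choice to raise after termination is an assumption made to show why the guards are useful.

```python
class StubMatcher:
    """Hypothetical stand-in for a grammar matcher (not the xgrammar class)."""

    def __init__(self, stop_token_id: int):
        self.stop_token_id = stop_token_id
        self._terminated = False
        self.accepted = []

    def is_terminated(self) -> bool:
        return self._terminated

    def accept_token(self, token_id: int) -> bool:
        # Assumption for illustration: using a finished matcher is an error.
        if self._terminated:
            raise RuntimeError("matcher already terminated")
        self.accepted.append(token_id)
        if token_id == self.stop_token_id:
            self._terminated = True
        return True

    def fill_next_token_bitmask(self, bitmask: list, index: int) -> None:
        if self._terminated:
            raise RuntimeError("matcher already terminated")
        bitmask[index] = -1  # all bits set: every token allowed (stub behaviour)


def safe_fill(matcher: StubMatcher, bitmask: list, index: int) -> bool:
    """Fill the bitmask row only if the matcher is still active."""
    if matcher.is_terminated():
        return False
    matcher.fill_next_token_bitmask(bitmask, index)
    return True


def safe_accept(matcher: StubMatcher, token_id: int) -> bool:
    """Advance the matcher only if it is still active."""
    if matcher.is_terminated():
        return False
    return matcher.accept_token(token_id)


matcher = StubMatcher(stop_token_id=2)
bitmask = [0]
safe_accept(matcher, 5)   # ordinary token, matcher stays active
safe_accept(matcher, 2)   # stop token, matcher terminates
print(matcher.is_terminated())
print(safe_accept(matcher, 7), safe_fill(matcher, bitmask, 0))  # both skipped
```

Returning a flag instead of raising lets a batched caller keep processing the other sequences in the batch when one of them has already finished.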

BC-breaking (Optional)

None. This change maintains backward compatibility while fixing integration issues with the latest xgrammar version.

Checklist

Pre-commit or other linting tools are used to fix the potential lint issues. ✓
The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. (Tests exist in test_grammar.py)
If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
The documentation has been modified accordingly, like docstring or example tutorials.

Note: This fix is cherry-picked from commit 2b118c5 on the mtp-guided branch, adapted for the main branch.

Copilot

Pull request overview

This PR updates the PyTorch guided decoding integration to stay compatible with newer xgrammar behavior, focusing on correct vocabulary sizing, termination handling, and stop-token detection.

Changes:

Expand vocab_size to len(tokenizer) when needed to ensure the guided bitmask covers EOS/special tokens.
Remove terminate_without_stop_token=True from xgr.GrammarMatcher construction (stop tokens are now auto-detected by XGrammar).
Add termination checks to avoid calling fill_next_token_bitmask / accept_token on already-terminated matchers.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| lmdeploy/pytorch/engine/guided_process.py | Adjusts XGrammar tokenizer/vocab sizing, matcher construction, and adds termination-aware bitmap filling. |
| lmdeploy/pytorch/engine/logits_process.py | Skips `accept_token` updates for terminated guided processors during logits processing. |

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

- Expand vocab_size to len(tokenizer) to include all token IDs (EOS, special tokens)
- Remove terminate_without_stop_token parameter (XGrammar auto-detects stop tokens)
- Add is_terminated() check before fill_bitmap and accept_token to handle terminated processors

This fix addresses issues with XGrammar integration after the xgrammar upgrade, where some models have vocab_size < len(tokenizer), causing EOS tokens to be out of bitmask range.

Cherry-picked from commit 2b118c5 on mtp-guided branch.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Copilot AI review requested due to automatic review settings July 2, 2026 07:09

Copilot started reviewing on behalf of windreamer July 2, 2026 07:10

Copilot AI reviewed Jul 2, 2026

Comment thread lmdeploy/pytorch/engine/guided_process.py Outdated

windreamer force-pushed the fix-xgrammar-integration branch from 89c52f8 to 8d61f88 Compare July 2, 2026 07:51

windreamer requested a review from Copilot July 2, 2026 08:19

Copilot started reviewing on behalf of windreamer July 2, 2026 08:20

Copilot AI reviewed Jul 2, 2026

Comment thread lmdeploy/pytorch/engine/logits_process.py

Comment thread lmdeploy/pytorch/engine/guided_process.py

Comment thread lmdeploy/pytorch/engine/guided_process.py

windreamer force-pushed the fix-xgrammar-integration branch from 8d61f88 to c7f9502 Compare July 2, 2026 08:43

windreamer force-pushed the fix-xgrammar-integration branch from c7f9502 to eddfb59 Compare July 2, 2026 08:46

windreamer requested a review from Copilot July 2, 2026 08:47

Copilot started reviewing on behalf of windreamer July 2, 2026 08:47

Copilot AI reviewed Jul 2, 2026

Comment thread lmdeploy/pytorch/engine/logits_process.py

windreamer wants to merge 1 commit into InternLM:main from windreamer:fix-xgrammar-integration

2 participants
