fix: proper XGrammar integration for guided decoding#4726
Open
windreamer wants to merge 1 commit into
Open
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates the PyTorch guided decoding integration to stay compatible with newer xgrammar behavior, focusing on correct vocabulary sizing, termination handling, and stop-token detection.
Changes:
- Expand
vocab_sizetolen(tokenizer)when needed to ensure the guided bitmask covers EOS/special tokens. - Remove
terminate_without_stop_token=Truefromxgr.GrammarMatcherconstruction (stop tokens are now auto-detected by XGrammar). - Add termination checks to avoid calling
fill_next_token_bitmask/accept_tokenon already-terminated matchers.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lmdeploy/pytorch/engine/guided_process.py | Adjusts XGrammar tokenizer/vocab sizing, matcher construction, and adds termination-aware bitmap filling. |
| lmdeploy/pytorch/engine/logits_process.py | Skips accept_token updates for terminated guided processors during logits processing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
89c52f8 to
8d61f88
Compare
8d61f88 to
c7f9502
Compare
- Expand vocab_size to len(tokenizer) to include all token IDs (EOS, special tokens) - Remove terminate_without_stop_token parameter (XGrammar auto-detects stop tokens) - Add is_terminated() check before fill_bitmap and accept_token to handle terminated processors This fix addresses issues with XGrammar integration after the xgrammar upgrade, where some models have vocab_size < len(tokenizer), causing EOS tokens to be out of bitmask range. Cherry-picked from commit 2b118c5 on mtp-guided branch.
c7f9502 to
eddfb59
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily receiving feedbacks. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
Motivation
Due to xgrammar upgrade, lmdeploy unit tests are failing. This PR fixes the XGrammar integration issues to ensure compatibility with the latest xgrammar version.
Modification
This PR makes the following modifications to properly integrate with XGrammar:
1. vocab_size expansion
vocab_sizetolen(tokenizer)to include all token IDs (EOS, special tokens)vocab_size < len(tokenizer), causing EOS tokens to be out of bitmask range2. Remove terminate_without_stop_token parameter
terminate_without_stop_token=Trueparameter from GrammarMatcher initialization3. Add is_terminated checks
is_terminated()method to GuidedDecodingManageris_terminated()check beforefill_bitmap()to prevent operations on terminated processorsis_terminated()check beforeaccept_token()to safely handle terminated processorsBC-breaking (Optional)
None. This change maintains backward compatibility while fixing integration issues with the latest xgrammar version.
Checklist
Note: This fix is cherry-picked from commit 2b118c5 on the mtp-guided branch, adapted for the main branch.