fix(openai_api_server): use exclude_none for completion stream finish chunks#3872
Open
Chessing234 wants to merge 1 commit into
Open
Conversation
… chunks generate_completion_stream_generator's finish loop carried over the same '...so exclude_none to exclude field "content"' comment as the chat variant but called model_dump_json(exclude_unset=True). The chat counterpart (chat_completion_stream_generator) correctly uses exclude_none=True. Switch the completion path to exclude_none so the final-chunk None fields (e.g. logprobs) are stripped as the comment and the parallel chat-stream path intend.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bug
`generate_completion_stream_generator` (`fastchat/serve/openai_api_server.py`) emits the final SSE chunks for a completion stream with `model_dump_json(exclude_unset=True)`, leaving `None` fields (e.g. `logprobs`) in the payload that the preceding comment explicitly says should be stripped.
Root cause
The function ends with:
```python
There is not "content" field in the last delta message, so exclude_none to exclude field "content".
for finish_chunk in finish_stream_events:
yield f"data: {finish_chunk.model_dump_json(exclude_unset=True)}\n\n"
```
The comment and `exclude_*` keyword disagree. The parallel chat-completion path `chat_completion_stream_generator` carries the identical comment immediately above and correctly uses `exclude_none=True`:
```python
There is not "content" field in the last delta message, so exclude_none to exclude field "content".
for finish_chunk in finish_stream_events:
yield f"data: {finish_chunk.model_dump_json(exclude_none=True)}\n\n"
```
This is a copy-paste asymmetry: the completion-stream version kept `exclude_unset` while the chat-stream version was corrected to `exclude_none`.
Fix
Switch the completion-stream finish loop to `exclude_none=True` so it matches its comment and its chat-stream counterpart. One-line change; per-chunk emission inside the loop body (which legitimately uses `exclude_unset`) is untouched.