Add OpenAI Responses-compatible endpoint (#4582)
Open
CUHKSZzxy wants to merge 8 commits into
Conversation
Pull request overview
This PR adds a text-first, OpenAI Responses API–compatible endpoint (POST /v1/responses) to LMDeploy’s OpenAI server, including request normalization (string/messages/instructions/developer role), function tool mapping/tool-choice validation, and an SSE streaming event surface. It also updates middleware route protection, integrates the new router into api_server, and adds tests + documentation (including Codex integration docs).
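As a rough illustration of the request normalization described above, the string/messages/instructions/developer handling might look like the following (hypothetical helper names, not the PR's actual code):

```python
# Hypothetical sketch of Responses-request normalization; function and field
# names are assumptions, not LMDeploy's actual implementation.

def normalize_input(input, instructions=None):
    """Normalize a Responses-style `input` into chat-style messages.

    `input` may be a plain string or a list of message dicts; an optional
    `instructions` string is prepended as a system message, and the
    Responses-specific `developer` role is folded into `system`.
    """
    if isinstance(input, str):
        messages = [{'role': 'user', 'content': input}]
    else:
        messages = [dict(m) for m in input]
    # Map the `developer` role onto `system` for the underlying chat engine
    for m in messages:
        if m.get('role') == 'developer':
            m['role'] = 'system'
    if instructions:
        messages.insert(0, {'role': 'system', 'content': instructions})
    return messages
```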
Changes:
- Add `lmdeploy/serve/openai/responses.py` implementing `POST /v1/responses` (non-stream + SSE streaming) and related request/response models.
- Wire the new endpoint into the OpenAI API server and protect it under the engine-sleep middleware.
- Add focused unit tests plus English/Chinese documentation and integration guides (Codex / Claude Code).
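The tool mapping and tool-choice validation in the list above could be sketched roughly as follows (a minimal illustration, assuming the Responses-style flat function-tool shape; not the PR's actual code, and it handles only string-valued `tool_choice`):

```python
# Hedged sketch: convert Responses-style flat function tools into the nested
# chat-completions shape, rejecting unsupported tool types and tool_choice
# values. Names and accepted values are assumptions for illustration.

def convert_tools(tools, tool_choice='auto'):
    """Convert flat function tools to chat-completions `function` tools."""
    if tool_choice not in ('auto', 'none', 'required'):
        raise ValueError(f'unsupported tool_choice: {tool_choice!r}')
    converted = []
    for t in tools or []:
        if t.get('type') != 'function':
            raise ValueError(f"unsupported tool type: {t.get('type')!r}")
        # The Responses API places name/description/parameters at the top
        # level; chat completions nests them under a `function` key.
        converted.append({
            'type': 'function',
            'function': {
                'name': t['name'],
                'description': t.get('description', ''),
                'parameters': t.get('parameters', {}),
            },
        })
    return converted
```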
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/test_lmdeploy/serve/openai/test_responses.py | Adds unit coverage for input normalization, tools/tool_choice validation, response shapes, and SSE event shapes. |
| lmdeploy/serve/utils/server_utils.py | Adds /v1/responses to sleeping-engine protected inference routes. |
| lmdeploy/serve/openai/responses.py | Implements the Responses-compatible router, request parsing, tool conversion, non-stream response construction, and SSE streaming events. |
| lmdeploy/serve/openai/api_server.py | Registers the new Responses router on the FastAPI app. |
| docs/zh_cn/llm/api_server.md | Links to the new Responses endpoint documentation. |
| docs/zh_cn/llm/api_server_responses.md | Documents the /v1/responses endpoint (Text V1 subset), tools, SSE events, and Codex setup notes. |
| docs/zh_cn/index.rst | Adds the Responses doc page to the Chinese toctree. |
| docs/en/llm/api_server.md | Links to the new Responses endpoint documentation. |
| docs/en/llm/api_server_responses.md | Documents the /v1/responses endpoint and points to Codex integration docs. |
| docs/en/integration/codex.md | Adds a Codex → LMDeploy /v1/responses integration guide. |
| docs/en/integration/claude_code.md | Adds a Claude Code → LMDeploy /v1/messages integration guide. |
| docs/en/index.rst | Adds the Responses doc page and a new Integrations toctree (Codex/Claude Code). |
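The SSE streaming event surface referenced in the table could be shaped along these lines (an illustrative sketch: the event names follow the public OpenAI Responses streaming convention, and the helper names are assumptions rather than LMDeploy's verified wire format):

```python
import json

# Sketch of SSE event formatting for a streamed text response. Event names
# (response.created / response.output_text.delta / response.completed) follow
# the OpenAI Responses streaming convention; payload fields are simplified.

def sse_event(event_type, data):
    """Format one server-sent event as an `event:`/`data:` block."""
    return f'event: {event_type}\ndata: {json.dumps(data)}\n\n'

def stream_text(chunks):
    """Yield SSE events bracketing the streamed text deltas."""
    yield sse_event('response.created', {'type': 'response.created'})
    for delta in chunks:
        yield sse_event('response.output_text.delta', {'delta': delta})
    yield sse_event('response.completed', {'type': 'response.completed'})
```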
Summary
- `POST /v1/responses` endpoint.

Validation
- `pytest tests/test_lmdeploy/serve/openai/test_responses.py -q` (18 passed)
- `git diff --check upstream/main...HEAD`

Codex Demo