Add OpenAI Responses-compatible endpoint #4582

Open · CUHKSZzxy wants to merge 8 commits into InternLM:main from CUHKSZzxy:feat/responses-api-text-v1

Add OpenAI Responses-compatible endpoint#4582
CUHKSZzxy wants to merge 8 commits into
InternLM:mainfrom
CUHKSZzxy:feat/responses-api-text-v1

Conversation

@CUHKSZzxy (Collaborator) commented May 13, 2026

Summary

  • Add a text-first OpenAI Responses-compatible POST /v1/responses endpoint.
  • Support string/message input, instructions/developer-role normalization, function tools, tool choice validation, and Responses SSE events.
  • Add focused tests, Responses API docs, and Codex integration docs.
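As a rough illustration of the normalization the summary describes, the sketch below folds `instructions` into a leading system message and maps the `developer` role to `system`. The function name and exact behavior are assumptions for illustration; the real logic lives in lmdeploy/serve/openai/responses.py and may differ.

```python
# Hypothetical sketch of the request normalization described in this PR;
# the actual helpers in lmdeploy/serve/openai/responses.py may differ.

def normalize_input(input_data, instructions=None):
    """Convert a Responses-style `input` (string or list of messages)
    into chat messages, mapping `developer` -> `system` and prepending
    `instructions` as a system message."""
    if isinstance(input_data, str):
        messages = [{'role': 'user', 'content': input_data}]
    else:
        messages = [{
            'role': 'system' if msg['role'] == 'developer' else msg['role'],
            'content': msg['content'],
        } for msg in input_data]
    if instructions:
        messages.insert(0, {'role': 'system', 'content': instructions})
    return messages
```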

Validation

  • pytest tests/test_lmdeploy/serve/openai/test_responses.py -q (18 passed)
  • git diff --check upstream/main...HEAD
  • Local Codex smoke tests against LMDeploy for no-tool, read, edit, multi-step, and project workflows.
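For reference, a minimal non-streaming request body of the kind those smoke tests would send might look like the following. The model name is a placeholder, and only a few Text V1 fields are shown.

```python
import json

# Example request body for POST /v1/responses (Text V1 subset).
# The model name is a placeholder; use whatever model the server loaded.
payload = {
    'model': 'internlm3-8b-instruct',
    'input': 'Summarize this repository in one sentence.',
    'instructions': 'You are a concise coding assistant.',
    'stream': False,
}
body = json.dumps(payload)
```

With an api_server running locally, this body can be POSTed to the server's `/v1/responses` route.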

Codex Demo

codex_sample

@CUHKSZzxy CUHKSZzxy marked this pull request as ready for review May 13, 2026 03:30
Copilot AI review requested due to automatic review settings May 13, 2026 03:30
Copilot AI (Contributor) left a comment

Pull request overview

This PR adds a text-first, OpenAI Responses API–compatible endpoint (POST /v1/responses) to LMDeploy’s OpenAI server, including request normalization (string/messages/instructions/developer role), function tool mapping/tool-choice validation, and an SSE streaming event surface. It also updates middleware route protection, integrates the new router into api_server, and adds tests + documentation (including Codex integration docs).

Changes:

  • Add lmdeploy/serve/openai/responses.py implementing POST /v1/responses (non-stream + SSE streaming) and related request/response models.
  • Wire the new endpoint into the OpenAI API server and protect it under engine-sleep middleware.
  • Add focused unit tests plus English/Chinese documentation and integration guides (Codex / Claude Code).
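The SSE streaming surface mentioned in those changes can be sketched as follows. The event name follows the OpenAI Responses API streaming convention; the exact subset of events this PR emits is defined in lmdeploy/serve/openai/responses.py, so treat this as an assumption.

```python
import json

# Sketch of the SSE framing used for Responses streaming events.
# Event names follow the OpenAI Responses API; the exact subset emitted
# here is an assumption, not taken from the PR's code.
def sse_event(event_type, data):
    """Serialize one server-sent event in `event:`/`data:` framing."""
    return f'event: {event_type}\ndata: {json.dumps(data)}\n\n'

chunk = sse_event('response.output_text.delta', {'delta': 'Hel'})
```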

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.

Summary per file:

  • tests/test_lmdeploy/serve/openai/test_responses.py: Adds unit coverage for input normalization, tools/tool_choice validation, response shapes, and SSE event shapes.
  • lmdeploy/serve/utils/server_utils.py: Adds /v1/responses to sleeping-engine protected inference routes.
  • lmdeploy/serve/openai/responses.py: Implements the Responses-compatible router, request parsing, tool conversion, non-stream response construction, and SSE streaming events.
  • lmdeploy/serve/openai/api_server.py: Registers the new Responses router on the FastAPI app.
  • docs/zh_cn/llm/api_server.md: Links to the new Responses endpoint documentation.
  • docs/zh_cn/llm/api_server_responses.md: Documents the /v1/responses endpoint (Text V1 subset), tools, SSE events, and Codex setup notes.
  • docs/zh_cn/index.rst: Adds the Responses doc page to the Chinese toctree.
  • docs/en/llm/api_server.md: Links to the new Responses endpoint documentation.
  • docs/en/llm/api_server_responses.md: Documents the /v1/responses endpoint and points to Codex integration docs.
  • docs/en/integration/codex.md: Adds a Codex → LMDeploy /v1/responses integration guide.
  • docs/en/integration/claude_code.md: Adds a Claude Code → LMDeploy /v1/messages integration guide.
  • docs/en/index.rst: Adds the Responses doc page and a new Integrations toctree (Codex/Claude Code).
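The tool_choice validation covered by the tests listed above might look like the following sketch. The function name and error handling are hypothetical; the PR's actual implementation may reject invalid values differently (e.g. with an HTTP 400 response).

```python
# Hypothetical sketch of tool_choice validation against declared
# function tools; the PR's actual implementation may differ.
def validate_tool_choice(tool_choice, tools):
    """Accept 'auto'/'none'/'required' or a reference to a declared
    function tool; reject anything else."""
    if tool_choice in ('auto', 'none', 'required'):
        return tool_choice
    if isinstance(tool_choice, dict) and tool_choice.get('type') == 'function':
        declared = {t.get('name') for t in tools or []}
        if tool_choice.get('name') in declared:
            return tool_choice
    raise ValueError(f'invalid tool_choice: {tool_choice!r}')
```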

Comment threads on lmdeploy/serve/openai/responses.py (3; one outdated)
@lvhan028 added the enhancement (New feature or request) label on May 13, 2026

3 participants