feat(cerebras, xai): expose reasoning_format for parity with reasoning_effort by anshulkulhari7 · Pull Request #6106 · livekit/agents

anshulkulhari7 · 2026-06-15T09:58:56Z

Summary

Closes #5989.

Reasoning models such as gpt-oss-120b served by Cerebras or xAI (Grok) stream their thinking tokens as part of the chat-completions response. Today there is no way to tell those providers to keep that internal monologue out of the message content, so the TTS pipeline ends up reading the model's raw reasoning aloud.

Both providers support a reasoning_format request field ("parsed", "raw", "hidden") that controls where reasoning ends up. This PR exposes it on the relevant LLM constructors, mirroring exactly how reasoning_effort is already plumbed.

Changes

livekit-plugins-openai (llm.py)
- Add a ReasoningFormat = Literal["parsed", "raw", "hidden"] type.
- Add reasoning_format to _LLMOptions, the LLM.__init__ signature, and the LLM.with_x_ai(...) factory (xAI/Grok chat completions path).
- In chat(), forward it to the request. Because reasoning_format is a provider-specific body field and not an OpenAI SDK keyword argument (passing it top-level raises TypeError: unexpected keyword argument), it is merged into extra_body — the same mechanism already used for OpenRouter-specific body fields. The user-supplied extra_body dict is never mutated in place.
livekit-plugins-cerebras (llm.py)
- Add reasoning_format to the Cerebras LLM.__init__ and forward it to the base class.

The change is additive and backward-compatible: when reasoning_format is not set, nothing is added to the request.

How `reasoning_format` reaches the API

reasoning_effort is a named param in the OpenAI Python SDK (AsyncCompletions.create), so it is passed top-level. reasoning_format is not, so it must travel through extra_body. Verified empirically against openai==2.40.0:

top-level reasoning_format: TypeError -> AsyncCompletions.create() got an unexpected keyword argument 'reasoning_format'
extra_body call body has reasoning_format: True

Accepted values follow the Cerebras reasoning docs (parsed / raw / hidden); xAI/Grok uses the same field.

Testing

Added tests/test_plugin_reasoning_format.py (unit, no live keys — mocks nothing beyond constructing the LLM and inspecting the request kwargs the stream would send):

reasoning_format set on Cerebras LLM lands in extra_body of the outgoing request.
reasoning_format set via LLM.with_x_ai(...) lands in extra_body.
When unset, no reasoning_format is added.

$ uv run pytest tests/test_plugin_reasoning_format.py --unit
tests/test_plugin_reasoning_format.py::test_cerebras_reasoning_format_in_request PASSED
tests/test_plugin_reasoning_format.py::test_cerebras_reasoning_format_omitted_by_default PASSED
tests/test_plugin_reasoning_format.py::test_xai_reasoning_format_in_request PASSED
3 passed

Quality gates on the changed files:

uv run ruff check — All checks passed
uv run ruff format --check — already formatted
uv run mypy -p livekit.plugins.openai -p livekit.plugins.cerebras (strict) — Success: no issues found
Full uv run pytest --unit gate: no new failures (the pre-existing harness errors in tests/concurrency.py reproduce identically on a clean upstream/main checkout).

AI-assisted: this change was prepared with AI assistance and reviewed by the author.

…g_effort Reasoning models such as gpt-oss-120b served by Cerebras or xAI (Grok) stream their thinking tokens as part of the response. Without a way to suppress them, the TTS pipeline reads the model's internal monologue aloud. Add a reasoning_format parameter to the OpenAI-compatible LLM, the Cerebras LLM, and LLM.with_x_ai, mirroring how reasoning_effort is plumbed. Because reasoning_format is a provider-specific body field (not an OpenAI SDK argument), it is forwarded via extra_body. Accepted values are 'parsed', 'raw', and 'hidden'. Closes livekit#5989

The with_cerebras() factory accepted reasoning_effort but not reasoning_format, so users could not pass it even though the LLM docstring lists Cerebras (gpt-oss-120b) as a supported provider. Add the parameter to the signature and forward it to LLM(), matching with_x_ai() and the standalone Cerebras plugin.

devin-ai-integration

Devin Review found 1 new potential issue.

devin-ai-integration · 2026-06-15T14:09:07Z

🚩 Caller-provided extra_body in extra_kwargs is silently overridden by opts.extra_body

Pre-existing behavior: if a caller passes extra_kwargs={"extra_body": {...}} to chat() AND the LLM was constructed with an extra_body option, lines 975-980 first apply extra_kwargs then unconditionally overwrite extra["extra_body"] with self._opts.extra_body. This means the caller's extra_body is silently lost. This is not introduced by this PR (it's pre-existing), but the new reasoning_format feature makes it more likely users will interact with extra_body indirectly. Currently no callers in the codebase appear to hit this conflict, but it could surprise external users.

(Refers to lines 975-980)

Was this helpful? React with 👍 or 👎 to provide feedback.

davidzhao · 2026-06-15T16:20:05Z

        ``api_key`` must be set to your OpenAI API key, either using the argument or by setting the
        ``OPENAI_API_KEY`` environmental variable.
+
+        ``reasoning_format`` controls how reasoning models (e.g. ``gpt-oss-120b`` served by


nit: reword the description. it makes it sound like gpt-oss-120b can be served by xAI

Reworded in 258fde2 — now reads "gpt-oss-120b on Cerebras, or Grok on xAI", so it no longer implies xAI serves gpt-oss-120b.

davidzhao · 2026-06-15T16:20:45Z

        ``OPENAI_API_KEY`` environmental variable.
+
+        ``reasoning_format`` controls how reasoning models (e.g. ``gpt-oss-120b`` served by
+        Cerebras or xAI) return their thinking tokens. Set it to ``"hidden"`` or ``"parsed"`` to


how do we handle parsed? if we are parsing it, there should be a way to expose it to the end user.

Good catch — right now the plugin doesn't surface the parsed reasoning. ChoiceDelta only has role/content/tool_calls/extra (no dedicated reasoning field), and the OpenAI stream handler doesn't read a reasoning/reasoning_content field, so with reasoning_format="parsed" the separated reasoning would currently be dropped.

My instinct is to surface it through delta.extra (the same provider-extra channel already used for xAI encrypted reasoning) rather than adding a new field — but how would you prefer reasoning to be exposed to the end user? Happy to wire it up that way in this PR, or drop "parsed" here and do the exposure as a focused follow-up if you'd rather keep this one to the request-param plumbing.

…b on Cerebras, Grok on xAI)

anshulkulhari7 requested a review from a team as a code owner June 15, 2026 09:58

This comment was marked as resolved.

Sign in to view

devin-ai-integration Bot reviewed Jun 15, 2026

View reviewed changes

davidzhao reviewed Jun 15, 2026

View reviewed changes

docs(openai): clarify reasoning_format provider examples (gpt-oss-120…

258fde2

…b on Cerebras, Grok on xAI)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(cerebras, xai): expose reasoning_format for parity with reasoning_effort#6106

feat(cerebras, xai): expose reasoning_format for parity with reasoning_effort#6106
anshulkulhari7 wants to merge 3 commits into
livekit:mainfrom
anshulkulhari7:feat/5989-reasoning-format

anshulkulhari7 commented Jun 15, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

devin-ai-integration Bot Jun 15, 2026

Uh oh!

davidzhao Jun 15, 2026

Uh oh!

anshulkulhari7 Jun 16, 2026

Uh oh!

davidzhao Jun 15, 2026

Uh oh!

anshulkulhari7 Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

anshulkulhari7 commented Jun 15, 2026

Summary

Changes

How reasoning_format reaches the API

Testing

Uh oh!

This comment was marked as resolved.

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

davidzhao Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

anshulkulhari7 Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

davidzhao Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

anshulkulhari7 Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

How `reasoning_format` reaches the API