voice: output retries for run(output_type=...)#6080

Open
theomonnom wants to merge 18 commits into main from theo/output-retries

@theomonnom theomonnom commented Jun 12, 2026

Member

A run with an output_type ends with final_output=None whenever the model finishes its turn in prose instead of calling the task's completion tool. This is common with chatty models, and it currently surfaces as a generic RuntimeError that callers can't distinguish or recover from.

Following pydantic-ai's output-tool semantics:

  • New output_options on run() (an options TypedDict in the style of keyterm_options/expressiveness). When the run ends without producing its output_type, the session re-prompts in the same context with a per-turn system message, up to max_retries times (default 2), before raising. retry_instructions overrides the built-in retry prompt.

    result = await sess.run(
        user_input=...,
        output_type=SummarizeOutput,
        output_options={"max_retries": 2, "retry_instructions": "Call submit_analysis, nothing else."},
    )
  • A distinct UnexpectedModelBehavior (exported from livekit.agents, same name as pydantic-ai's) replaces the generic RuntimeError once the budget is exhausted, so callers can catch the failure specifically.

Defaults convert the dominant failure (model summarizes in prose) into a recovered run. Unit tests cover recovery, the prompt override, and exhaustion via FakeLLM.
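The retry semantics described above can be sketched without the library. This is a minimal, self-contained illustration and not the PR's implementation: `run_with_output_retries`, `make_chatty_model`, `DEFAULT_RETRY_PROMPT`, and the local `UnexpectedModelBehavior` class are hypothetical stand-ins. Only the names max_retries, retry_instructions, and UnexpectedModelBehavior come from the description.

```python
from typing import Callable, List, Optional


class UnexpectedModelBehavior(Exception):
    """Local stand-in for the exception the PR exports (assumption)."""


# Hypothetical wording; the built-in retry prompt is not shown in the PR text.
DEFAULT_RETRY_PROMPT = "Finish by calling the completion tool."


def run_with_output_retries(
    model_turn: Callable[[List[str]], Optional[dict]],
    user_input: str,
    max_retries: int = 2,
    retry_instructions: Optional[str] = None,
) -> dict:
    """Re-prompt in the same context until the model yields an output."""
    context = [f"user: {user_input}"]
    prompt = retry_instructions or DEFAULT_RETRY_PROMPT
    # One initial attempt plus up to max_retries re-prompts.
    for attempt in range(max_retries + 1):
        output = model_turn(context)
        if output is not None:
            return output
        if attempt < max_retries:
            # Per-turn system message appended to the same context.
            context.append(f"system: {prompt}")
    raise UnexpectedModelBehavior(f"no output after {max_retries} retries")


def make_chatty_model(prose_turns: int) -> Callable[[List[str]], Optional[dict]]:
    """Fake model that answers in prose `prose_turns` times, then returns output."""
    calls = {"n": 0}

    def model_turn(context: List[str]) -> Optional[dict]:
        calls["n"] += 1
        return None if calls["n"] <= prose_turns else {"summary": "ok"}

    return model_turn
```

With the default budget of 2, a model that answers in prose once or twice is recovered, a third prose turn raises, and max_retries=0 fails on the first prose turn.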

🤖 Generated with Claude Code

@theomonnom theomonnom requested a review from a team as a code owner June 12, 2026 17:52

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

Devin Review found 3 new potential issues.

Comment on lines +595 to +598
output_retries: int | OutputRetryOptions = 1,
) -> RunResult[Run_T]:
"""output_retries: how many times to re-prompt the model when the run
ends without the expected output_type before raising RunOutputError;

@devin-ai-integration devin-ai-integration Bot Jun 12, 2026

Contributor

📝 Info: Default output_retries mismatch between RunResult constructor and AgentSession.run()

The RunResult.__init__ default for output_retries is 1 (run_result.py:82), but AgentSession.run() always passes output_options.get('max_retries', 2) (agent_session.py:623), defaulting to 2. This means direct construction of RunResult (e.g., at agent_session.py:854 for capture_run) gets 1 retry, while runs through session.run() get 2. The capture_run path at line 854 doesn't set output_type, so retries are irrelevant there, but the inconsistency could be confusing for any future code path that constructs RunResult directly with an output_type.
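The mismatch reported here can be reproduced in isolation. The sketch below is a simplified model of the two code paths, not the actual classes: the `RunResult` and `run` shown are stripped-down stand-ins, and the shape of `RunOutputOptions` is assumed from the PR description.

```python
from typing import Optional, TypedDict


class RunOutputOptions(TypedDict, total=False):
    # Assumed shape, based on the keys named in the PR description.
    max_retries: int
    retry_instructions: str


class RunResult:
    # Constructor default of 1, as the review comment reports.
    def __init__(self, output_retries: int = 1) -> None:
        self.output_retries = output_retries


def run(output_options: Optional[RunOutputOptions] = None) -> RunResult:
    # The session path always passes a value, falling back to 2.
    opts = output_options or {}
    return RunResult(output_retries=opts.get("max_retries", 2))


direct = RunResult()   # direct construction: constructor default applies
via_run = run()        # session path: the .get() fallback applies
```

In this model, direct.output_retries is 1 while via_run.output_retries is 2, which is the inconsistency the comment describes.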

Member Author

Intentional: one silent recovery is the desired out-of-box behavior (it converts the dominant failure into a recovered run, matching pydantic-ai's default of 1 output retry), and the latency cost only occurs on runs that would previously have failed outright. The exception change is called out in the PR description; output_options={"retries": 0} restores fail-fast.

Comment thread livekit-agents/livekit/agents/voice/run_result.py
devin-ai-integration[bot]

This comment was marked as resolved.

@theomonnom theomonnom force-pushed the theo/output-retries branch from c63462a to 86acdca Compare June 12, 2026 23:00
user_input: str,
input_modality: Literal["text", "audio"] = "text",
output_type: type[Run_T] | None = None,
output_options: RunOutputOptions | None = None,

Member

should this include NOT_GIVEN so None can be used to disable the retry behavior? otherwise we have to type {"max_retries": 0} to disable it explicitly.
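One way to read this suggestion is a three-state parameter: omitted (use defaults), None (disable retries), or an options dict. The sketch below uses a locally defined sentinel as a stand-in for the library's NOT_GIVEN; `effective_max_retries` and `DEFAULT_MAX_RETRIES` are hypothetical names, and whether None should mean "disabled" is the reviewer's proposal, not confirmed behavior.

```python
from typing import Optional, TypedDict, Union


class _NotGiven:
    """Local sentinel standing in for the library's NOT_GIVEN (assumption)."""

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()
DEFAULT_MAX_RETRIES = 2  # default stated in the PR description


class RunOutputOptions(TypedDict, total=False):
    max_retries: int
    retry_instructions: str


def effective_max_retries(
    output_options: Union[RunOutputOptions, None, _NotGiven] = NOT_GIVEN,
) -> int:
    if output_options is None:
        # Explicit None disables retries (the behavior proposed in the comment).
        return 0
    if isinstance(output_options, _NotGiven):
        # Argument omitted: fall back to the default budget.
        return DEFAULT_MAX_RETRIES
    return output_options.get("max_retries", DEFAULT_MAX_RETRIES)
```

Under this reading, callers write output_options=None instead of {"max_retries": 0} to get fail-fast behavior.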

run_state = RunResult(
user_input=user_input,
output_type=output_type,
output_retries=output_options.get("max_retries", 2),

Member

nitpicking: we could follow the _resolve* pattern here to have explicit default value(s).

user_input: str | None = None,
output_type: type[Run_T] | None,
output_retries: int = 1,
output_retry_instructions: str | None = None,

Member

nitpicking: should we just pass the output options here so default values and resolution can stay in one place?
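As a sketch of what this could look like, the constructor below takes the options dict directly and derives both fields from it, so the default lives in one place. This is a simplified stand-in for RunResult, not the real class; the `output_options` parameter and the `DEFAULT_MAX_RETRIES` attribute are assumptions made for illustration.

```python
from typing import Optional, TypedDict


class RunOutputOptions(TypedDict, total=False):
    max_retries: int
    retry_instructions: str


class RunResult:
    DEFAULT_MAX_RETRIES = 2  # single source of truth for the default (assumed value)

    def __init__(
        self,
        *,
        output_type: Optional[type],
        output_options: Optional[RunOutputOptions] = None,
    ) -> None:
        opts = output_options or {}
        self.output_type = output_type
        # Both fields are derived here, so callers never pass raw defaults.
        self.output_retries = opts.get("max_retries", self.DEFAULT_MAX_RETRIES)
        self.output_retry_instructions = opts.get("retry_instructions")
```

With this shape, AgentSession.run() would forward output_options unchanged, and direct construction would pick up the same default as the session path.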

3 participants