fix(handoffs): enforce strict Pydantic validation when strict_json_schema=True#3724

Open
Om-Borse26 wants to merge 2 commits into openai:main from Om-Borse26:fix/handoff-strict-validation

Conversation

@Om-Borse26

Summary

Fixes #3723

The Handoff class sets strict_json_schema=True by default, which instructs the LLM to adhere to a strict schema for the output type. However, during runtime validation in _invoke_handoff, the validate_json utility is called without strict=True. As a result, Pydantic falls back to its lenient mode and silently coerces mismatched types (for example, a numeric string into an int), ignoring the strictness that the handoff definition asks for.

This PR passes strict=True to the validation logic when handling handoff inputs, ensuring that the runtime validation strictness matches the schema provided to the LLM.

Demonstration

Before (Silent Coercion)
When an LLM incorrectly passed "25" (string) instead of 25 (int) for an age field requiring an integer, the lenient parsing silently accepted and coerced it:

[1] CURRENT BEHAVIOR -- SDK validate_json (no strict=True)
    Input JSON: {"name": "Alice", "age": "25"}
    age is a STRING "25", not an integer 25

    ACCEPTED: validate_json did NOT raise an error
    result.name = 'Alice'  (type: str)
    result.age  = 25  (type: int)

After (ValidationError)
With strict=True, Pydantic correctly rejects the mismatch:

[2] EXPECTED BEHAVIOR -- type_adapter.validate_json with strict=True
    Input JSON: {"name": "Alice", "age": "25"}

    REJECTED: ValidationError raised (correct -- strict mode rejects str for int)
       type='int_type', msg='Input should be a valid integer'
       input='25'
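
For anyone who wants to reproduce the before/after behavior outside the SDK, here is a minimal sketch that uses Pydantic's TypeAdapter directly. The HandoffInput model is a hypothetical stand-in for a handoff input type; it only mirrors the name/age example above and is not code from this PR.

```python
from pydantic import BaseModel, TypeAdapter, ValidationError


class HandoffInput(BaseModel):
    # Hypothetical input type mirroring the name/age example above.
    name: str
    age: int


adapter = TypeAdapter(HandoffInput)
payload = '{"name": "Alice", "age": "25"}'  # age is a JSON string, not a number

# Lenient mode (Pydantic's default): the numeric string is coerced to an int.
lenient = adapter.validate_json(payload)
print(type(lenient.age).__name__, lenient.age)

# Strict mode: the same payload is rejected instead of coerced.
try:
    adapter.validate_json(payload, strict=True)
    strict_error_types = []
except ValidationError as exc:
    strict_error_types = [err["type"] for err in exc.errors()]
print(strict_error_types)
```

The only difference between the two calls is the strict argument, which is the gap this PR closes at the handoff call site.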

Verification

You can verify this fix directly via the new test rather than a raw script:

uv run pytest tests/test_handoff_tool.py::test_handoff_strict_json_rejects_type_coercion

(A secondary test, test_handoff_lenient_json_allows_type_coercion, was also added to ensure backward compatibility for other callers using lenient mode.)

(Note: AI-assisted contribution)

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0edfac8098

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/util/_json.py Outdated
@Om-Borse26

Author

Great catch! You are entirely correct: by passing strict=False in the utility wrapper when the caller omitted it, I was inadvertently overriding Pydantic's model-level ConfigDict(strict=True) setting for other callers.

I have updated the validate_json signature to use strict: bool | None = None and only pass the parameter through to Pydantic if it is not None. This preserves Pydantic's default behavior for existing callers, while still allowing the Handoff call site to explicitly enforce strict=True.

I've pushed the fix and updated the regression tests to verify that omitting the parameter still preserves Pydantic's default lenient coercion behavior for a model without strict config. Thanks for flagging this subtle regression!
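
A rough sketch of the tri-state parameter described above, assuming Pydantic v2. The function validate_json_sketch and the two models are hypothetical illustrations; the real helper in src/agents/util/_json.py has its own signature and error handling, which are not reproduced here.

```python
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, TypeAdapter, ValidationError


def validate_json_sketch(
    json_str: str, adapter: TypeAdapter[Any], strict: Optional[bool] = None
) -> Any:
    """Hypothetical stand-in for the helper; not the SDK's actual code."""
    kwargs: dict[str, Any] = {}
    if strict is not None:
        # Forward strict only when the caller set it explicitly.
        kwargs["strict"] = strict
    return adapter.validate_json(json_str, **kwargs)


def is_rejected(
    json_str: str, adapter: TypeAdapter[Any], strict: Optional[bool] = None
) -> bool:
    """Return True if validation raises a ValidationError."""
    try:
        validate_json_sketch(json_str, adapter, strict)
    except ValidationError:
        return True
    return False


class LaxPerson(BaseModel):
    name: str
    age: int


class StrictPerson(BaseModel):
    # Model-level strictness that the helper must not override by default.
    model_config = ConfigDict(strict=True)
    name: str
    age: int


PAYLOAD = '{"name": "Alice", "age": "25"}'
print(is_rejected(PAYLOAD, TypeAdapter(LaxPerson)))               # strict omitted
print(is_rejected(PAYLOAD, TypeAdapter(StrictPerson)))            # model config decides
print(is_rejected(PAYLOAD, TypeAdapter(LaxPerson), strict=True))  # explicit strict
```

With strict omitted, each model's own configuration decides the outcome, which is the backward-compatible behavior the regression tests are meant to pin down.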

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d31bc497dc

Comment thread src/agents/handoffs/__init__.py
Om-Borse26 force-pushed the fix/handoff-strict-validation branch from d31bc49 to 711d7ea on July 1, 2026 20:41
@Om-Borse26

Author

Thanks for the sharp eyes on both of these. Fixed in the latest push.
For the _json.py default: changed it to Optional[bool] = None and only forward strict to Pydantic when it's explicitly set. That way, callers that don't pass it won't accidentally override any model-level ConfigDict the user has set.
For the realtime path: added strict=True there as well, since realtime_handoff() already forces ensure_strict_json_schema, so the runtime should match what the LLM was told.
While doing the sweep, I also noticed that agent_output.py had the same gap: it has self._strict_json_schema available but wasn't passing it through. Folded that in too.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 711d7ea250

Comment thread src/agents/agent_output.py Outdated
Om-Borse26 force-pushed the fix/handoff-strict-validation branch from 711d7ea to 9679c26 on July 1, 2026 20:45
@Om-Borse26

Author

Another great catch! Passing strict=False in agent_output.py would indeed override model-level ConfigDict settings just like it would have in the helper function.

I have updated the logic in agent_output.py to pass strict=True if self._strict_json_schema else None, ensuring that None is passed when strict_json_schema is False. The other two call sites (realtime_handoff and the base Handoff) correctly pass a hardcoded strict=True because they enforce strict schema generation on their inputs before reaching validation.

Latest push incorporates this fix!
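
To make the True-or-None mapping concrete, here is a small sketch of how an output schema wrapper might forward its flag. OutputSchemaSketch is a hypothetical class written for illustration, assuming Pydantic v2; it is not the SDK's agent_output.py implementation.

```python
from typing import Any, Optional

from pydantic import TypeAdapter, ValidationError


class OutputSchemaSketch:
    """Hypothetical wrapper; only the strict-flag mapping is the point here."""

    def __init__(self, output_type: Any, strict_json_schema: bool = True) -> None:
        self._adapter: TypeAdapter[Any] = TypeAdapter(output_type)
        self._strict_json_schema = strict_json_schema

    def validate(self, json_str: str) -> Any:
        # Map the flag to True or None, never False, so that a non-strict
        # schema defers to whatever strictness the output type configures.
        strict: Optional[bool] = True if self._strict_json_schema else None
        return self._adapter.validate_json(json_str, strict=strict)

    def accepts(self, json_str: str) -> bool:
        """Return True if the payload validates without error."""
        try:
            self.validate(json_str)
        except ValidationError:
            return False
        return True


payload = '{"count": "3"}'  # the value is a JSON string, not a number
strict_schema = OutputSchemaSketch(dict[str, int], strict_json_schema=True)
lenient_schema = OutputSchemaSketch(dict[str, int], strict_json_schema=False)

print(strict_schema.accepts(payload))
print(lenient_schema.validate(payload))
```

Passing None rather than False leaves the decision to the type's own configuration, which is the distinction raised in the review.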

Development

Successfully merging this pull request may close these issues.

Handoff silently ignores strict_json_schema and coerces types