feat(pydantic-ai): Support ImageUrl content type in span instrumentation#5629

Merged
ericapisani merged 9 commits into master from ep/pydantic-ai-support-base64-images-in-url-46s
Mar 11, 2026

@ericapisani ericapisani commented Mar 10, 2026

Fixes PY-2129 and #5627

Add handling for the pydantic-ai ImageUrl message content type in span instrumentation.

Previously, only BinaryContent was handled for non-text message parts. With recent pydantic-ai versions, users can pass ImageUrl objects as part of their prompts. Without handling this type, ImageUrl items would fall through to safe_serialize, losing structured information about the content.

Add handling for the pydantic-ai `ImageUrl` message content type in the
pydantic-ai integration. For data URLs containing base64-encoded images,
the content is redacted and replaced with a placeholder to avoid sending
large binary payloads to Sentry. For regular HTTP URLs, the URL string is
preserved as-is.

Refactor binary content serialization into shared helper functions
`_serialize_binary_content_item` and `_serialize_image_url_item` in
`spans/utils.py` to remove duplication between `ai_client.py` and
`invoke_agent.py`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
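
A rough sketch of the behaviour described above, not the code from this PR. The function name `serialize_image_url`, the returned dict shape, the placeholder text, and the simplified pattern are all assumptions made for illustration; the SDK's real helper is `_serialize_image_url_item` and its real constants are `BLOB_DATA_SUBSTITUTE` and `DATA_URL_BASE64_REGEX`, whose exact definitions are not shown here.

```python
import re
from typing import Any, Dict

# Assumption: placeholder text chosen for this sketch; the SDK's actual
# BLOB_DATA_SUBSTITUTE value is not shown in this PR description.
BLOB_DATA_SUBSTITUTE = "[Blob substitute]"

# Assumption: a simplified pattern, not the SDK's DATA_URL_BASE64_REGEX.
DATA_URL_BASE64 = re.compile(r"^data:[^;,]+(?:;[^;,]+)*;base64,")


def serialize_image_url(url: Any) -> Dict[str, Any]:
    """Redact base64 data URLs; keep any other URL unchanged."""
    url_str = str(url)
    if DATA_URL_BASE64.match(url_str):
        # Large inline payload: send only the placeholder.
        content = BLOB_DATA_SUBSTITUTE
    else:
        # Regular URL (e.g. https://...): safe to keep as-is.
        content = url_str
    # Hypothetical output shape, for illustration only.
    return {"type": "image", "content": content}
```

With this sketch, `serialize_image_url("https://example.com/cat.png")` keeps the URL, while a `data:image/png;base64,...` URL is replaced by the placeholder.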
github-actions bot commented Mar 10, 2026

Semver Impact of This PR

🟡 Minor (new features)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


New Features ✨

Pydantic Ai

  • Support ImageUrl content type in span instrumentation by ericapisani in #5629
  • Add tool description to execute_tool spans by ericapisani in #5596

Other

  • (crons) Add owner field to MonitorConfig by julwhitney13 in #5610

Bug Fixes 🐛

  • (celery) Propagate user-set headers by sentrivana in #5581
  • (utils) Avoid double serialization of strings in safe_serialize by ericapisani in #5587

Documentation 📚

  • (openai-agents) Remove inapplicable comment by alexander-alderman-webb in #5495
  • Add AGENTS.md by sentrivana in #5579
  • Add set_attribute example to changelog by sentrivana in #5578

Internal Changes 🔧

Openai Agents

  • Do not fail on new tool fields by alexander-alderman-webb in #5625
  • Stop expecting a specific function name by alexander-alderman-webb in #5623
  • Set streaming header when library uses with_streaming_response() by alexander-alderman-webb in #5583
  • Replace mocks with httpx for streamed responses by alexander-alderman-webb in #5580
  • Replace mocks with httpx in non-MCP tool tests by alexander-alderman-webb in #5602
  • Replace mocks with httpx in MCP tool tests by alexander-alderman-webb in #5605
  • Replace mocks with httpx in handoff tests by alexander-alderman-webb in #5604
  • Replace mocks with httpx in API error test by alexander-alderman-webb in #5601
  • Replace mocks with httpx in non-error single-response tests by alexander-alderman-webb in #5600
  • Remove test for unreachable state by alexander-alderman-webb in #5584
  • Expect namespace tool field for new openai versions by alexander-alderman-webb in #5599

Other

  • (httpx) Resolve type checking failures by alexander-alderman-webb in #5626
  • (pyramid) Support alpha suffixes in version parsing by alexander-alderman-webb in #5618
  • Remove CodeQL action by sentrivana in #5616
  • Normalize dots in package names in populate_tox.py by alexander-alderman-webb in #5574
  • Do not run actions on potel-base by sentrivana in #5614

🤖 This preview updates automatically when you update the PR.

github-actions bot commented Mar 10, 2026

Codecov Results 📊

32 passed | Total: 32 | Pass Rate: 100% | Execution Time: 294ms

All tests are passing successfully.

❌ Patch coverage is 0.00%. Project has 15215 uncovered lines.

Files with missing lines (4)
| File | Patch % | Lines |
|---|---|---|
| ai_client.py | 0.00% | ⚠️ 147 Missing |
| invoke_agent.py | 0.00% | ⚠️ 81 Missing |
| utils.py | 0.00% | ⚠️ 27 Missing |
| consts.py | 0.00% | ⚠️ 3 Missing |

Generated by Codecov Action

@ericapisani ericapisani marked this pull request as ready for review March 10, 2026 15:49
@ericapisani ericapisani requested a review from a team as a code owner March 10, 2026 15:49
The regex used to detect and redact base64 data URLs only allowed
alphabetic characters in MIME types, causing it to fail for types like
`image/svg+xml`, `application/vnd.ms-excel`, or `font/woff2`.

When the match failed, the full raw data URL (including base64 content)
was passed through to Sentry instead of being redacted with
BLOB_DATA_SUBSTITUTE, resulting in unintended data leakage.

Expand the MIME type character class to include digits, `.`, `+`, and
`-` to match all common MIME types per RFC 2045.

Co-Authored-By: Claude <noreply@anthropic.com>
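
To illustrate the failure mode in the commit message above, here is a small comparison of a letters-only MIME pattern against an expanded one. Both patterns are simplified stand-ins written for this example (the names `LETTERS_ONLY` and `EXPANDED` are hypothetical), not the SDK's actual regex.

```python
import re

# Assumption: both patterns are simplified stand-ins for the SDK's regex.
# Old behaviour: only alphabetic characters allowed in the MIME type.
LETTERS_ONLY = re.compile(r"^data:[a-zA-Z]+/[a-zA-Z]+;base64,")
# New behaviour: digits, '.', '+' and '-' are also allowed.
EXPANDED = re.compile(r"^data:[a-zA-Z0-9.+-]+/[a-zA-Z0-9.+-]+;base64,")

urls = [
    "data:image/png;base64,AAAA",
    "data:image/svg+xml;base64,AAAA",
    "data:application/vnd.ms-excel;base64,AAAA",
    "data:font/woff2;base64,AAAA",
]

# True where the pattern recognises the URL as a base64 data URL.
letters_only_hits = [bool(LETTERS_ONLY.match(u)) for u in urls]
expanded_hits = [bool(EXPANDED.match(u)) for u in urls]

print(letters_only_hits)  # [True, False, False, False]
print(expanded_hits)      # [True, True, True, True]
```

Every `False` in the first list is a URL that would have skipped redaction and been sent in full.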
Cover the case where data URLs include optional parameters between the
MIME type and base64 encoding, e.g. `data:image/png;name=file.png;base64,...`
and `data:text/plain;charset=utf-8;name=hello.txt;base64,...`. These should
be matched and redacted by DATA_URL_BASE64_REGEX.

Co-Authored-By: Claude <noreply@anthropic.com>
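
A sketch of how optional parameters between the MIME type and `;base64,` can be matched, using the two examples from the commit message above. The pattern `WITH_PARAMS` is an assumption written for this example and may differ from the SDK's `DATA_URL_BASE64_REGEX`.

```python
import re

# Assumption: illustrative pattern, not the SDK's DATA_URL_BASE64_REGEX.
WITH_PARAMS = re.compile(
    r"^data:[a-zA-Z0-9.+-]+/[a-zA-Z0-9.+-]+"  # MIME type
    r"(?:;[^;,]+)*"                           # zero or more ;key=value parameters
    r";base64,"                               # encoding marker
)


def is_base64_data_url(url: str) -> bool:
    """Return True when the URL is a base64-encoded data URL."""
    return WITH_PARAMS.match(url) is not None


print(is_base64_data_url("data:image/png;name=file.png;base64,AAAA"))  # True
print(is_base64_data_url("data:text/plain;charset=utf-8;name=hello.txt;base64,AAAA"))  # True
print(is_base64_data_url("data:text/plain,hello"))  # False (not base64)
```

The parameter group is non-capturing because only the yes/no match result is needed to decide whether to redact.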
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Unused imports of BinaryContent and ImageUrl in utils
    • Removed unused BinaryContent and ImageUrl imports from spans/utils.py as they were never referenced in the file.

Preview (85791c9981)
diff --git a/sentry_sdk/integrations/pydantic_ai/spans/utils.py b/sentry_sdk/integrations/pydantic_ai/spans/utils.py
--- a/sentry_sdk/integrations/pydantic_ai/spans/utils.py
+++ b/sentry_sdk/integrations/pydantic_ai/spans/utils.py
@@ -13,13 +13,7 @@
     from typing import Union, Dict, Any, List, Optional
     from pydantic_ai.usage import RequestUsage, RunUsage  # type: ignore
 
-try:
-    from pydantic_ai.messages import BinaryContent, ImageUrl  # type: ignore
-except ImportError:
-    BinaryContent = None
-    ImageUrl = None
 
-
 def _serialize_image_url_item(item: "Any") -> "Dict[str, Any]":
     """Serialize an ImageUrl content item for span data.

Remove unused imports (BinaryContent, ImageUrl, Optional, List) from
utils.py and add explicit assertion in test to ensure image content is
actually found in messages data rather than silently passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
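
The "silently passing" problem mentioned above can be sketched as follows. The message shape and the helper names `find_image_items` and `check_images_redacted` are hypothetical and do not come from the SDK's test suite.

```python
from typing import Any, Dict, List


def find_image_items(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect content items tagged as images (hypothetical message shape)."""
    return [
        item
        for message in messages
        for item in message.get("content", [])
        if isinstance(item, dict) and item.get("type") == "image"
    ]


def check_images_redacted(messages: List[Dict[str, Any]], placeholder: str) -> None:
    image_items = find_image_items(messages)
    # Without this assertion, the loop below passes vacuously when no
    # image item is present, hiding a broken serialization path.
    assert image_items, "expected at least one image item in messages data"
    for item in image_items:
        assert item["content"] == placeholder
```

Asserting that the list is non-empty before looping turns "nothing was checked" into a test failure.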
@alexander-alderman-webb alexander-alderman-webb left a comment

Looks good overall! Two points of feedback:

  • We can remove another mime_type field, since there's not been a request for the SDK to send this information 😅
  • We can write the tests in a way that reduces future work for us.

Could you also open a PR against the AI Agents Insight module devdocs to document the regex? 🙏
Please include a sentence about the possible forms image URLs can take, plus the regex you suggest other SDKs follow as well!

…idate tests

Remove the mime_type field from ImageUrl serialization in spans since
it is not needed for the base64 redaction use case. Update the regex
to use non-capturing groups accordingly. Consolidate scattered image
URL tests into two parameterized test functions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In Pydantic v2, ImageUrl.url is a Url object, not a string. Passing it
directly to re.match() raises TypeError at runtime. Convert to string
first, then reuse for both the regex match and the return value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
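
A minimal reproduction of the issue described in the commit message above, using a stand-in class rather than the real pydantic type. `FakeUrl` and `PATTERN` are hypothetical names invented for this sketch.

```python
import re


class FakeUrl:
    """Hypothetical stand-in for a URL object that is not a str subclass."""

    def __init__(self, value: str) -> None:
        self._value = value

    def __str__(self) -> str:
        return self._value


# Assumption: simplified pattern for illustration.
PATTERN = re.compile(r"^data:[^;,]+;base64,")
url = FakeUrl("data:image/png;base64,AAAA")

# Passing a non-string object to a compiled pattern raises TypeError.
try:
    PATTERN.match(url)
    raised = False
except TypeError:
    raised = True

# Convert once, then reuse the string for matching and for the return value.
url_str = str(url)
matched = PATTERN.match(url_str) is not None
```

Converting once up front also guarantees that the value being matched and the value being returned are the same string.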
@alexander-alderman-webb alexander-alderman-webb left a comment

Very nice!
Just one line in the test that looks strange ...

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@ericapisani ericapisani merged commit 5379537 into master Mar 11, 2026
154 of 158 checks passed
@ericapisani ericapisani deleted the ep/pydantic-ai-support-base64-images-in-url-46s branch March 11, 2026 18:03