fix(artifacts): Preserve .text on GcsArtifactService load (#3157) #4541

Open
wpn10 wants to merge 3 commits into google:main from wpn10:fix/gcs-text-artifact-roundtrip

Conversation

@wpn10 wpn10 commented Feb 18, 2026

Store _adk_is_text metadata flag on GCS blobs for text artifacts and use it on load to reconstruct as Part(text=...) instead of Part.from_bytes(). Switch to get_blob() to fetch blob metadata.

Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

Testing Plan

Added test_save_load_text_artifact parametrized across all 3 artifact service backends (InMemory, GCS, File). Verifies .text survives round-trip and .inline_data is None.

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

pytest tests/unittests/artifacts/ -v
47 passed in 2.36s

Manual End-to-End (E2E) Tests:

Not applicable — this is an internal service fix with no UI component. The bug is fully reproducible and verifiable through unit tests.

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

Problem: GcsArtifactService._load_artifact() always uses Part.from_bytes() to reconstruct artifacts. Text artifacts saved via Part.from_text() therefore lose their .text attribute on load: it returns None, and the data is only accessible through .inline_data.
Solution: Store an _adk_is_text: "true" flag in the GCS blob's custom metadata when saving text artifacts. On load, check for that flag and reconstruct the artifact as Part(text=...) instead of Part.from_bytes(...). Also switch from bucket.blob() to bucket.get_blob() so that blob metadata is populated (the same pattern already used in _get_artifact_version_sync). This is backward compatible: old blobs without the flag continue to load as before.
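
For readers skimming the thread, the save/load flow described above can be sketched roughly as follows. This is a self-contained illustration, not the PR's actual diff: FakePart, FakeBlob, and FakeBucket are hypothetical in-memory stand-ins for types.Part and the google.cloud.storage blob and bucket objects, and save_part / load_part are invented helper names. Only the _adk_is_text key and its "true" value come from the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakePart:
  """Hypothetical stand-in for types.Part: either text or raw bytes."""
  text: Optional[str] = None
  inline_data: Optional[bytes] = None


@dataclass
class FakeBlob:
  """Hypothetical stand-in for a GCS blob with custom metadata."""
  data: bytes = b""
  metadata: Optional[dict] = None


@dataclass
class FakeBucket:
  """Hypothetical in-memory bucket keyed by blob name."""
  blobs: dict = field(default_factory=dict)

  def get_blob(self, name: str) -> Optional[FakeBlob]:
    # Like the real get_blob(), returns None when the blob is missing.
    return self.blobs.get(name)


def save_part(bucket: FakeBucket, name: str, part: FakePart) -> None:
  blob = FakeBlob()
  if part.inline_data is not None:
    blob.data = part.inline_data
  elif part.text is not None:
    blob.data = part.text.encode("utf-8")
    # Flag the blob so the loader knows to rebuild a text part.
    blob.metadata = {**(blob.metadata or {}), "_adk_is_text": "true"}
  bucket.blobs[name] = blob


def load_part(bucket: FakeBucket, name: str) -> Optional[FakePart]:
  blob = bucket.get_blob(name)
  if blob is None:
    return None
  if (blob.metadata or {}).get("_adk_is_text") == "true":
    return FakePart(text=blob.data.decode("utf-8"))
  # Blobs without the flag (including pre-existing ones) load as bytes.
  return FakePart(inline_data=blob.data)
```

In this sketch a blob written before the flag existed has no metadata entry, so it takes the bytes branch, which is the backward-compatibility behavior the description claims.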

google-cla bot commented Feb 18, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello @wpn10, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a bug in the GcsArtifactService where text artifacts would lose their '.text' attribute upon loading, forcing access through '.inline_data'. The solution introduces a custom metadata flag on GCS blobs to explicitly mark text artifacts, enabling the service to correctly reconstruct them with their original text content. This ensures data integrity and consistent behavior for text-based artifacts stored in Google Cloud Storage.

Highlights

  • Text Artifact Preservation: Implemented a mechanism to preserve the '.text' attribute of text artifacts when saved and loaded via GcsArtifactService by storing a '_adk_is_text' metadata flag on GCS blobs.
  • Artifact Reconstruction Logic: Modified the artifact loading process to check for the '_adk_is_text' metadata flag and reconstruct text artifacts using Part(text=...) instead of the generic Part.from_bytes().
  • GCS Blob Metadata Fetching: Switched from 'bucket.blob()' to 'bucket.get_blob()' in '_load_artifact' to ensure blob metadata is properly fetched and available for inspection.
  • New Unit Test: Added a new parameterized unit test, 'test_save_load_text_artifact', to verify the round-trip preservation of text artifacts across InMemory, GCS, and File artifact services.

Changelog
  • src/google/adk/artifacts/gcs_artifact_service.py
    • Added metadata '_adk_is_text: "true"' to GCS blobs when saving artifacts with a 'text' attribute.
    • Updated '_load_artifact' to use 'bucket.get_blob()' to retrieve blob metadata.
    • Modified '_load_artifact' to conditionally decode artifact bytes to text and construct 'types.Part(text=...)' if the '_adk_is_text' metadata flag is present.
  • tests/unittests/artifacts/test_artifact_service.py
    • Added a new asynchronous, parameterized test 'test_save_load_text_artifact' to validate that text artifacts retain their '.text' content after being saved and loaded across different artifact service types.
Activity
  • Added unit tests to cover the changes, specifically 'test_save_load_text_artifact' which is parameterized across all three artifact service backends (InMemory, GCS, File).
  • Confirmed that all unit tests pass locally.
  • Performed a self-review of the code.
  • Verified that the bug is fully reproducible and verifiable through unit tests, making manual end-to-end testing not applicable.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@adk-bot adk-bot added the services [Component] This issue is related to runtime services, e.g. sessions, memory, artifacts, etc label Feb 18, 2026
Collaborator

adk-bot commented Feb 18, 2026

Response from ADK Triaging Agent

Hello @wpn10, thank you for creating this PR!

It looks like the Contributor License Agreement (CLA) has not been signed. Could you please sign it to allow us to proceed with the review? You can find more details in the "cla/google" check at the bottom of this PR.

Thanks!

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses the issue of preserving the .text attribute for text artifacts in GcsArtifactService by using blob metadata. The changes to use get_blob and add the _adk_is_text flag are well-implemented. I've added a couple of suggestions for improvement regarding an edge case with empty artifacts and enhancing the new test case.

Comment on lines 269 to 270
  if not artifact_bytes:
    return None
Contributor

high

This check incorrectly handles empty artifacts. An artifact with empty content (e.g., an empty text file) is valid, but this code would cause load_artifact to return None instead of an empty Part object. Removing this check will allow empty artifacts to be loaded correctly.
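
The pitfall behind this comment is Python truthiness: an empty bytes object is falsy, so `if not artifact_bytes` cannot tell "nothing stored" apart from "stored, but empty". A minimal sketch of the difference, where both function names are hypothetical helpers rather than code from the PR, and the explicit `is None` check is just one possible alternative to removing the check outright:

```python
from typing import Optional


def load_with_truthiness_check(artifact_bytes: Optional[bytes]) -> Optional[str]:
  # Treats b"" the same as None, so an empty artifact looks missing.
  if not artifact_bytes:
    return None
  return artifact_bytes.decode("utf-8")


def load_with_none_check(artifact_bytes: Optional[bytes]) -> Optional[str]:
  # Only a genuinely absent payload is reported as missing.
  if artifact_bytes is None:
    return None
  return artifact_bytes.decode("utf-8")
```

With an empty payload, the first helper reports the artifact as missing while the second returns an empty string.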

Comment on lines 780 to 800
async def test_save_load_text_artifact(service_type, artifact_service_factory):
  """Tests that text artifacts retain .text after round-trip save/load."""
  artifact_service = artifact_service_factory(service_type)
  artifact = types.Part.from_text(text='{"key": "value"}')

  await artifact_service.save_artifact(
      app_name="app0",
      user_id="user0",
      session_id="123",
      filename="data.json",
      artifact=artifact,
  )
  loaded = await artifact_service.load_artifact(
      app_name="app0",
      user_id="user0",
      session_id="123",
      filename="data.json",
  )
  assert loaded is not None
  assert loaded.text == '{"key": "value"}'
  assert loaded.inline_data is None
Contributor

medium

To improve test coverage, consider parametrizing this test to include various text contents, especially an empty string. This would help catch edge cases like handling empty artifacts.

You can add another @pytest.mark.parametrize decorator above this function, like so:

@pytest.mark.parametrize(
    "text_content",
    ['{"key": "value"}', "some other text", ""],
)

Then, update the test function to accept and use the text_content parameter.

async def test_save_load_text_artifact(service_type, artifact_service_factory, text_content):
  """Tests that text artifacts retain .text after round-trip save/load."""
  artifact_service = artifact_service_factory(service_type)
  artifact = types.Part.from_text(text=text_content)

  await artifact_service.save_artifact(
      app_name="app0",
      user_id="user0",
      session_id="123",
      filename="data.json",
      artifact=artifact,
  )
  loaded = await artifact_service.load_artifact(
      app_name="app0",
      user_id="user0",
      session_id="123",
      filename="data.json",
  )
  assert loaded is not None
  assert loaded.text == text_content
  assert loaded.inline_data is None
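
As a rough illustration of what the extra decorator buys: stacked pytest parametrize decorators expand to every combination of their arguments, so the suggestion multiplies the number of generated cases. The sketch below only mimics that expansion with itertools; the service type names are placeholders, not the actual values used in the test module.

```python
import itertools

# Placeholder names; the real values in the test module may differ.
service_types = ["in_memory", "gcs", "file"]
text_contents = ['{"key": "value"}', "some other text", ""]

# Stacked parametrize decorators behave like a Cartesian product of arguments.
cases = list(itertools.product(service_types, text_contents))
```

Under these assumptions the empty-string case is exercised once per backend, which is the coverage the comment asks for.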

@ryanaiagent
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the issue of preserving the .text attribute for text artifacts stored in GCS. The solution of adding a metadata flag _adk_is_text is well-implemented, ensuring that text artifacts are correctly reconstructed upon loading. The switch to bucket.get_blob() is appropriate for fetching the necessary metadata. The accompanying tests are thorough, covering both standard and edge cases like empty text artifacts, which validates the fix. I have one minor suggestion to improve maintainability by using a constant for the metadata key.

      )
-   elif artifact.text:
+   elif artifact.text is not None:
+     blob.metadata = {**(blob.metadata or {}), "_adk_is_text": "true"}
Contributor

medium

To improve maintainability and avoid magic strings, consider defining "_adk_is_text" as a module-level constant, as it's used in both _save_artifact and _load_artifact.

For example, at the top of the file:

_IS_TEXT_METADATA_KEY = "_adk_is_text"

You can then use this constant here and in _load_artifact.
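
One way the constant could be shared between the save and load paths is sketched below. The helper names with_text_flag and has_text_flag are hypothetical and do not appear in the PR; only the key name and the "true" value are taken from the change under review.

```python
from typing import Optional

_IS_TEXT_METADATA_KEY = "_adk_is_text"


def with_text_flag(metadata: Optional[dict]) -> dict:
  """Returns a copy of the metadata with the text flag set."""
  return {**(metadata or {}), _IS_TEXT_METADATA_KEY: "true"}


def has_text_flag(metadata: Optional[dict]) -> bool:
  """Reports whether the metadata marks the blob as a text artifact."""
  return (metadata or {}).get(_IS_TEXT_METADATA_KEY) == "true"
```

Keeping both the write and the read behind one constant means a future rename of the key only has to happen in a single place.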

@ryanaiagent
Collaborator

Hi @wpn10, thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share.

@ryanaiagent ryanaiagent added the needs review [Status] The PR/issue is awaiting review from the maintainer label Feb 19, 2026
@ryanaiagent
Collaborator

Hi @DeanChensj, can you please review this? LGTM.

Labels

needs review [Status] The PR/issue is awaiting review from the maintainer
services [Component] This issue is related to runtime services, e.g. sessions, memory, artifacts, etc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Artifact stored as text is not retrieved as text

3 participants
