
fix(plugins): Add 20MB file size validation to SaveFilesAsArtifactsPl… #3781

Open

AakashSuresh2003 wants to merge 5 commits into google:main from AakashSuresh2003:fix/save-files-artifacts-plugin-large-file-error

Conversation

@AakashSuresh2003

@AakashSuresh2003 AakashSuresh2003 commented Dec 2, 2025

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

Problem:
The SaveFilesAsArtifactsPlugin fails with a cryptic 400 INVALID_ARGUMENT error when users attempt to upload files larger than approximately 50MB. This error message provides no guidance to users on what went wrong or how to resolve the issue. According to the Gemini API documentation, inline_data uploads have a 20MB size limit, and files larger than this must use the Files API.

Solution:
Added file-size routing and validation to automatically handle files of all sizes. The plugin now:

  • Detects when uploads exceed the 20MB inline_data limit and routes them through Google’s Files API (supporting 20MB–2GB).
  • Converts the returned Files API URIs into file_data references for LLM processing, with no user intervention required.
  • Proactively validates file size before processing and rejects files over the 2GB Files API limit with clear, user-friendly error messages.
  • Provides detailed feedback that includes the file name, actual size, the relevant limits, and guidance on how to proceed when limits are exceeded.
  • Handles Files API failures gracefully with actionable, understandable messaging.

This prevents the cryptic 400 INVALID_ARGUMENT error from reaching users and provides actionable guidance on how to handle large files properly.
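
For illustration only, here is a minimal sketch of the size-based routing described above (the constant names match those listed under Implementation Details further down; the real plugin code is more involved):

_MAX_INLINE_DATA_SIZE_BYTES = 20 * 1024 * 1024        # 20MB inline_data limit
_MAX_FILES_API_SIZE_BYTES = 2 * 1024 * 1024 * 1024    # 2GB Files API limit


def _route_by_size(num_bytes: int) -> str:
    """Illustrative helper: decides which upload path a file of this size takes."""
    if num_bytes <= _MAX_INLINE_DATA_SIZE_BYTES:
        return "inline_data"   # fast path, blob stays in the request
    if num_bytes <= _MAX_FILES_API_SIZE_BYTES:
        return "files_api"     # upload via Files API, reference as file_data
    return "reject"            # fail fast with a clear error message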

Testing Plan

  • test_file_size_exceeds_limit - Validates that a 50MB file is automatically uploaded via the Files API instead of using inline_data.
  • test_file_size_at_limit - Validates that a 20MB file (at the limit) uses inline_data as expected.
  • test_file_size_just_over_limit - Validates that a 21MB file (just over the 20MB limit) is routed to the Files API (a sketch of this test is shown below).
  • test_mixed_file_sizes - Validates that multiple files of different sizes (small and large) in one message are handled independently and correctly.
  • test_files_api_upload_failure - Validates graceful error handling, with a proper error message, when the Files API upload fails.
  • test_file_exceeds_files_api_limit - Validates that a 3GB file is rejected with a clear error message about the 2GB limit and that the Files API is not called.
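
A sketch of what test_file_size_just_over_limit might look like (the invocation_context fixture and the Client patch target inside the plugin module are assumptions, not necessarily the exact code in the test file):

import pytest
from unittest.mock import MagicMock, patch

from google.adk.plugins.save_files_as_artifacts_plugin import SaveFilesAsArtifactsPlugin
from google.genai import types


@pytest.mark.asyncio
async def test_file_size_just_over_limit(invocation_context):
    # 21MB is just over the 20MB inline_data limit, so the blob should be
    # uploaded via the Files API rather than kept inline.
    blob = types.Blob(
        display_name="just_over.pdf",
        data=b"x" * (21 * 1024 * 1024),
        mime_type="application/pdf",
    )
    plugin = SaveFilesAsArtifactsPlugin()

    # Assumed patch target: the genai Client imported by the plugin module.
    with patch(
        "google.adk.plugins.save_files_as_artifacts_plugin.Client"
    ) as mock_client_class:
        mock_client = MagicMock()
        mock_client_class.return_value = mock_client
        mock_client.files.upload.return_value = MagicMock(uri="files/fake-id")

        result = await plugin.on_user_message_callback(
            invocation_context=invocation_context,
            user_message=types.Content(
                role="user", parts=[types.Part(inline_data=blob)]
            ),
        )

    assert mock_client.files.upload.called
    assert any(part.file_data is not None for part in result.parts)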

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Please include a summary of passed pytest results.

[Screenshot: passing pytest results]



Manual End-to-End (E2E) Tests:
Created a Python script to test the plugin with different file sizes:

from google.adk.plugins.save_files_as_artifacts_plugin import SaveFilesAsArtifactsPlugin
from google.adk.agents.invocation_context import InvocationContext
from google.adk.sessions.in_memory_session_service import InMemorySessionService
from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService
from google.genai import types
from google.genai import Client
from unittest.mock import Mock, MagicMock, patch
import asyncio
import warnings


warnings.filterwarnings("ignore", category=ResourceWarning)

async def test():
    # Setup
    session_service = InMemorySessionService()
    artifact_service = InMemoryArtifactService()
    session = await session_service.create_session(app_name="test_app", user_id="test_user")
    
    context = Mock(spec=InvocationContext)
    context.app_name = "test_app"
    context.user_id = "test_user"
    context.invocation_id = "test_inv"
    context.session = session
    context.artifact_service = artifact_service
    
    plugin = SaveFilesAsArtifactsPlugin()
    
    # Test 1: 5MB file (should succeed with inline_data)
    print("=" * 70)
    print("TEST 1: Small file (5MB) - Should use inline_data")
    print("=" * 70)
    small_blob = types.Blob(display_name="test_5mb.pdf", data=b"x" * (5 * 1024 * 1024), mime_type="application/pdf")
    result1 = await plugin.on_user_message_callback(
        invocation_context=context, 
        user_message=types.Content(role="user", parts=[types.Part(inline_data=small_blob)])
    )
    print("SUCCESS: Small file (5MB) accepted and processed with inline_data")
    print(f"   Placeholder: {result1.parts[0].text if result1 else 'No result'}")
    print(f"   Parts count: {len(result1.parts) if result1 else 0}")
    print()
    
    # Test 2: 20MB file (should succeed with inline_data, at the limit)
    print("=" * 70)
    print("TEST 2: File at 20MB limit - Should use inline_data")
    print("=" * 70)
    limit_blob = types.Blob(display_name="test_20mb.pdf", data=b"x" * (20 * 1024 * 1024), mime_type="application/pdf")
    result2 = await plugin.on_user_message_callback(
        invocation_context=context, 
        user_message=types.Content(role="user", parts=[types.Part(inline_data=limit_blob)])
    )
    print("SUCCESS: 20MB file accepted (at limit) with inline_data")
    print(f"   Placeholder: {result2.parts[0].text if result2 else 'No result'}")
    print(f"   Parts count: {len(result2.parts) if result2 else 0}")
    print()
    
    # Test 3: 50MB file (should succeed with Files API upload - MOCKED)
    print("=" * 70)
    print("TEST 3: Large file (50MB) - Should use Files API (mocked)")
    print("=" * 70)
    large_blob = types.Blob(display_name="test_50mb.pdf", data=b"x" * (50 * 1024 * 1024), mime_type="application/pdf")
    
    # Mock the Files API client to avoid authentication errors
    with patch.object(Client, '__init__', return_value=None), \
         patch.object(Client, 'files') as mock_files:
        # Mock the uploaded file response
        mock_uploaded_file = MagicMock()
        mock_uploaded_file.uri = "https://generativelanguage.googleapis.com/v1beta/files/test-large-file-id"
        mock_files.upload.return_value = mock_uploaded_file
        
        result3 = await plugin.on_user_message_callback(
            invocation_context=context, 
            user_message=types.Content(role="user", parts=[types.Part(inline_data=large_blob)])
        )
        
        print("SUCCESS: Large file (50MB) uploaded via Files API")
        print(f"   Placeholder: {result3.parts[0].text if result3 else 'No result'}")
        print(f"   Parts count: {len(result3.parts) if result3 else 0}")
        if result3 and len(result3.parts) > 1:
            print(f"   File URI: {result3.parts[1].file_data.file_uri if result3.parts[1].file_data else 'N/A'}")
        print(f"   Files API upload called: {mock_files.upload.called}")
        print()
    
    # Test 4: Files API failure scenario (mocked failure)
    print("=" * 70)
    print("TEST 4: Large file with Files API failure - Should show error")
    print("=" * 70)
    huge_blob = types.Blob(display_name="test_100mb.pdf", data=b"x" * (100 * 1024 * 1024), mime_type="application/pdf")
    
    # Mock the Files API to raise an exception
    with patch("google.adk.plugins.save_files_as_artifacts_plugin.Client") as mock_client_class:
        mock_client = MagicMock()
        mock_client_class.return_value = mock_client
        mock_client.files.upload.side_effect = Exception("API quota exceeded")
        
        result4 = await plugin.on_user_message_callback(
            invocation_context=context, 
            user_message=types.Content(role="user", parts=[types.Part(inline_data=huge_blob)])
        )
        
        print("SUCCESS: Files API failure handled gracefully")
        if result4:
            error_msg = result4.parts[0].text
            print(f"   Error message preview: {error_msg[:100]}...")
        print()
    
    print("=" * 70)
    print("ALL TESTS COMPLETED SUCCESSFULLY!")
    print("=" * 70)
    print()
if __name__ == "__main__":
    asyncio.run(test())

Test Results:

Test 1: Small file upload (5MB)

======================================================================
TEST 1: Small file (5MB) - Should use inline_data
======================================================================
SUCCESS: Small file (5MB) accepted and processed with inline_data
   Placeholder: [Uploaded Artifact: "test_5mb.pdf"]
   Parts count: 1

======================================================================
TEST 2: File at 20MB limit - Should use inline_data
======================================================================
SUCCESS: 20MB file accepted (at limit) with inline_data
   Placeholder: [Uploaded Artifact: "test_20mb.pdf"]
   Parts count: 1

======================================================================
TEST 3: Large file (50MB) - Should use Files API (mocked)
======================================================================
SUCCESS: Large file (50MB) uploaded via Files API
   Placeholder: [Uploaded Artifact: "test_50mb.pdf"]
   Parts count: 2
   File URI: https://generativelanguage.googleapis.com/v1beta/files/test-large-file-id
   Files API upload called: True

======================================================================
TEST 4: Large file with Files API failure - Should show error
======================================================================
Failed to upload file test_100mb.pdf (100.00 MB) via Files API: API quota exceeded
SUCCESS: Files API failure handled gracefully
   Error message preview: [Upload Error: Failed to upload file test_100mb.pdf (100.00 MB) via Files API: API quota exceeded]...

======================================================================
ALL TESTS COMPLETED SUCCESSFULLY!
======================================================================

Result: Clear, actionable error message instead of 400 INVALID_ARGUMENT.

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

Implementation Details:

  • Added constant _MAX_INLINE_DATA_SIZE_BYTES = 20 * 1024 * 1024 (20MB) for inline_data limit
  • Added constant _MAX_FILES_API_SIZE_BYTES = 2 * 1024 * 1024 * 1024 (2GB) for Files API limit
  • Implemented file size routing:
    • Files ≤20MB: Use inline_data (fast path, no upload needed)
    • Files 20MB-2GB: Automatically upload via Files API
    • Files >2GB: Reject with clear error message (Files API hard limit)
  • File size validation occurs before any processing to fail fast
  • Error messages include file name and size for clarity
  • Works for ALL file types (PDF, images, videos, audio, text, JSON, etc.)
  • Multiple files in a single message are handled independently
  • Files API upload errors are handled gracefully with user-friendly messages
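
For context, a hedged sketch of what the _upload_to_files_api helper could look like, based on the description above and the review summary further down (temporary file creation, upload via google.genai.Client, cleanup). The exact signature, extension mapping, and error handling in the actual change may differ:

import os
import tempfile

from google.genai import Client
from google.genai import types


def _upload_to_files_api(blob: types.Blob) -> types.Part:
    """Illustrative only: uploads an oversized blob and returns a file_data part."""
    # The Files API uploads from a path, so write the bytes to a temp file first.
    # The real code derives a suffix from blob.mime_type instead of ".bin".
    fd, tmp_path = tempfile.mkstemp(suffix=".bin")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(blob.data)
        client = Client()
        uploaded = client.files.upload(file=tmp_path)
        # Reference the uploaded file as file_data so the model can consume it.
        return types.Part(
            file_data=types.FileData(
                file_uri=uploaded.uri, mime_type=blob.mime_type
            )
        )
    finally:
        os.remove(tmp_path)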

Changes Made:

  1. src/google/adk/plugins/save_files_as_artifacts_plugin.py:

    • Added Files API integration for files >20MB
    • Added 2GB validation check before Files API upload
    • Implemented automatic file size routing logic
    • Added comprehensive error handling and logging
  2. tests/unittests/plugins/test_save_files_as_artifacts.py:

    • Added test_file_size_exceeds_limit - Validates 50MB file uploads via Files API
    • Added test_file_size_at_limit - Validates 20MB file uses inline_data
    • Added test_file_size_just_over_limit - Validates 21MB file uses Files API
    • Added test_mixed_file_sizes - Validates mixed sizes handled independently
    • Added test_files_api_upload_failure - Validates error handling for Files API failures
    • Added test_file_exceeds_files_api_limit - Validates 3GB file rejected with 2GB limit
    • All 16 tests passing

Reference:

  • Gemini API Files documentation: https://ai.google.dev/gemini-api/docs/files
  • Files API limits:
    • Maximum file size: 2GB per file
    • Maximum storage: 20GB total
    • File retention: 48 hours (automatically deleted after)
    • Supported file types: All common formats (PDF, images, videos, audio, text, etc.)

Before this fix:
Users received: 400 INVALID_ARGUMENT: {'error': {'code': 400, 'message': 'Request contains an invalid argument.', 'status': 'INVALID_ARGUMENT'}}

After this fix:

  • Files 20MB-2GB: Automatically uploaded via Files API with success message
  • Files >2GB: Clear error message [Upload Error: File huge_video.mp4 (3.00 GB) exceeds the maximum supported size of 2GB. Please upload a smaller file.]
  • Files API failures: Graceful error handling with context about what went wrong
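
For reference, a small sketch of how messages in this format could be produced; the helper names are illustrative, not necessarily the ones used in the plugin:

def _format_size(num_bytes: int) -> str:
    """Formats a byte count in the "100.00 MB" / "3.00 GB" style shown above."""
    if num_bytes >= 1024 ** 3:
        return f"{num_bytes / 1024 ** 3:.2f} GB"
    return f"{num_bytes / 1024 ** 2:.2f} MB"


def _too_large_message(file_name: str, num_bytes: int) -> str:
    return (
        f"[Upload Error: File {file_name} ({_format_size(num_bytes)}) exceeds "
        "the maximum supported size of 2GB. Please upload a smaller file.]"
    )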

@google-cla

google-cla bot commented Dec 2, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello @AakashSuresh2003, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the SaveFilesAsArtifactsPlugin by introducing a crucial file size validation mechanism. Previously, users attempting to upload files larger than 20MB via inline data would encounter an uninformative backend error. The change now proactively checks file sizes, providing immediate and actionable feedback to users, guiding them to utilize the Files API for larger uploads and significantly improving the overall user experience by preventing frustrating errors.

Highlights

  • Proactive File Size Validation: Implemented client-side validation in the SaveFilesAsArtifactsPlugin to check if uploaded files exceed the 20MB inline data limit before processing them.
  • Improved Error Messaging: Replaced cryptic 400 INVALID_ARGUMENT errors with clear, user-friendly messages when files exceed the 20MB limit. The new message includes the actual file size, the limit, instructions to use the Files API, and a link to relevant documentation.
  • Comprehensive Testing: Added four new unit tests to cover scenarios including files exceeding the limit, files exactly at the limit, files just over the limit, and mixed file sizes, ensuring robust validation behavior.

@adk-bot adk-bot added the core [Component] label Dec 2, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the issue of unhandled large file uploads in SaveFilesAsArtifactsPlugin. By introducing a 20MB file size validation, it prevents cryptic errors and provides clear, actionable feedback to the user. The implementation is straightforward, and the accompanying unit tests are comprehensive, covering various edge cases. I have one minor suggestion to improve the maintainability of the error message construction. Overall, this is a great fix.

@AakashSuresh2003 AakashSuresh2003 force-pushed the fix/save-files-artifacts-plugin-large-file-error branch from 64a1208 to 85f7c7c on December 2, 2025 10:14
@AakashSuresh2003 AakashSuresh2003 deleted the fix/save-files-artifacts-plugin-large-file-error branch December 2, 2025 10:18
@AakashSuresh2003 AakashSuresh2003 restored the fix/save-files-artifacts-plugin-large-file-error branch December 2, 2025 10:19
@ryanaiagent ryanaiagent self-assigned this Dec 3, 2025
@AakashSuresh2003
Author

Hi @ryanaiagent,

Could you please review my PR and provide any suggestions?

@GWeale
Collaborator

GWeale commented Dec 4, 2025

Hi @AakashSuresh2003, I am not sure we want to impose a 20MB limit. I was looking at this problem and thinking about using the Google file storage API; I think this would be a better direction if it works!

@ryanaiagent ryanaiagent added the request clarification [Status] and services [Component] labels and removed the core [Component] label Dec 4, 2025
@AakashSuresh2003
Author

Sure, I will start working in this direction.

@AakashSuresh2003
Author

AakashSuresh2003 commented Dec 10, 2025

Hi @GWeale,

Could you please review my PR when you have time?

@ryanaiagent
Collaborator

Hi @AakashSuresh2003, we appreciate your patience and support. Can you please fix the failing unit tests so we can proceed with the review?

@ryanaiagent
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses the issue of handling large file uploads in the SaveFilesAsArtifactsPlugin by introducing file size validation and routing to the Google Files API for files exceeding the inline data limit. The implementation is logical, and the addition of comprehensive unit tests is commendable. I've provided a few suggestions to enhance maintainability and robustness, including using the mimetypes library for better file extension handling, removing a hardcoded value in an error message, and refactoring a unit test to adhere to best practices.

@ryanaiagent
Collaborator

Hi @AakashSuresh2003, can you please address the suggestions?

@AakashSuresh2003
Author

Sure @ryanaiagent

AakashSuresh2003 added a commit to AakashSuresh2003/adk-python that referenced this pull request Feb 21, 2026
AakashSuresh2003 added a commit to AakashSuresh2003/adk-python that referenced this pull request Feb 21, 2026
AakashSuresh2003 added a commit to AakashSuresh2003/adk-python that referenced this pull request Feb 21, 2026
AakashSuresh2003 added a commit to AakashSuresh2003/adk-python that referenced this pull request Feb 21, 2026
AakashSuresh2003 added a commit to AakashSuresh2003/adk-python that referenced this pull request Feb 22, 2026
AakashSuresh2003 added a commit to AakashSuresh2003/adk-python that referenced this pull request Feb 22, 2026
AakashSuresh2003 added a commit to AakashSuresh2003/adk-python that referenced this pull request Feb 22, 2026
@AakashSuresh2003
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable fix for handling large file uploads in SaveFilesAsArtifactsPlugin. By routing files larger than 20MB through the Google Files API, it addresses a limitation of inline data uploads and provides much clearer error handling for users. The changes are well-structured and include comprehensive unit tests for various file size scenarios.

My review focuses on a few areas to improve the implementation:

  • A misleading log message that could cause confusion during debugging.
  • A performance improvement by avoiding repeated instantiation of the Client.
  • Removing a hardcoded value from an error message to improve maintainability.

Overall, this is a solid contribution that significantly improves the user experience for file uploads.

@AakashSuresh2003 AakashSuresh2003 force-pushed the fix/save-files-artifacts-plugin-large-file-error branch 3 times, most recently from aa5d6a4 to e99625c on February 22, 2026 09:42
@AakashSuresh2003
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request enhances the SaveFilesAsArtifactsPlugin to support larger file uploads by integrating with the Gemini Files API. Previously, files were handled as inline_data, but now files exceeding 20MB will be uploaded via the Files API, while files over 2GB will be rejected with an error message. The changes include adding os, tempfile, and google.genai.Client imports, defining new constants for maximum inline data and Files API sizes, and implementing a new _upload_to_files_api method to manage temporary file creation, upload, and cleanup. The on_user_message_callback method was updated to incorporate file size checks and conditionally route files to either the existing inline data handling or the new Files API upload process, including error handling for upload failures. Corresponding unit tests were added to cover various scenarios, including files exceeding the 20MB limit, files at the limit, mixed file sizes, Files API upload failures, and files exceeding the 2GB limit. Review comments suggest minor improvements such as reordering imports, making file size calculations more robust and efficient by avoiding repetition, promoting a mime_to_ext dictionary to a class-level constant, and refining test patching strategies for clarity and robustness.


Labels

  • request clarification [Status]: The maintainer needs clarification or more information from the author
  • services [Component]: This issue is related to runtime services, e.g. sessions, memory, artifacts, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SaveFilesAsArtifactsPlugin: Can't handle too big files (PDFs)

4 participants