Skip to content

[FR]: Support service_tier configuration (Priority Inference) #4

@tw0b33rs

Description

@tw0b33rs

Is there an existing issue for this?

  • I have searched the existing issues

Describe the problem

The AppFunctions Testing Agent's Gemini API calls exhibit high latency variance (2–20 seconds per interaction turn). This makes the tool impractical for live demonstrations and interactive showcases. The variance comes from Standard tier scheduling, where requests can be deprioritized during peak load.

Describe the solution

Requested Change

Add a configurable service_tier field to the Gemini API request body, exposed as a user-facing setting in the app's settings screen.

Technical Context

The agent currently omits the service_tier field, defaulting to Standard tier.
Google's Priority Inference documentation specifies that adding "service_tier": "priority" to the request body routes traffic to high-priority compute queues, delivering:

  • Consistent "seconds" latency (vs "seconds to minutes" for Standard)
  • Non-droppable traffic (no deprioritization under load)
  • Graceful degradation to Standard if priority limits are exceeded (no failures)

Suggested Implementation

  1. Add a setting/dropdown in the Settings screen with three options: Standard (default), Priority, Flex
  2. Pass the selected value as "service_tier" in the JSON request body
  3. Optionally: surface the x-gemini-service-tier response header somewhere in the UI or logs, so users can verify whether a request was actually served at priority tier or downgraded

Why This Matters

The AppFunctions Testing Agent is the primary tool for demonstrating AppFunctions to stakeholders. The up to 20 second latency variance on Standard tier undermines confidence in the technology during live demos. Priority Inference is specifically designed for this use case ("interactive AI applications" per Google's docs) and would make the testing agent viable for real-time showcases.

Additional context

Prerequisites for Users

  • Tier 2+ billing account
  • Priority tier costs 75–100% more per token than Standard

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request
    No fields configured for Feature.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions