.Net: Fix Gemini streaming token usage metrics #13944

Open
MohamedOthman1 wants to merge 4 commits into microsoft:main from MohamedOthman1:bugfix/13382-gemini-stream-usage


@MohamedOthman1 commented May 1, 2026

Motivation and Context

Fixes #13382.

Gemini streaming responses include cumulative usage metadata. The current connector records that metadata for every chunk, so a streamed response can inflate each token counter by a factor of up to the number of chunks that carry usage metadata.
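
To make the inflation concrete, here is a small language-neutral sketch in Python. It is not the connector's C# code, and the token counts are invented for illustration. It compares summing the cumulative usage from every chunk against taking only the last report.

```python
# Hypothetical stream: each tuple is the cumulative
# (prompt_tokens, completion_tokens) reported alongside one chunk.
chunk_usage = [(10, 1), (10, 2), (10, 3)]

# Recording usage for every chunk adds the running totals together.
per_chunk_prompt = sum(p for p, _ in chunk_usage)
per_chunk_completion = sum(c for _, c in chunk_usage)

# Because the reports are cumulative, the real usage is the last one.
actual_prompt, actual_completion = chunk_usage[-1]

print(per_chunk_prompt, per_chunk_completion)  # 30 6
print(actual_prompt, actual_completion)        # 10 3
```

In this example the prompt counter is inflated by exactly the chunk count (3x), while the completion counter is inflated by a smaller factor (2x) because its cumulative values grow over the stream.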

Description

This update keeps streaming chunks flowing as before, but suppresses usage logging while each chunk is parsed. Once the stream finishes normally, the connector records usage a single time from the latest chunk that actually included token usage metadata.

That keeps the non-streaming path unchanged and avoids losing metrics if Gemini ever sends a final chunk without usageMetadata. If the stream is cancelled or fails before normal completion, usage is not emitted for the partial response.
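
The deferred-recording behavior described above can be sketched roughly as follows. This is a Python illustration under assumptions, not the actual `GeminiChatCompletionClient` code: `Chunk`, `Usage`, `stream_with_deferred_usage`, and the `record_usage` callback are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical shape: (prompt_tokens, completion_tokens), cumulative.
Usage = Tuple[int, int]


@dataclass
class Chunk:
    text: str
    usage: Optional[Usage] = None  # a chunk may arrive without usage metadata


def stream_with_deferred_usage(
    chunks: Iterable[Chunk],
    record_usage: Callable[[Usage], None],
) -> Iterator[str]:
    """Yield chunk text immediately; record usage once after the stream ends."""
    last_usage: Optional[Usage] = None
    for chunk in chunks:
        if chunk.usage is not None:
            # Remember the latest usage, but do not record it yet.
            last_usage = chunk.usage
        yield chunk.text
    # Reached only on normal completion. If the consumer stops early or the
    # source raises, control never gets here and nothing is recorded.
    if last_usage is not None:
        record_usage(last_usage)


# Normal completion: the final chunk has no usage, so the previous one is used.
chunks = [Chunk("Hel", (10, 1)), Chunk("lo", (10, 2)), Chunk("!", None)]
recorded = []
text = "".join(stream_with_deferred_usage(chunks, recorded.append))

# Early cancellation: the consumer closes the stream after one chunk.
partial = []
gen = stream_with_deferred_usage(chunks, partial.append)
next(gen)
gen.close()
```

After the normal run, `recorded` holds the single entry `(10, 2)`. After the cancelled run, `partial` stays empty, which matches the choice not to emit usage for partial responses.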

The added regression test uses repeated token counts across multiple stream chunks, followed by a final chunk without usage metadata, and verifies each Google token counter records the expected value once.

Pre-submit Notes

  • Added focused regression coverage for repeated Gemini streaming usage metadata.
  • Kept the change scoped to the Google Gemini connector and its unit tests.
  • Checked the patch for whitespace issues with git diff --check.

@MohamedOthman1 requested a review from a team as a code owner May 1, 2026 07:24
Copilot AI review requested due to automatic review settings May 1, 2026 07:24
@moonbox3 added the .NET (Issue or Pull requests regarding .NET code) and kernel (Issues or pull requests impacting the core kernel) labels May 1, 2026
@github-actions bot changed the title from "Fix Gemini streaming token usage metrics" to ".Net: Fix Gemini streaming token usage metrics" May 1, 2026
Contributor

@github-actions bot left a comment

Automated Code Review

Reviewers: 4 | Confidence: 92% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach


Automated review by MohamedOthman1's agents

Contributor

Copilot AI left a comment

Pull request overview

Fixes inflated OpenTelemetry token usage metrics for the Gemini streaming connector by ensuring accumulated usage metadata is emitted only once per streamed response (instead of once per chunk), aligning streaming behavior with Gemini’s accumulated usage reporting.

Changes:

  • Update Gemini streaming response processing to suppress per-chunk usage logging and log usage once after streaming completes, using the last chunk that contains usage metadata.
  • Refactor ProcessChatResponse to optionally skip usage logging so it can be reused for streaming without side effects.
  • Add a unit test that validates prompt/completion/total token counters are each emitted exactly once for a multi-chunk stream (including a final chunk without usage metadata).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| dotnet/src/Connectors/Connectors.Google/Core/Gemini/Clients/GeminiChatCompletionClient.cs | Prevents duplicate token metrics in streaming by deferring usage logging until stream completion. |
| dotnet/src/Connectors/Connectors.Google.UnitTests/Core/Gemini/Clients/GeminiChatStreamingTests.cs | Adds regression coverage to ensure streaming usage metrics are emitted once even when the final chunk lacks usage metadata. |

@MohamedOthman1
Author

@microsoft-github-policy-service agree

Development

Successfully merging this pull request may close these issues.

Bug: .NET: Gemini connector emits duplicate token usage metrics during streaming (similar to #12977 for Python/OpenAI)

3 participants