Python: surface cache and reasoning token counts for the Bedrock and Gemini connectors#6640

Merged
eavanvalkenburg merged 6 commits into
microsoft:mainfrom
he-yufeng:fix/bedrock-usage-cache-tokens
Jun 24, 2026

Conversation

@he-yufeng he-yufeng commented Jun 20, 2026

Contributor

Motivation & Context

Several connectors drop token counts that UsageDetails already has canonical fields for, so cache and reasoning usage silently reads as zero for those providers. This PR fixes that for the two connectors that were missing it:

  • Bedrock: the Converse API reports cacheReadInputTokens (input tokens served from a cache) and cacheWriteInputTokens (input tokens written to a cache) when prompt caching is active, but _parse_usage dropped both.
  • Gemini: the response usage metadata reports cached_content_token_count and thoughts_token_count, but _parse_usage dropped both.

UsageDetails already defines cache_read_input_token_count, cache_creation_input_token_count, and reasoning_output_token_count, and the OpenAI/Anthropic connectors already populate the cache fields, so these two connectors were the odd ones out. The change is the same small `is not None` mapping pattern in each connector's _parse_usage.
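A minimal sketch of the `is not None` mapping pattern for the Bedrock case, not the connector code itself. The function name `parse_bedrock_usage`, the plain `dict` standing in for UsageDetails, and the `input_token_count` / `output_token_count` keys are assumptions made for illustration; the two cache mappings are the ones described above.

```python
from __future__ import annotations

from typing import Any

# Converse usage key -> canonical usage key.
# The two cache entries are the mappings this PR describes; the
# input/output entries are assumed here only to make the sketch complete.
_BEDROCK_USAGE_KEYS = {
    "inputTokens": "input_token_count",
    "outputTokens": "output_token_count",
    "cacheReadInputTokens": "cache_read_input_token_count",
    "cacheWriteInputTokens": "cache_creation_input_token_count",
}


def parse_bedrock_usage(usage: dict[str, Any] | None) -> dict[str, int] | None:
    """Hypothetical stand-in for a connector's _parse_usage."""
    if not usage:
        return None
    details: dict[str, int] = {}  # stand-in for UsageDetails
    for provider_key, canonical_key in _BEDROCK_USAGE_KEYS.items():
        value = usage.get(provider_key)
        # `is not None` rather than a truthiness check, so a reported
        # count of 0 is kept while an absent key leaves the field unset.
        if value is not None:
            details[canonical_key] = value
    return details or None


sample = {
    "inputTokens": 10,
    "outputTokens": 5,
    "cacheReadInputTokens": 0,
    "cacheWriteInputTokens": 7,
}
parsed = parse_bedrock_usage(sample)
```

With this shape, a response that reports `cacheReadInputTokens: 0` still surfaces the zero, and a response without caching simply has no cache keys in the result.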

Description & Review Guide

  • What are the major changes?
    • BedrockChatClient._parse_usage maps cacheReadInputTokens -> cache_read_input_token_count and cacheWriteInputTokens -> cache_creation_input_token_count.
    • GeminiChatClient._parse_usage maps cached_content_token_count -> cache_read_input_token_count and thoughts_token_count -> reasoning_output_token_count.
    • Bedrock's _parse_usage now returns `details or None` (matching its `UsageDetails | None` annotation and the Gemini connector), so a usage payload with no recognized keys yields None instead of propagating an empty mapping.
  • What is the impact of these changes? Cache and reasoning token counts are now reported for Bedrock and Gemini, consistent with the OpenAI/Anthropic connectors. Responses without caching/thinking are unchanged (the fields stay unset).
  • What do you want reviewers to focus on? That the provider field names map to the right UsageDetails keys.
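For the Gemini side, the same idea can be sketched against an attribute-style usage object. This is an illustration under assumptions, not the connector's code: `parse_gemini_usage` is a hypothetical name, `SimpleNamespace` stands in for the response usage metadata, a plain `dict` stands in for UsageDetails, and the `prompt_token_count` / `candidates_token_count` mappings are assumed for completeness.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any

# Usage metadata attribute -> canonical usage key.
# The cached/thoughts entries are the mappings listed above; the
# prompt/candidates entries are assumptions for this sketch.
_GEMINI_USAGE_ATTRS = {
    "prompt_token_count": "input_token_count",
    "candidates_token_count": "output_token_count",
    "cached_content_token_count": "cache_read_input_token_count",
    "thoughts_token_count": "reasoning_output_token_count",
}


def parse_gemini_usage(usage_metadata: Any) -> dict[str, int] | None:
    """Hypothetical stand-in for a connector's _parse_usage."""
    if usage_metadata is None:
        return None
    details: dict[str, int] = {}  # stand-in for UsageDetails
    for attr, canonical_key in _GEMINI_USAGE_ATTRS.items():
        value = getattr(usage_metadata, attr, None)
        if value is not None:
            details[canonical_key] = value
    # An empty mapping is falsy, so this returns None when no
    # recognized field carried a value.
    return details or None


# Stand-ins for a response with caching/thinking and one without.
with_cache = SimpleNamespace(
    prompt_token_count=120,
    candidates_token_count=30,
    cached_content_token_count=100,
    thoughts_token_count=12,
)
without_cache = SimpleNamespace(
    prompt_token_count=8,
    candidates_token_count=4,
    cached_content_token_count=None,
    thoughts_token_count=None,
)

cached_details = parse_gemini_usage(with_cache)
plain_details = parse_gemini_usage(without_cache)
```

In this sketch `plain_details` contains only the input and output counts, which mirrors the "fields stay unset" behavior described for responses without caching or thinking.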

Added focused tests in both packages (test_parse_usage_surfaces_cache_tokens, test_parse_usage_returns_none_when_no_recognized_keys, test_get_response_usage_details_includes_cached_and_reasoning_tokens) and ran both suites locally (Bedrock: 9 passed; Gemini suite green). ruff check and ruff format --check pass on the changed files.

Copilot AI review requested due to automatic review settings June 20, 2026 02:59
@moonbox3 moonbox3 added the python Usage: [Issues, PRs], Target: Python label Jun 20, 2026

Copilot AI left a comment

Contributor

Pull request overview

Surfaces Bedrock Converse prompt-cache token counts in the framework’s canonical usage_details so cached prompts don’t report zero cache usage, aligning Bedrock behavior with other connectors that already populate these fields.

Changes:

  • Map Bedrock Converse cacheReadInputTokens -> cache_read_input_token_count and cacheWriteInputTokens -> cache_creation_input_token_count in BedrockChatClient._parse_usage.
  • Add a Bedrock unit test asserting the cache token counts are surfaced.
  • Also includes Gemini usage parsing + tests for cached/thinking token counts (note: this expands scope beyond the PR title/description).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| python/packages/bedrock/agent_framework_bedrock/_chat_client.py | Map Bedrock Converse cache token fields into canonical UsageDetails keys. |
| python/packages/bedrock/tests/test_bedrock_client.py | Add unit test covering Bedrock cache token parsing. |
| python/packages/gemini/agent_framework_gemini/_chat_client.py | Add parsing of Gemini cached/thinking token counts into canonical usage fields. |
| python/packages/gemini/tests/test_gemini_client.py | Extend test helpers and add a test for Gemini cached/reasoning usage fields. |

Comment thread python/packages/bedrock/agent_framework_bedrock/_chat_client.py Outdated
Comment thread python/packages/gemini/agent_framework_gemini/_chat_client.py

@eavanvalkenburg eavanvalkenburg left a comment

Member

Please fix the comment from copilot

github-actions Bot commented Jun 22, 2026

Contributor

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| packages/bedrock/agent_framework_bedrock/_chat_client.py | 449 | 98 | 78% | 304–305, 321–330, 336, 404, 413, 424, 426, 428, 433, 452–453, 477, 490, 502, 505, 513–514, 517–518, 520–521, 526–528, 530, 540–541, 563, 570, 579–580, 582–583, 585–587, 589, 591–592, 598–600, 603–604, 610–613, 619–629, 632, 651, 656, 706–707, 720, 746, 758, 763, 791, 795–796, 799, 817, 841, 853, 857, 871, 879–880, 884, 886–893 |
| packages/gemini/agent_framework_gemini/_chat_client.py | 380 | 3 | 99% | 398, 821, 832 |
| TOTAL | 42161 | 4981 | 88% | |

Python Unit Test Overview

Tests Skipped Failures Errors Time
8306 37 💤 0 ❌ 0 🔥 2m 10s ⏱️

@he-yufeng he-yufeng force-pushed the fix/bedrock-usage-cache-tokens branch 4 times, most recently from e361f2d to 21ce782 Compare June 23, 2026 13:37
@he-yufeng he-yufeng force-pushed the fix/bedrock-usage-cache-tokens branch from 21ce782 to 3daf7cc Compare June 24, 2026 01:09
@eavanvalkenburg

Member

@he-yufeng there are still comments from Copilot that need to be addressed or closed with a reason

Matches the UsageDetails | None return annotation and the Gemini
connector's behavior, so a usage payload with no recognized keys no
longer propagates an empty mapping. Adds a regression test.
@he-yufeng he-yufeng changed the title from "Python: surface Bedrock cache token counts in usage details" to "Python: surface cache and reasoning token counts for the Bedrock and Gemini connectors" Jun 24, 2026
@he-yufeng

Contributor Author

@eavanvalkenburg both Copilot comments are addressed:

  1. Empty-mapping return (Bedrock _parse_usage): fixed in ab5bdf5. It now returns `details or None` like the Gemini connector, with a regression test.
  2. Scope (the PR also touches Gemini): intentional, since both connectors had the same gap. I've updated the title and description to cover Bedrock and Gemini together because it's the same _parse_usage mapping fix. I can split it into two PRs if you'd rather keep one connector per PR.

Both test suites pass locally (Bedrock 9 passed, Gemini green) and ruff is clean on the changed files.

@eavanvalkenburg eavanvalkenburg added this pull request to the merge queue Jun 24, 2026
Merged via the queue into microsoft:main with commit 1df4766 Jun 24, 2026
38 checks passed
Labels

python Usage: [Issues, PRs], Target: Python

5 participants