Summary
The `assistant.usage` event defines `cacheReadTokens` and `cacheWriteTokens` fields (documented in SDK `docs/features/streaming-events.md:309-310`), but these values are never populated — they always arrive as `0.0` regardless of whether the LLM provider returns cache metrics.
Both Anthropic and OpenAI APIs return prompt caching metrics, but the CLI doesn't extract and map them.
Repro (confirmed 2026-04-13)
SDK v0.2.2, model: `claude-sonnet-4-20250514` via BYOK (CAPI relay)
Collected 3 assistant.usage events:
```
Message 1: input=44.0 output=4.0 cacheReadTokens=0.0 cacheWriteTokens=0.0
Message 2: input=89.0 output=4.0 cacheReadTokens=0.0 cacheWriteTokens=0.0
Message 3: input=134.0 output=4.0 cacheReadTokens=0.0 cacheWriteTokens=0.0
```
All calls return `0.0` even though Anthropic returns `cache_read_input_tokens` / `cache_creation_input_tokens` in the API response.
Provider response formats (not being extracted by CLI)
Anthropic returns: `usage.cache_read_input_tokens`, `usage.cache_creation_input_tokens`
OpenAI returns: `usage.prompt_tokens_details.cached_tokens`
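For illustration, the missing extraction step could look like the sketch below: a helper that maps the provider usage payloads above onto the SDK's read/write cache counters. The `extract_cache_tokens` function and the `provider` parameter are hypothetical names, not CLI code; the provider-side field names are the documented ones.

```python
def extract_cache_tokens(provider: str, usage: dict) -> tuple[int, int]:
    """Return (cache_read_tokens, cache_write_tokens) from a raw usage dict.

    Hypothetical helper showing where the CLI could map provider cache
    metrics onto cacheReadTokens / cacheWriteTokens.
    """
    if provider == "anthropic":
        return (
            usage.get("cache_read_input_tokens", 0),
            usage.get("cache_creation_input_tokens", 0),
        )
    if provider == "openai":
        details = usage.get("prompt_tokens_details") or {}
        # OpenAI only reports cached reads; there is no write-side metric.
        return (details.get("cached_tokens", 0), 0)
    return (0, 0)


# Sample Anthropic usage payload with a cache hit:
anthropic_usage = {
    "input_tokens": 44,
    "output_tokens": 4,
    "cache_read_input_tokens": 1200,
    "cache_creation_input_tokens": 0,
}
print(extract_cache_tokens("anthropic", anthropic_usage))  # (1200, 0)
```

With this in place, the generated `cache_read_tokens` / `cache_write_tokens` fields would carry real values instead of defaulting to `0.0`.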
SDK code references
| File | Line | What |
|------|------|------|
| `python/copilot/generated/session_events.py` | 798, 801 | `cache_read_tokens` / `cache_write_tokens` fields defined |
| `python/copilot/generated/session_events.py` | 2328, 2331 | Per-call `Data` class fields |
| `go/generated_session_events.go` | 1084-1086 | Go equivalents |
| `nodejs/src/generated/session-events.ts` | 1586, 1590 | TS equivalents |
| `docs/features/streaming-events.md` | 309-310 | Documented as optional fields |
Impact on Dracarys
- No visibility into prompt cache hit rates in our Kusto telemetry
- Cannot optimize system prompts for cacheability
- Cannot reconcile costs (Anthropic cache reads are ~90% cheaper)
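To make the cost-reconciliation point concrete, here is a back-of-envelope sketch. The prices are placeholder assumptions (USD per million tokens), not actual rates; only the ~90% cache-read discount comes from the point above.

```python
# Placeholder prices (USD per 1M tokens) — assumptions for illustration only.
INPUT_PRICE = 3.00        # assumed base input-token price
CACHE_READ_PRICE = 0.30   # cache reads ~90% cheaper than base input


def input_cost(uncached_tokens: int, cache_read_tokens: int) -> float:
    """Cost of one call's input side, split into cached vs uncached tokens."""
    return (
        uncached_tokens * INPUT_PRICE + cache_read_tokens * CACHE_READ_PRICE
    ) / 1_000_000


# With cacheReadTokens stuck at 0, telemetry bills all 10k tokens at base rate:
print(input_cost(10_000, 0))      # 0.03
# With real cache metrics (8k of those tokens were cache reads):
print(input_cost(2_000, 8_000))   # 0.0084
```

Without populated cache fields, the first number is the best our Kusto telemetry can compute, so reconciliation against provider invoices is off by the full discount.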