fix: exclude cache_read tokens from totalTokens count #244

Open
apoorvdarshan wants to merge 6 commits into steipete:main from apoorvdarshan:fix/token-count-cache-inflation

Conversation

@apoorvdarshan
Contributor

  • Cache read tokens are served from Anthropic's prompt cache (10% rate).
  • Including them inflated the displayed token count by 10-90x.
  • Fixes API expense incorrect #92

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f77fe871e6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@apoorvdarshan
Contributor Author

I investigated this and found the root cause.

The Bug

In CostUsageScanner+Claude.swift line 541:

let dayTotal = dayInput + dayCacheRead + dayCacheCreate + dayOutput

This sums all token types, including cache_read_input_tokens. But cache read tokens are served from Anthropic's prompt cache (charged at 10% of the base input rate), so they shouldn't count as "processed" tokens.

Example

For a typical cached API response:

  • input_tokens: 10
  • cache_read_input_tokens: 15,000
  • cache_creation_input_tokens: 500
  • output_tokens: 100

Old formula: 10 + 15,000 + 500 + 100 = 15,610 tokens (inflated)
Fixed formula: 10 + 500 + 100 = 610 tokens (accurate)
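
The arithmetic above can be reproduced with a small sketch. The `Usage` struct and its property names are hypothetical stand-ins for the four usage fields in the example; they are not types from the project.

```swift
// Hypothetical container for the four usage fields listed in the example.
struct Usage {
    let input: Int
    let cacheRead: Int
    let cacheCreate: Int
    let output: Int

    // Old formula: sums every token type, including cache reads.
    var totalIncludingCacheReads: Int {
        input + cacheRead + cacheCreate + output
    }

    // Proposed formula: leaves cache reads out of the sum.
    var totalExcludingCacheReads: Int {
        input + cacheCreate + output
    }
}

let example = Usage(input: 10, cacheRead: 15_000, cacheCreate: 500, output: 100)
print(example.totalIncludingCacheReads) // 15610
print(example.totalExcludingCacheReads) // 610
```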

Fix

// Exclude cache reads - they're served from cache, not re-processed
let dayTotal = dayInput + dayCacheCreate + dayOutput

Verified

| Metric | Before | After |
|---|---|---|
| 30-day tokens | 380M | 29M |
| 30-day cost | $130 | $130 (unchanged) |

Cost staying the same confirms pricing logic was already correct - only the token count display was inflated.

I'll submit a PR with this fix.

@apoorvdarshan
Contributor Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 👍

@ratulsarna
Collaborator

Thanks for the detailed root-cause writeup. Your before/after numbers are very helpful.

I recommend we preserve totalTokens as the canonical provider total and add a separate derived metric for UX (e.g. processedTokens) that excludes cache_read_input_tokens.

Why this direction:

  • It keeps totalTokens aligned with Anthropic usage semantics.
  • It avoids breaking existing expectations for API/CLI/JSON consumers.
  • It still gives us the product-facing number we want to show.

Required follow-ups before merge:

  1. Make tests green (swift test currently fails at CostUsageScannerTests.swift:441, :464, :709).
  2. Implement processedTokens and update UI surfaces that should use it.
  3. Keep totalTokens canonical everywhere (scanner summary, models, CLI payloads).
  4. Update docs/comments accordingly (including the stale comment in CostUsageScanner+Claude.swift).

Once this is in place, we should be merge ready.
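
One way the two-metric direction could look is sketched below. `DailyTokenSummary` is a hypothetical type used only for illustration; only the names `totalTokens` and `processedTokens` come from this thread, and the project's actual models may be shaped differently.

```swift
// Hypothetical per-day summary; not a type from the project.
struct DailyTokenSummary {
    let inputTokens: Int
    let cacheReadTokens: Int
    let cacheCreationTokens: Int
    let outputTokens: Int

    // Canonical provider total: every token type, cache reads included.
    var totalTokens: Int {
        inputTokens + cacheReadTokens + cacheCreationTokens + outputTokens
    }

    // Derived display metric: the canonical total minus cache reads.
    var processedTokens: Int {
        totalTokens - cacheReadTokens
    }
}

let day = DailyTokenSummary(
    inputTokens: 10,
    cacheReadTokens: 15_000,
    cacheCreationTokens: 500,
    outputTokens: 100
)
print(day.totalTokens)     // 15610
print(day.processedTokens) // 610
```

Deriving processedTokens from totalTokens keeps a single source of truth, so API/CLI/JSON consumers keep reading the same canonical value while the UI reads the derived one.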

Collaborator

@ratulsarna ratulsarna left a comment

Requesting changes

Revert totalTokens to include cache_read_input_tokens (Anthropic-aligned).
Add separate processedTokens field (excluding cache_read) for UI display
so users see only freshly computed tokens. Updates menu card, widget, and
fetcher to prefer processedTokens with fallback to totalTokens.

Fixes steipete#92
@apoorvdarshan
Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 598897bb1b

Use processedTokens ?? totalTokens in CostHistoryChartMenuView so the
chart matches the menu summary. Also resolve MiniMax merge conflict.
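
The fallback described in this commit message can be sketched as follows. `DailyEntry` and `displayTokens(for:)` are hypothetical names, and modelling both fields as optionals is an assumption (for instance, to tolerate entries recorded before processedTokens existed); the real view code is not shown in this thread.

```swift
// Hypothetical entry type; both fields are optional by assumption.
struct DailyEntry {
    let totalTokens: Int?
    let processedTokens: Int?
}

// Prefer the derived metric and fall back to the canonical total,
// so every UI surface that calls this shows the same number.
func displayTokens(for entry: DailyEntry) -> Int? {
    entry.processedTokens ?? entry.totalTokens
}

let withBoth = DailyEntry(totalTokens: 15_610, processedTokens: 610)
let legacy = DailyEntry(totalTokens: 15_610, processedTokens: nil)
print(displayTokens(for: withBoth) ?? 0) // 610
print(displayTokens(for: legacy) ?? 0)   // 15610
```
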
@apoorvdarshan
Contributor Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Delightful!

@apoorvdarshan
Contributor Author

@ratulsarna the changes are done. Please review, edit if required, and merge.

regards
apoorv

Development

Successfully merging this pull request may close these issues.

API expense incorrect

2 participants