feat(tui): show prompt cache hit breakdown in /usage command #231
shuizhongyueming wants to merge 5 commits into
🦋 Changeset detected
Latest commit: aaada08
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds prompt-cache visibility to the /usage TUI report by rendering a per-model cache hit ratio bar plus read/other token breakdown, and updates tests + versioning metadata accordingly.
Changes:
- Render per-model “cache hit” sublines (progress bar + read/other counts) under each model usage line.
- Align model column widths across multi-model sessions (including the “total” row).
- Add/adjust tests and publish a minor changeset for the new `/usage` output.
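The column alignment described in the list above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `Entry` tuple shape, the `alignedRows` helper, and the model names in the usage note are hypothetical, and the sketch assumes the "total" row only appears when more than one model is present.

```typescript
// Hypothetical data shape: [modelName, inputTokens] pairs.
type Entry = [model: string, input: number];

// Pad every label to the widest model name. "total" only counts toward the
// width when there is more than one model (assumed to be when it is shown).
function alignedRows(entries: Entry[]): string[] {
  const maxModelWidth = Math.max(
    ...entries.map(([model]) => model.length),
    entries.length > 1 ? 'total'.length : 0,
  );
  const rows = entries.map(
    ([model, input]) => `  ${model.padEnd(maxModelWidth)}  ${input}`,
  );
  if (entries.length > 1) {
    const total = entries.reduce((sum, [, input]) => sum + input, 0);
    rows.push(`  ${'total'.padEnd(maxModelWidth)}  ${total}`);
  }
  return rows;
}
```

For example, `alignedRows([['kimi-k2', 1200], ['k1', 300]])` would pad `k1` and `total` to seven characters so the token counts line up. Note that `String.prototype.length` counts UTF-16 code units, so this only aligns correctly for model names made of single-width characters.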
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| apps/kimi-code/src/tui/components/messages/usage-panel.ts | Adds aligned model rows and a cache hit ratio subline per model in the session usage section. |
| apps/kimi-code/test/tui/components/messages/usage-panel.test.ts | Adds test coverage for new cache sublines (single-model, zero-read, multi-model). |
| .changeset/usage-cache-breakdown.md | Declares a minor release for exposing cache hit stats in /usage. |
    // Compute max model name width for alignment (include "total" for multi-model)
    const maxModelWidth = Math.max(
      ...entries.map(([model]) => model.length),
      entries.length > 1 ? 'total'.length : 0,
    );
    const cacheRatio = input > 0 ? usageNumber(row.inputCacheRead) / input : 0;
    const bar = renderProgressBar(cacheRatio, 20);
    const pct = `${(cacheRatio * 100).toFixed(1).replace(/\.0$/, '')}%`;
    lines.push(
      `${cacheIndent}${muted('cache')} ${bar} ${value(pct)} ${muted('hit')} ` +
        `(${value(formatTokenCount(usageNumber(row.inputCacheRead)))} ${muted('read')} ` +
        `· ${value(formatTokenCount(usageNumber(row.inputOther)))} ${muted('other')})`,
    );
    // Cache breakdown subline
    const cacheIndent = ' '.repeat(maxModelWidth + 4); // "  model  " → 2 + maxModelWidth + 2
    const cacheRatio = input > 0 ? usageNumber(row.inputCacheRead) / input : 0;
    const bar = renderProgressBar(cacheRatio, 20);
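To make the ratio and percentage logic in the snippets above easier to check in isolation, here is a small standalone sketch. `renderBar` is a hypothetical stand-in for `renderProgressBar`, whose real implementation is not shown in this thread, and the block glyphs are an assumption; `formatPct` and `cacheHitRatio` mirror the expressions from the diff.

```typescript
// Hypothetical stand-in for the PR's renderProgressBar; glyphs are assumed.
function renderBar(ratio: number, width: number): string {
  const clamped = Math.min(1, Math.max(0, ratio));
  const filled = Math.round(clamped * width);
  return '█'.repeat(filled) + '░'.repeat(width - filled);
}

// Same formatting as the diff: one decimal place, trailing ".0" dropped.
function formatPct(ratio: number): string {
  return `${(ratio * 100).toFixed(1).replace(/\.0$/, '')}%`;
}

// Guard against division by zero when no input tokens were recorded.
function cacheHitRatio(cacheRead: number, input: number): number {
  return input > 0 ? cacheRead / input : 0;
}
```

With these helpers, `formatPct(cacheHitRatio(750, 1000))` gives `75%`, and a session with no input tokens falls back to a ratio of 0 rather than `NaN`.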
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 311f4deb66
    `(${value(formatTokenCount(usageNumber(row.inputCacheRead)))} ${muted('read')} ` +
      `· ${value(formatTokenCount(usageNumber(row.inputOther)))} ${muted('other')})`,
Include cache-creation tokens in the cache breakdown
When a provider reports inputCacheCreation, those input tokens are included in the model's input total and denominator for the hit percentage, but this new subline only prints inputCacheRead and inputOther. In sessions that create prompt-cache entries (for example inputCacheCreation > 0 and little/no inputOther), /usage can show thousands of input tokens while the breakdown says 0 read · 0 other, hiding the cache-write/miss portion users need to understand cache effectiveness. Include cache-creation as its own field or fold it into the non-hit count.
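One way the "own field" option from this comment might look is sketched below. This is only an illustration under stated assumptions: the `InputUsage` interface and `cacheBreakdown` helper are hypothetical, the `write` label is invented for the example, and the input total is assumed to be the sum of the three fields, as the comment implies. Styling helpers such as `muted` and `value` are left out.

```typescript
// Hypothetical row shape; field names follow the review comment, but the
// real type in the PR may differ.
interface InputUsage {
  inputCacheRead: number;
  inputCacheCreation: number;
  inputOther: number;
}

// Print all three buckets so the parts add up to the denominator used for
// the hit percentage (assumed to be read + creation + other).
function cacheBreakdown(row: InputUsage): string {
  const input = row.inputCacheRead + row.inputCacheCreation + row.inputOther;
  const ratio = input > 0 ? row.inputCacheRead / input : 0;
  const pct = `${(ratio * 100).toFixed(1).replace(/\.0$/, '')}%`;
  return (
    `cache ${pct} hit (${row.inputCacheRead} read ` +
    `· ${row.inputCacheCreation} write · ${row.inputOther} other)`
  );
}
```

In the scenario the comment describes (only cache-creation tokens), this would print `0 read · 4000 write · 0 other` instead of `0 read · 0 other`, so the non-hit portion stays visible.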
Related Issue
Resolve #230
Problem
The `/usage` command only shows total input/output token counts. Users cannot verify prompt cache effectiveness without a breakdown of cache hits vs freshly computed tokens.
What changed
Added a cache breakdown subline below each model line in `/usage`:
Screenshot
Checklist
- `gen-changesets` skill: added changeset.
- `gen-docs` skill: not needed (existing `/usage` docs cover the command; the breakdown is self-explanatory in-UI).