feat: add average metrics per question to model detail page #27
kalyangupta12 wants to merge 2 commits into appwrite:main from
Conversation
Walkthrough
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/routes/model/$modelId.tsx (1)
173-179: Consider using a distinct icon for tool calls. Both "Total tokens" (line 153) and "Total tool calls" use the Layers icon. This could make it harder for users to visually differentiate between these metrics at a glance. Consider using a different icon for tool calls (e.g., Wrench or Zap from lucide-react).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In src/routes/model/$modelId.tsx around lines 173-179: the metric "Total tool calls" uses the same Layers icon as "Total tokens", making them hard to distinguish; update the object pushed into totalItems (the block that checks totalToolCalls > 0 and constructs {icon: Layers, label: "Total tool calls", ...}) to use a distinct icon component (for example Wrench or Zap from lucide-react) instead of Layers, and import that icon where icons are imported so the UI shows a different glyph for totalToolCalls.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In src/routes/model/$modelId.tsx:
- Around line 234-244: The tooltip is only triggered on hover
(group-hover/tip:opacity-100) so keyboard users can't open it; make the trigger
element (the wrapper span that checks item.tooltip and contains the Info icon)
keyboard-focusable by adding tabIndex={0} and add focus/focus-visible utility
classes to mirror the hover behavior (e.g., group-focus/tip:opacity-100 or
group-focus-visible/tip:opacity-100) so the absolutely positioned tooltip span
becomes visible on focus as well; ensure the trigger retains appropriate ARIA
semantics (role="button" or aria-describedby linking to the tooltip) and keep
the tooltip span pointer-events-none and accessible text content (item.tooltip)
unchanged.
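Applied to the existing renderCard markup, the suggestion above could look roughly like the fragment below. This is a sketch, not the exact change the author shipped; the group-focus-visible/tip variant assumes Tailwind's named-group syntax already used for group-hover/tip:

```tsx
{item.tooltip && (
  <span
    className="group/tip relative"
    tabIndex={0}                 // make the trigger keyboard-focusable
    role="button"
    aria-label={item.tooltip}    // expose the tooltip text to screen readers
  >
    <Info
      size={12}
      className="opacity-40 group-hover/tip:opacity-70 group-focus-visible/tip:opacity-70 transition-opacity cursor-help"
    />
    <span className="pointer-events-none absolute bottom-full left-1/2 z-50 mb-1.5 -translate-x-1/2 whitespace-pre rounded-md bg-[#1e1e22] px-2.5 py-1.5 text-xs font-normal text-[#EDEDF0] opacity-0 shadow-lg ring-1 ring-white/10 transition-opacity group-hover/tip:opacity-100 group-focus-visible/tip:opacity-100">
      {item.tooltip}
    </span>
  </span>
)}
```

The only behavioral additions are tabIndex, the ARIA attributes, and the group-focus-visible/tip:opacity-* classes mirroring the existing hover classes.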
---
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fc23432f-d836-4e73-820f-a9e5d1fe5672
📒 Files selected for processing (1)
src/routes/model/$modelId.tsx
Greptile Summary: This PR enhances the model detail page by splitting the existing flat stats grid into two clearly labelled sections — Totals (total tokens, duration, cost, tool calls) and Averages (avg tokens/question, avg duration/question, avg speed, avg cost/question, avg tool calls/question) — and adds a tooltip on token cards that breaks down input vs. output counts. Key observations:
Confidence Score: 4/5
Important Files Changed
Last reviewed commit: "feat: add average me..."
```tsx
if (totalToolCalls > 0) {
  totalItems.push({
    icon: Layers,
    label: "Total tool calls",
    value: totalToolCalls.toLocaleString(),
  });
}
```
Same icon used for tokens and tool calls
Both "Total tokens" (line 153) and "Total tool calls" use Layers as their icon. The same duplication exists in the Averages section for "Avg tokens/question" (line 186) and "Avg tool calls/question" (line 218). Since these metrics represent conceptually distinct things, using the same icon makes the two cards visually indistinguishable at a glance and may confuse users scanning the stats section.
Consider using a more semantically appropriate icon from lucide-react for tool calls — for example Wrench or Hammer — to visually differentiate them from token counts:
```tsx
import { ..., Wrench } from "lucide-react";

// line 174-178
totalItems.push({
  icon: Wrench,
  label: "Total tool calls",
  value: totalToolCalls.toLocaleString(),
});
```

And similarly at line 217–221:

```tsx
avgItems.push({
  icon: Wrench,
  label: "Avg tool calls/question",
  value: avgToolCalls.toFixed(1),
});
```

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```tsx
const renderCard = (item: StatCard) => (
  <div key={item.label} className="arena-card flex flex-col gap-1.5 p-3">
    <div className="flex items-center gap-1.5 text-[var(--text-secondary)]">
      <item.icon size={12} />
      <span className="text-xs">{item.label}</span>
    </div>
    <span className="inline-flex items-center gap-1.5 text-sm font-semibold text-[var(--text-primary)]">
      {item.value}
      {item.tooltip && (
        <span className="group/tip relative">
          <Info
            size={12}
            className="opacity-40 group-hover/tip:opacity-70 transition-opacity cursor-help"
          />
          <span className="pointer-events-none absolute bottom-full left-1/2 z-50 mb-1.5 -translate-x-1/2 whitespace-pre rounded-md bg-[#1e1e22] px-2.5 py-1.5 text-xs font-normal text-[#EDEDF0] opacity-0 shadow-lg ring-1 ring-white/10 transition-opacity group-hover/tip:opacity-100">
            {item.tooltip}
          </span>
        </span>
      )}
    </span>
  </div>
);
```
renderCard redefined on every render
renderCard is an inline function recreated on every render of ModelStats. While harmless today (the component receives a static model prop), defining it outside the component as a standalone module-level function would be a cleaner pattern. The function does not close over any changing state or props, so it can safely be extracted without any behaviour change.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
- Use Wrench icon for tool calls to distinguish from tokens (Layers)
- Make tooltip keyboard accessible with button element and focus states
- Add aria-label for screen readers
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/routes/model/$modelId.tsx (1)
167-172: Reuse the shared cost formatter here.
src/components/QuestionCard.tsx already formats costs via formatCost, but these new cards hardcode a separate toFixed(4)/threshold policy. Pulling the same helper into ModelStats will keep total and per-question cost strings consistent across the detail page. Also applies to: 215-220
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In src/routes/model/$modelId.tsx around lines 167-172: the model total and per-question cost formatting in ModelStats currently uses hardcoded toFixed(4) logic; import and use the shared formatCost helper (the function used by QuestionCard) instead of manual toFixed formatting so cost strings are consistent: replace the `$${model.totalCost.toFixed(4)}` usage in the total cost card and the similar formatting at the per-question section (lines 215-220) with calls to formatCost(model.totalCost) and formatCost(...) respectively, ensuring you add the import for formatCost at the top of the file.
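As a rough illustration, a shared formatter along these lines would centralize the threshold policy. The real formatCost in src/components/QuestionCard.tsx is not shown in this review, so the exact thresholds below are an assumption, not the project's actual implementation:

```typescript
// Hypothetical sketch of a shared cost formatter; the real formatCost
// in QuestionCard.tsx may use different thresholds and precision.
function formatCost(cost: number): string {
  if (cost === 0) return "$0.00";
  if (cost < 0.0001) return "<$0.0001"; // avoid rendering tiny costs as $0.0000
  if (cost < 0.01) return `$${cost.toFixed(4)}`;
  return `$${cost.toFixed(2)}`;
}
```

Both the total cost card and the per-question averages would then call the same helper, so a change in precision policy happens in one place.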
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c0398844-a937-410e-9c5c-b28836159a38
📒 Files selected for processing (1)
src/routes/model/$modelId.tsx
```tsx
// Compute average tool calls
const totalToolCalls =
  model.questionDetails?.reduce(
    (sum, q) => sum + (q.toolCallCount ?? 0),
    0,
  ) ?? 0;
const avgToolCalls =
  model.totalQuestions > 0 ? totalToolCalls / model.totalQuestions : 0;
```
Don't hide valid zero tool-call metrics.
QuestionDetail.toolCallCount is optional, and QuestionCard already treats != null as the “data exists” check. Here, ?? 0 plus the > 0 guards makes “no tool-call data” and “measured zero usage” look identical, and toFixed(1) can still show a non-zero average as 0.0. Track presence separately and render zero/small values when the metric is known.
💡 One way to preserve zero values without treating missing data as zero:

```diff
+const hasToolCallData =
+  model.questionDetails?.some((q) => q.toolCallCount != null) ?? false;
 const totalToolCalls =
   model.questionDetails?.reduce(
     (sum, q) => sum + (q.toolCallCount ?? 0),
     0,
   ) ?? 0;
 const avgToolCalls =
-  model.totalQuestions > 0 ? totalToolCalls / model.totalQuestions : 0;
+  hasToolCallData && model.totalQuestions > 0
+    ? totalToolCalls / model.totalQuestions
+    : 0;
 ...
-if (totalToolCalls > 0) {
+if (hasToolCallData) {
   totalItems.push({
     icon: Wrench,
     label: "Total tool calls",
     value: totalToolCalls.toLocaleString(),
   });
 }
 ...
-if (avgToolCalls > 0) {
+if (hasToolCallData && model.totalQuestions > 0) {
   avgItems.push({
     icon: Wrench,
     label: "Avg tool calls/question",
-    value: avgToolCalls.toFixed(1),
+    value:
+      avgToolCalls === 0
+        ? "0.0"
+        : avgToolCalls < 0.01
+          ? "<0.01"
+          : avgToolCalls < 0.1
+            ? avgToolCalls.toFixed(2)
+            : avgToolCalls.toFixed(1),
   });
 }
```

Also applies to: 174-179, 223-227
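The presence-tracking idea in this review can be isolated into two small pure helpers, shown below as a sketch. The names (computeAvgToolCalls, formatAvg) and the QuestionDetail shape are illustrative assumptions, not identifiers from the PR:

```typescript
interface QuestionDetail {
  toolCallCount?: number | null; // optional: missing means "no data", not zero
}

// Average over only the entries where tool-call data is present.
// Returns undefined when no entry has data, so callers can hide the card
// instead of conflating "unknown" with a measured zero.
function computeAvgToolCalls(details: QuestionDetail[]): number | undefined {
  const known = details.filter((q) => q.toolCallCount != null);
  if (known.length === 0) return undefined;
  const total = known.reduce((sum, q) => sum + (q.toolCallCount ?? 0), 0);
  return total / known.length; // may legitimately be 0
}

// Format small averages so toFixed(1) does not collapse them to "0.0".
function formatAvg(avg: number): string {
  if (avg === 0) return "0.0";
  if (avg < 0.01) return "<0.01";
  if (avg < 0.1) return avg.toFixed(2);
  return avg.toFixed(1);
}
```

With this split, a model whose questions all report toolCallCount: 0 still renders "0.0", while a model with no tool-call data at all renders nothing.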
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In src/routes/model/$modelId.tsx around lines 137-145: compute averages using
only entries where QuestionDetail.toolCallCount is present rather than treating
missing as zero: replace the current sum+null-coalescing logic by (a)
totalToolCalls = sum of q.toolCallCount for q where q.toolCallCount != null, (b)
toolCallCountKnown = count of q where q.toolCallCount != null, and (c)
avgToolCalls = toolCallCountKnown > 0 ? totalToolCalls / toolCallCountKnown :
undefined (or null) so you can render known zeros distinctly; update the
rendering to check toolCallCountKnown (or avg != null) before calling toFixed to
avoid hiding valid zero values. Do the same fix for the other average
computations referenced around the blocks using model.questionDetails at
avgToolCalls and the similar calculations at the later occurrences (the blocks
you noted at 174-179 and 223-227).
Summary
Related Issue
Addresses enhancement request for more insightful benchmark metrics
Test plan
Summary by CodeRabbit