Release: staging → main #403

Merged
izadoesdev merged 150 commits into main from staging
Apr 8, 2026
@izadoesdev izadoesdev commented Apr 8, 2026

Summary

150 commits from staging. Main highlights:

  • Insights cockpit redesign (DAT-100, #402): new /insights org cockpit built on the existing query pipeline
  • Agent overhaul — cost-aware streaming, thinking effort toggle, credit balance in header, inspectable tool steps, welcome state cleanup, slash command menu restored as prompt shortcuts
  • Billing / autumn config — agent credits enforced via autumn, token+cost telemetry via tokenlens, web_search tool metered, 20% markup, seats removed (unlimited team members), uptime/status pages/alarms gating
  • Tracker fixes — pixel plugin routed to /px.jpg, outgoing-links fixed, flush/unload reliability, strict route allowlist
  • Security — CodeQL high-severity findings closed
  • Uptime — heatmap day buckets aligned to user timezone

Test plan

  • CI green on staging
  • Smoke test /insights cockpit on production
  • Verify agent credit balance + cost telemetry working
  • Verify tracker pixel plugin hitting /px.jpg
  • Spot-check billing autumn features (web_search metering, credit rates)

izadoesdev and others added 30 commits April 3, 2026 19:04
- Fix outgoing links bypassing shouldSkipTracking when disabled/bot
- All plugins (interactions, scroll-depth, errors) now return cleanup
  functions, wired into destroy() via cleanupFns
- Fix HttpClient double-read of response body (use text+parse instead)
- Fix flush race condition: check isFlushing before clearing timer
- destroy() flushes all queues via sendBeacon before clearing
- handlePageUnload uses sendBeacon with fetch fallback for all queues
- databuddyOptIn reinitializes tracker without requiring page reload
- Cache timezone at init instead of creating Intl.DateTimeFormat per event
- Add regression tests for all fixed bugs
Remove unused R2 storage and Logtail env vars.
Upgrade to schema v2, switch to strict envMode with explicit globalEnv,
simplify task configs by removing redundant fields.
Rename typecheck/type-check to check-types across packages, use turbo
for test runner, remove unused root dependencies (opentelemetry, maxmind).
Replace adapter class pattern with plain mapUmamiRow function. Add
createImport helper that handles session exit detection. Remove old
test script, csv-parse/zod/drizzle deps, and utils-map-events.
Align with bun test glob pattern (tests/*.test.ts).
Enable affectedUsingTaskInputs, watchUsingTaskInputs, and
filterUsingTasks to prepare for Turbo 3.0.
- ci.yml: split into 3 parallel jobs (lint, check-types, test), add
  concurrency group, path-ignore for docs, pin bun to 1.3.4, add
  postgres service, remove redundant full build step
- health-check.yml: add concurrency group, restrict triggers to
  Dockerfile and app source changes only
- docker-publish.yml: switch to Blacksmith Docker tools
  (setup-docker-builder, build-push-action, stickydisk), use native
  arm64 runners instead of QEMU emulation, add concurrency group,
  downsize manifest runners to 2vcpu
- codeql.yml: use Blacksmith runner, add staging branch, add
  concurrency group, remove boilerplate
- dependency-review.yml: add staging branch, use Blacksmith runner
- Root check-types now delegates to turbo (packages already use tsc)
- Lint set to continue-on-error until 166 pre-existing errors are fixed
check-types needs package dist outputs to resolve cross-package types
…board

turbo build --filter=@databuddy/api... was resolving to 21 packages
(including dashboard) due to ^build dependency traversal. turbo prune
correctly scopes to only the 14 actual API dependencies.
* chore(notifications): add test scripts to package.json

* test(notifications): add BaseProvider unit tests

* test(notifications): add uptime template tests

* test(notifications): add anomaly template tests
…ion (#375)

* feat(docker): add self-hosting support with docker-compose configuration

* feat(docker): update docker-compose for production readiness and security enhancements

* feat(docker): enhance docker-compose with required APP_URL and GEOIP_DB_URL configurations
- basket: loop HTML tag strip in sanitizeString to defeat stacked-tag
  bypass (js/incomplete-multi-character-sanitization, alert #43)
- tracker: replace Math.random() fallback in generateUUIDv4 with
  crypto.getRandomValues() so downstream IDs are cryptographically
  random (js/insecure-randomness, alerts #59 and #60)
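
The stacked-tag bypass behind alert #43 can be illustrated with a minimal sketch. The names `stripTagsOnce` and `stripTagsUntilStable` and the tag regex are assumptions for illustration, not basket's actual `sanitizeString`:

```typescript
// Hypothetical tag matcher: an opening or closing tag with no nested angle brackets.
const TAG_PATTERN = /<\/?[a-z][^<>]*>/gi;

// A single pass is incomplete: removing an inner tag can splice the
// surrounding text into a brand-new tag.
function stripTagsOnce(input: string): string {
  return input.replace(TAG_PATTERN, "");
}

// Repeat the strip until the string stops changing, so spliced tags go too.
function stripTagsUntilStable(input: string): string {
  let current = input;
  let previous: string;
  do {
    previous = current;
    current = stripTagsOnce(current);
  } while (current !== previous);
  return current;
}
```

For the input `<scr<script>ipt>alert(1)</scr</script>ipt>`, one pass leaves `<script>alert(1)</script>` behind, while the looped version reduces it to `alert(1)`.
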
…oute

The outgoing-links plugin has been POSTing to `/outgoing` since 2025-12-27.
basket has no such route — it serves outgoing-link events at `POST /` with
`type: "outgoing_link"` in the body. Every external link click was 404'ing
silently in production.

Confirmed via direct ClickHouse query: zero rows in `analytics.outgoing_links`
for any site that has `trackOutgoingLinks: true` enabled.

Switch the plugin to:
- POST `/` with `type: "outgoing_link"` (the route basket actually serves)
- Send `client_id` as a query param so beacon transport works
  (sendBeacon strips custom headers, including `databuddy-client-id`)
- Include `anonymousId`, `sessionId`, `timestamp` so clicks are attributed
  to a session — basket's insertOutgoingLink reads these but the old
  payload never sent them, so all click rows would have been anonymous
  even if the route had worked

Verified end-to-end against prod basket: probe POST returned 200 success
and the row landed in `analytics.outgoing_links` with the correct
href/text/session_id (the first row ever for a real customer site).
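
As a rough sketch of the new request shape described above: the helper name `buildOutgoingLinkRequest`, the `BeaconRequest` type and the example host are hypothetical, and only the `POST /` target, the `client_id` query param and the payload field names come from this commit message.

```typescript
interface OutgoingLinkClick {
  href: string;
  text: string;
  anonymousId: string;
  sessionId: string;
  timestamp: number;
}

interface BeaconRequest {
  url: string;
  body: string;
}

// Builds a request that survives beacon transport: the client id travels in
// the query string because sendBeacon cannot carry custom headers.
function buildOutgoingLinkRequest(
  basketUrl: string,
  clientId: string,
  click: OutgoingLinkClick
): BeaconRequest {
  const base = basketUrl.replace(/\/+$/, "");
  return {
    url: `${base}/?client_id=${encodeURIComponent(clientId)}`,
    // The event type lives in the body, not in the path.
    body: JSON.stringify({ type: "outgoing_link", ...click }),
  };
}
```

The returned `url` and `body` could then be handed to `navigator.sendBeacon` or to a `fetch` fallback.
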
Why this change:
The /outgoing tracker bug fixed in 918d196 hid in green E2E tests for
3+ months because every spec used the same blanket mock pattern:

  await page.route("**/basket.databuddy.cc/*", async (route) => {
    await route.fulfill({ status: 200, ... });
  });

A catch-all that returns 200 for any path. Assertions only checked
tracker-side behaviour (`req.url().includes("/outgoing")`), so the
contract between tracker and basket was never tested.

What this adds:
- BASKET_ROUTES allowlist in tests/test-utils.ts mirroring the routes
  basket actually serves (apps/basket/src/routes/{basket,track,llm}.ts)
- A custom Playwright `test` exported from test-utils with an `auto: true`
  fixture that:
    1. Tracks every basket request via `page.on("request", ...)` so the
       observer fires before any test mock can intercept and shadow it
    2. Default-fulfills known routes 200, unknown routes 404
    3. Throws at teardown if any unknown route was hit, with a clear
       message pointing at BASKET_ROUTES so the next dev knows where
       to update if a new route is genuinely added
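
The three fixture steps above can be sketched without Playwright as a small guard object. This is an illustration under assumptions: the class name `BasketRouteGuard`, the `routeKey` helper and the exact route list are hypothetical, and the real fixture lives in tests/test-utils.ts.

```typescript
// Illustrative allowlist keyed on "METHOD path"; not the real BASKET_ROUTES.
const KNOWN_ROUTES = new Set<string>([
  "POST /",
  "POST /batch",
  "POST /track",
  "POST /vitals",
  "POST /errors",
  "GET /px.jpg",
]);

// Reduces a full URL to "METHOD /path", dropping origin, query and fragment.
function routeKey(method: string, url: string): string {
  const afterOrigin = url.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/?#]*/i, "");
  const path = afterOrigin.split(/[?#]/)[0] ?? "";
  return `${method.toUpperCase()} ${path === "" ? "/" : path}`;
}

class BasketRouteGuard {
  private readonly unknownHits: string[] = [];

  // Steps 1 and 2: observe the request and pick the default status to fulfill.
  record(method: string, url: string): number {
    const key = routeKey(method, url);
    if (KNOWN_ROUTES.has(key)) {
      return 200;
    }
    this.unknownHits.push(key);
    return 404;
  }

  // Step 3: called at teardown so an unknown route fails the test loudly.
  assertNoUnknownRoutes(): void {
    if (this.unknownHits.length > 0) {
      throw new Error(
        `Tracker hit unknown basket route(s): ${this.unknownHits.join(", ")}. ` +
          "Update BASKET_ROUTES if the route is genuinely new."
      );
    }
  }
}
```

In a real fixture, `record` would be driven from `page.on("request", ...)` and `assertNoUnknownRoutes` would run after the test body.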

Migrating the tests:
- All 16 spec files now `import { test } from "./test-utils"` instead of
  `@playwright/test` — auto-fixtures run for every test, no opt-in
- Deleted ~15 redundant `**/basket.databuddy.cc/*` catch-all beforeEach
  blocks (the fixture handles them)
- Refactored one test in audit-bugs.spec.ts that was using `page.route`
  as a request observer — now uses `page.on("request")` so it doesn't
  shadow the fixture's tracking
- Updated outgoing-links spec predicates from `req.url().includes("/outgoing")`
  to `e.type === "outgoing_link"` (matches the new payload from the plugin
  fix and is the body-shape pattern the hardened mocks expect)
- Marked edge-cases pixel-mode test as `test.fixme()` — the hardening
  immediately uncovered a separate real bug in pixel.ts (it routes
  `/batch`/`/track`/`/vitals`/`/errors` as GET image loads to paths basket
  only serves as POST). Tracked separately, fix is out of scope here.

Verification:
- 126 passed, 6 skipped, 0 failed on chromium
- Temporarily reverted the outgoing-links fix and re-ran the spec: 6 of
  13 outgoing-links tests fail with the exact "Tracker hit unknown basket
  route(s): POST /outgoing" message — proving the regression guard catches
  the bug class it's meant to catch
basket only serves pixel transport at GET /px.jpg — there is no
GET /batch, /track, /vitals, /errors. The pixel plugin only translated
the `/` endpoint to `/px.jpg` and left every other endpoint alone, so
batched screen views, custom track events, vitals and errors all fired
GET image loads to dead routes in pixel mode (`usePixel: true`). The
new strict basket route allowlist in tests/test-utils.ts caught this
the moment the hardening landed.

basket's parsePixelQuery (apps/basket/src/utils/pixel.ts) dispatches on
a `type` query param and only handles `track` / `outgoing_link`. So:

- Always route to `/px.jpg`, never the original endpoint
- Add a `pixelEventTypeFor()` mapping that translates `/`, `/batch`,
  `/track` → `track` and `/outgoing` → `outgoing_link`
- /vitals and /errors return null and silently no-op — they have no
  pixel equivalent and the pre-existing GET /vitals, GET /errors
  behaviour was already broken in production
- Set `type` as a query param so basket's parsePixelQuery dispatches
  correctly (overridable by an explicit `type` field on the data)
- Split batched arrays into one pixel call per event since /px.jpg
  only accepts a single event per GET. Pre-existing behaviour silently
  flattened the array indices into garbage params like `0[name]=...`
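
A minimal sketch of the routing rules listed above, assuming a bracket-style flattening for nested fields. The function `pixelUrlsFor`, the `EventData` type and the exact parameter encoding are hypothetical; the endpoint-to-type mapping and the one-call-per-event split are taken from this commit message.

```typescript
type PixelType = "track" | "outgoing_link";

// Endpoints with a pixel equivalent; /vitals and /errors are deliberately absent.
const PIXEL_TYPE_BY_ENDPOINT: Record<string, PixelType | undefined> = {
  "/": "track",
  "/batch": "track",
  "/track": "track",
  "/outgoing": "outgoing_link",
};

type Scalar = string | number | boolean;
interface EventData {
  [key: string]: Scalar | EventData | undefined;
}

// Flattens nested objects into query params such as properties[plan]=pro.
function flattenIntoParams(data: EventData, prefix: string, out: string[]): void {
  for (const key of Object.keys(data)) {
    const value = data[key];
    if (value === undefined) {
      continue;
    }
    const name = prefix === "" ? key : `${prefix}[${key}]`;
    if (typeof value === "object") {
      flattenIntoParams(value, name, out);
    } else {
      out.push(`${encodeURIComponent(name)}=${encodeURIComponent(String(value))}`);
    }
  }
}

// Returns one /px.jpg URL per event, or no URLs for endpoints without a pixel type.
function pixelUrlsFor(
  baseUrl: string,
  endpoint: string,
  data: EventData | EventData[]
): string[] {
  const defaultType = PIXEL_TYPE_BY_ENDPOINT[endpoint];
  if (defaultType === undefined) {
    return [];
  }
  const events = Array.isArray(data) ? data : [data];
  return events.map((event) => {
    // An explicit string `type` on the event overrides the endpoint default.
    const { type: explicitType, ...rest } = event;
    const eventType = typeof explicitType === "string" ? explicitType : defaultType;
    const params = [`type=${encodeURIComponent(eventType)}`];
    flattenIntoParams(rest, "", params);
    return `${baseUrl}/px.jpg?${params.join("&")}`;
  });
}
```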

Updated the pixel test to assert the request actually lands at /px.jpg
(not just "any GET"), which would have caught this bug if it had been
written that way originally.

Verified: 127 passed, 5 skipped (WebKit only), 0 failed. The pixel test
that was test.fixme'd in the previous commit now passes.
Net -112 LOC across 3 files (101 added, 213 removed).

test-utils.ts (259 → 170):
- Inline setupBasketMock into the auto-fixture; nothing imports it
  externally so the indirection added zero value
- Inline isKnownBasketRoute; replace the regex array with a Set<string>
  keyed on `${method} ${path}` — the routes are exact matches, regex
  was overkill
- Drop UnknownBasketRouteHit and BasketMock interfaces; use inline
  type literals (also unused externally)
- Hoist CORS_HEADERS to module scope, it's static
- Drop the try/catch around `new URL(req.url())` — Playwright requests
  always have valid URLs
- Collapse the duplicate 200/404 fulfill blocks into a single call
- Strip JSDoc walls that just restated function signatures

pixel.ts (143 → 121):
- Replace pixelEventTypeFor() with a PIXEL_TYPE_BY_ENDPOINT lookup
  table — three branches into a four-line const
- Hoist flatten() out of the closure into a module-level
  flattenIntoParams(); doesn't depend on any closure state
- Delete the dead `prefix === "" && key === "properties"` branch — it
  called the same code path as the else branch (when prefix is "",
  newKey === key, so both branches were identical)
- Strip the wall comment explaining the endpoint mapping; the const
  table speaks for itself

edge-cases.spec.ts:
- Drop the wall comment on the pixel test assertion

Verified: 127 passed, 5 skipped, 0 failed. tsc + ultracite clean.
Pulls in tokenlens@1.3.1 — a small registry of LLM model metadata
(context windows + per-token pricing) for the Vercel AI Gateway
catalog. Used by the upcoming agent route telemetry to compute
USD cost per chat turn from result.totalUsage.
Add a small summarizeAgentUsage helper that reads result.totalUsage
from the agent stream and emits raw token counts (input, output,
cache read/write, reasoning) plus best-effort USD cost via the
tokenlens Vercel AI Gateway catalog. Falls back to anthropic/claude-4-sonnet
when the exact gateway model id isn't in the catalog yet — directionally
correct, with cost_fallback flagged so analytics can correct estimated
rows later.

The agent route fires the telemetry as a parallel side effect after
result.consumeStream() — the totalUsage promise resolves once the stream
finishes, so awaiting it never blocks the response. Output goes to:
- mergeWideEvent (evlog wide-event coverage rule)
- trackAgentEvent("agent_activity", { action: "chat_usage", ... })

Failures are captured via captureError and never break the chat flow.

Also export modelNames from ai/config/models so the telemetry helper
can resolve the canonical id without re-declaring it.
- usePendingQueue / useChatLoading: remove JSDoc that just restated
  the function signature.
- useChatLoading JSDoc referenced "post-stream metadata sync" which was
  removed alongside the followup suggestions earlier this session —
  drop it.
- Trim the verbose 4-line "Token + cost telemetry. Fire-and-forget side
  effect..." comment in agent.ts to a single sentence; the helper name
  and Promise.resolve pattern carry the meaning already.
…ated features

Adds full feature coverage to autumn so every billable resource is declared
in one place. Server-side enforcement is wired up in follow-up tickets;
this commit is config-only.

New features:
- monitors  — already declared, now used in main plans (Free 1 / Hobby 5 /
  Pro 25 / Scale 50) plus expanded Pulse counts (Pulse Hobby 25 → was 10,
  Pulse Pro 150 → was 50). Cost driver is checks not monitors, so frequency
  is gated separately.
- uptime_minute_checks (boolean) — gates sub-5-minute granularity. Pro+ on
  main plans, all Pulse plans. Free/Hobby capped at 10-min granularity.
- status_pages (metered) — count cap. Hobby 1 / Pro 3 / Scale 5 / Pulse
  Hobby 3 / Pulse Pro 10.
- status_page_custom_branding (boolean) — paid feature, was free for all.
- status_page_custom_domain (boolean) — Pulse Pro only.
- alarms (metered) — count cap. Hobby 5 / Pro 50 / Scale unlimited /
  Pulse Hobby 25 / Pulse Pro unlimited.
- webhook_alert_destinations (boolean) — Pro+ and all Pulse plans. Free
  and Hobby get email-only.
- funnels, goals, feature_flags, target_groups (metered) — moved from
  shared/features.ts hardcoded limits into autumn so billing is the
  single source of truth.
- retention_analytics, error_tracking (boolean) — Hobby+, was previously
  only enforced in shared/features.ts.

Existing features wired up properly:
- seats — was declared but never attached to any plan. Now Free 2 /
  Hobby 5 / Pro 25 / Scale unlimited.
- rbac — was declared but never attached. Now Pro+ and Scale.
- sso — sso_plan addon now actually has the boolean item.

Plan structure preserved:
- Free $0, Hobby $9.99, Pro $49.99, Scale $99.99 (phasing out — no
  expansion), Pulse Hobby $14.99, Pulse Pro $49.99, SSO addon $100.
- All existing event tier overage pricing kept identical.
- Agent credits + rollover on Pro unchanged from prior commit.
Wire the agent route to autumn for credit enforcement now that the
config is pushed.

- Resolve billing customer id via getBillingCustomerId once user auth
  succeeds. Skipped for API-key flows (no clear billing owner).
- Pre-stream check on agent_credits returns 402 OUT_OF_CREDITS if the
  customer has no remaining balance and no overage allowance.
- Post-stream telemetry side effect now also fires autumn.track for the
  4 metered token features (input/output/cache_read/cache_write). Autumn
  auto-deducts credits via the credit_system creditSchema mapping.
- Track failures captured via captureError, never block the chat flow
  (Promise.allSettled around the 4 tracks).
- Zero-value tracks filtered out so we don't pollute autumn with no-op
  events.
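
The post-stream tracking pattern could look roughly like the sketch below. The feature ids, the `TrackFn` callback and the `onError` hook are stand-ins (assumptions) for the real autumn.track and captureError calls, whose signatures are not shown in this commit message.

```typescript
// Hypothetical feature ids for the four metered token features.
type TokenFeature = "input" | "output" | "cache_read" | "cache_write";
type TokenCounts = Record<TokenFeature, number>;
type TrackFn = (feature: TokenFeature, value: number) => Promise<void>;

// Drops zero-value entries so no-op events are never sent.
function nonZeroTracks(counts: TokenCounts): Array<[TokenFeature, number]> {
  const features: TokenFeature[] = ["input", "output", "cache_read", "cache_write"];
  const tracks: Array<[TokenFeature, number]> = [];
  for (const feature of features) {
    const value = counts[feature];
    if (value > 0) {
      tracks.push([feature, value]);
    }
  }
  return tracks;
}

// Fires every track in parallel; a rejected track is reported, never rethrown,
// so a billing hiccup cannot break the chat flow.
async function trackUsage(
  counts: TokenCounts,
  track: TrackFn,
  onError: (error: unknown) => void
): Promise<void> {
  const results = await Promise.allSettled(
    nonZeroTracks(counts).map(([feature, value]) => track(feature, value))
  );
  for (const result of results) {
    if (result.status === "rejected") {
      onError(result.reason);
    }
  }
}
```
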
Drops the seats metered feature entirely from autumn.config.ts (was
Free 2 / Hobby 5 / Pro 25 / Scale unlimited) and removes all seats
items from every plan.

Also sets PLAN_FEATURE_LIMITS.TEAM_ROLES to "unlimited" across all
tiers in packages/shared/src/types/features.ts so the dashboard UI
stops gating team size. Upgrade message updated to "Team members are
unlimited on all plans".

Seat-based pricing is off the table for Databuddy.
Add an AgentCreditBalance pill next to the chat history and new-chat
buttons in the agent header. Uses the existing useUsageFeature hook
from billing-provider (which proxies autumn's useCustomer). Shows:

- "234 / 5,000" format with tabular-nums
- Amber warning state at <20% remaining
- Destructive state at 0 remaining
- ∞ when the plan grants unlimited (Scale, comped orgs)
- Click → /billing
- Skeleton during first load
- Auto-refetches 1.5s after stream finish to pick up the post-turn
  autumn.track decrement
Latest day appeared empty on pulse and status pages for users west of
UTC because the client re-parsed backend UTC date strings through
localDayjs, shifting them back a day. Uptime time-series queries also
bucketed in server tz instead of user tz.
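
To illustrate why the same instant can land in different day buckets, here is a small sketch using the standard `Intl.DateTimeFormat` API. The helper name `dayKeyInTimeZone` is hypothetical and is not the code used by the dashboard or the uptime queries.

```typescript
// Formats an instant as YYYY-MM-DD in the given IANA time zone.
function dayKeyInTimeZone(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  const pick = (type: string): string =>
    parts.find((part) => part.type === type)?.value ?? "";
  return `${pick("year")}-${pick("month")}-${pick("day")}`;
}
```

A check at 02:00 UTC on 2026-04-08 belongs to the 2026-04-08 bucket in UTC but to the 2026-04-07 bucket for a user in America/Los_Angeles, so server and client have to agree on which zone defines the bucket.
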
Compose `coerceMcpInput` into the schema once with `z.preprocess`
instead of calling it inline at parse time. Drop redundant explanatory
comments now that the wiring is the obvious shape.
Pull in `atmn` so we can `atmn push` autumn config from this repo.
The companion `@useautumn-sdk.d.ts` file is regenerated on every
`atmn pull` and would otherwise drift constantly — gitignore it.
Old schema: 1 credit ≈ $0.005 of LLM compute. Free tier of 100 credits
covered ~13 first-turn chats and felt stingy. New schema: 1 credit ≈ $0.001,
free 500, hobby 2.5k, pro 25k (with rollover + rescaled overage tiers).
Same USD margin per plan, ~5x more perceived value.

Also fold the comment blocks into /* */ syntax — atmn pull was
shredding consecutive // lines into single-line paragraphs separated
by blank lines, and biome was happy to keep that mess.
Telemetry: bill fresh input tokens, not the full input count. Cache
read/write were tracked as their own metered features, so the prior
code was charging cached tokens twice (once at the input rate, once
at the cache rate). UsageTelemetry now exposes fresh_input_tokens
from inputTokenDetails.noCacheTokens and the route bills that.
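
The double charge can be seen in a small sketch. It assumes the total input count is the sum of fresh, cache-read and cache-write tokens; the `TurnUsage` shape and the function names are hypothetical, not the real UsageTelemetry type.

```typescript
interface TurnUsage {
  inputTokens: number; // total input, cached tokens included (assumption)
  noCacheTokens: number; // fresh input only
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

interface BilledTokens {
  input: number;
  cache_read: number;
  cache_write: number;
}

// Old behaviour: the full input count is billed alongside the cache features.
function billedBeforeFix(usage: TurnUsage): BilledTokens {
  return {
    input: usage.inputTokens,
    cache_read: usage.cacheReadTokens,
    cache_write: usage.cacheWriteTokens,
  };
}

// New behaviour: only fresh input is billed at the input rate.
function billedAfterFix(usage: TurnUsage): BilledTokens {
  return {
    input: usage.noCacheTokens,
    cache_read: usage.cacheReadTokens,
    cache_write: usage.cacheWriteTokens,
  };
}

// Sums the token quantities across the three metered input-side features.
function totalBilledTokens(billed: BilledTokens): number {
  return billed.input + billed.cache_read + billed.cache_write;
}
```

With illustrative numbers (300 fresh, 2,000 cache-read and 7,700 cache-write tokens, 10,000 in total), the old path bills 19,700 token units while the new path bills exactly 10,000.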

Tools: delete redundant tools that get_data already covers
(execute_query_builder, get_top_pages, get_link, search_links,
get_funnel/goal/annotation_by_id) and trim every remaining tool's
description + parameter describe() calls. The 151-builder list dump
moves out of the schema and into a discoverable error on unknown type.

Prompts: drop CLICKHOUSE_SCHEMA_DOCS from the always-on system prompt
and inline a minimal table hint into execute_sql_query (the only tool
that needs schema knowledge). Condense analytics rules, examples, and
chart guide to the essentials.

Dead code: triage and reflection agents were never invoked from
production — only createAgentConfig("analytics", ...) is called.
Delete the agent files, their prompts, the tool files only they
imported, and collapse createAgentConfig to a passthrough.

Result on a "hi" baseline turn: cache_write 17,322 → 7,683 tokens
(-56%), credits 13.27 → 6.38 (-52%) under the old schema. Stacked
with the credit rescale, free tier delivers ~200 turns vs ~13.

Thinking: user-selectable extended thinking via a new compact control
in the agent input footer. AgentThinking = "off" | "low" | "medium"
| "high" flows from a jotai atom (atomWithStorage so it persists)
through the chat transport into the route body, into AgentContext,
and into Anthropic's thinking.budgetTokens via providerOptions. The
route now drops temperature when thinking is enabled because Anthropic
rejects the combination. createToolLoopAgent actually threads
providerOptions through now — it was dead code before.
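
A sketch of how the thinking level might translate into call settings. The budget numbers and the `buildCallSettings` helper are assumptions (the commit does not state the actual token budgets); the `providerOptions.anthropic.thinking` shape follows the AI SDK's Anthropic provider option.

```typescript
type AgentThinking = "off" | "low" | "medium" | "high";

// Hypothetical budgets; the real values are not given in this commit message.
const THINKING_BUDGET_TOKENS: Record<Exclude<AgentThinking, "off">, number> = {
  low: 1024,
  medium: 4096,
  high: 16384,
};

interface CallSettings {
  temperature?: number;
  providerOptions?: {
    anthropic: { thinking: { type: "enabled"; budgetTokens: number } };
  };
}

// Temperature is only kept when thinking is off, because the two settings
// cannot be combined on Anthropic models.
function buildCallSettings(thinking: AgentThinking, temperature: number): CallSettings {
  if (thinking === "off") {
    return { temperature };
  }
  return {
    providerOptions: {
      anthropic: {
        thinking: { type: "enabled", budgetTokens: THINKING_BUDGET_TOKENS[thinking] },
      },
    },
  };
}
```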

Layout fix: agent input footer was shifting on every submit because
KeyboardHints returned null while loading and the Stop button got
inserted between the Thinking control and Send. Hints now swap to
"Generating…" in the same slot, and Send/Stop share one slot that
toggles by state. Footer is pixel-stable across both states.

Probe harness: new apps/api/scripts/agent-cost-probe.ts that runs the
real agent pipeline (same model, same tools, same providerOptions),
supports multi-turn chats and --thinking=off|low|medium|high, and
prints token + credit breakdowns under both schemas.

Cleanup: drop the double cast on apiKey.organizationId, drop the as
LanguageModel cast on models.analytics, drop unused textareaRef in
agent-input, fold get-data's truncation into the mapping pass, remove
a leftover formatLinkForDisplay helper that only existed for the
deleted search_links output.
Back to 1 credit ≈ $0.005 of compute (input 0.000_6, output 0.003,
cache read 0.000_06, cache write 0.000_75). Plan budgets stay at the
bigger 500/2500/25k numbers — users still get meaningful headroom,
we just no longer subsidize the per-token math on top of that.

Probe now prints runway against the three real plan sizes (free,
hobby, pro) instead of comparing against a theoretical proposed
schema.
Items and sections inside the sidebar nav memo were filtered by
getFlag() directly, with no isHydrated guard. The FlagsProvider reads
from localStorage synchronously on the first client render, so the
flag store was populated on the client but empty on the server. Any
flag-gated nav item (e.g. Home > Insights) was rendered on the client
and absent on the server, producing React error #418 hydration
mismatches on every page sharing the (main) layout: /websites,
/onboarding, /demo/*, /status/*, and nested website routes.

Mirror the existing filterCategoriesByFlags pattern: treat all flags
as off until isHydrated is true. Server and first client paint now
agree, and flag-gated items appear on the next render once hydration
completes.
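
The guard can be sketched as a pure filter. The `NavItem` shape and the function name `filterNavItemsByFlags` are hypothetical; only the rule "treat all flags as off until isHydrated is true" comes from this commit message.

```typescript
interface NavItem {
  name: string;
  flag?: string; // items without a flag are always shown
}

// Before hydration every flag counts as off, so the server render and the
// first client render produce the same list.
function filterNavItemsByFlags(
  items: NavItem[],
  getFlag: (key: string) => boolean,
  isHydrated: boolean
): NavItem[] {
  return items.filter(
    (item) => item.flag === undefined || (isHydrated && getFlag(item.flag))
  );
}
```

Flag-gated items then appear on the render after hydration completes, instead of differing between server and client markup.
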
Thinking picker moves from Popover + button grid to DropdownMenu
+ DropdownMenuRadioGroup so keyboard nav + selected state come from
the primitive. Header drops the inline favicon/domain and adds a
thin separator before the right-side action cluster.
* chore: ignore .superpowers brainstorm session dir

* feat(dashboard/insights): add compact variant to InsightCard

DAT-100

* perf(dashboard/insights): skip action-row memos in compact InsightCard

Avoids crypto.randomUUID() and prompt building on every render of
compact rows where the action row never mounts.

DAT-100

* refactor(dashboard/home): reuse shared InsightCard compact variant

Removes local InsightRow and buildDiagnosticPrompt duplication
in smart-insights-section.tsx.

DAT-100

* fix(dashboard/home): drop divide-y that doubled InsightCard borders

InsightCard carries its own border-b per row. Keeping divide-y on the
scroll container rendered visible double lines in the home sidebar.

DAT-100

* refactor(dashboard): unify insights hook via useInsightsFeed

useSmartInsights becomes a thin wrapper that slices to 20. Both pages
now share the same underlying queries and cache keys.

DAT-100

* chore(dashboard/insights): drop orphaned history cache write on clear

The history query key is no longer read after the hook unification
in the previous commit. Only the historyInfinite + ai keys need
optimistic empty payloads.

DAT-100

* test(api/insights): add failing test for aggregateKpiSeries

TDD first half of the org KPI aggregation. Implementation lands in
the next commit.

DAT-100

* feat(dashboard/agent): animated unicode spinners with rotating thinking phrases

Replace phosphor icon spinners in the agent chat with a UnicodeSpinner
component driving braille/CLI animations (dots, dots2, orbit, breathe,
snake, columns, helix, diagswipe, fillsweep, line). Frames pre-baked
from gunnargray-dev/unicode-animations (MIT).

Tool step, streaming indicator, reasoning trigger, and generating hint
each roll a random variant per turn via useRandomThinkingVariant, and
the streaming/reasoning labels cycle bunny-themed phrases via
useThinkingPhrase so every new turn feels distinct.

Font is forced to a system monospace stack because LT Superior Mono
lacks the Unicode Braille block and silently falls back to a
proportional face.

* feat(api/insights): add org KPI aggregation over ClickHouse

Pure aggregateKpiSeries passes 8 test cases covering sum, bounce
percent, visitor-weighted LCP, zero-fill, and signed change.
fetchOrgKpis runs a single ClickHouse query over analytics.events
joined per day per website, then fans out to 5 metrics.

Vitals LCP and errors are stubbed to 0 with follow-up TODOs.

DAT-100

* fix(dashboard/ui): restore secondary button variant

The secondary variant was accidentally removed in 6b31774, leaving
every consumer (Downgrade button in the pricing table, plus many other
Button variant="secondary" sites) rendering with no background. Restore
it using the same token stack (bg-secondary / hover:bg-secondary-brighter)
that was there before.

* feat(dashboard/nav): ungate insights and agent nav items

Remove the insights flag from the Insights home nav item and drop the
WIP tag + agent flag from the AI Agent website nav item so both surface
to all users. Also consolidates the phosphor icon imports into a single
line.

* fix(api/insights): tighten bounce denominator and session filter

- session_stats.sessions now uses countIf(page_count >= 1) to match
  the canonical per-site bounce formula in summary.ts
- page_agg now filters session_id != '' consistent with session_agg

DAT-100

* test(api/insights): add failing test for mergeDimensionRows

DAT-100

* feat(api/insights): add org dimension aggregation

mergeDimensionRows groups by key, sums current/previous, computes
signed change, sorts desc, and slices. fetchOrgDimensions runs a
two-window ClickHouse query over analytics.events for country/page/
referrer dimensions.

DAT-100
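
The merge described in the commit above (group by key, sum both windows, compute signed change, sort descending, slice) might look like the following sketch. The row shape, the function name `mergeDimensionRowsSketch`, the default limit of 5 and the handling of a zero previous value are assumptions, not the actual mergeDimensionRows implementation.

```typescript
interface DimensionRow {
  key: string;
  current: number;
  previous: number;
}

interface MergedDimension extends DimensionRow {
  changePercent: number;
}

// Signed percentage change; treating "from zero" as +100% is an assumption.
function signedChangePercent(current: number, previous: number): number {
  if (previous === 0) {
    return current === 0 ? 0 : 100;
  }
  return ((current - previous) / previous) * 100;
}

// Groups rows by key, sums both windows, then ranks by the current window.
function mergeDimensionRowsSketch(rows: DimensionRow[], limit = 5): MergedDimension[] {
  const totals = new Map<string, { current: number; previous: number }>();
  for (const row of rows) {
    const total = totals.get(row.key) ?? { current: 0, previous: 0 };
    total.current += row.current;
    total.previous += row.previous;
    totals.set(row.key, total);
  }
  const merged: MergedDimension[] = [];
  totals.forEach((total, key) => {
    merged.push({
      key,
      current: total.current,
      previous: total.previous,
      changePercent: signedChangePercent(total.current, total.previous),
    });
  });
  return merged.sort((a, b) => b.current - a.current).slice(0, limit);
}
```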

* fix(api/insights): normalize page and referrer dimension keys

Page kind now applies the same trimRight(path(path), '/') normalization
used by per-site top_pages queries. Referrer kind uses the canonical
domain-collapsing expression from expressions.ts. Without these fixes,
the org-wide dimension tile would show raw high-cardinality URL strings
instead of useful groupings.

Also makes mergeDimensionRows' limit parameter optional (default 5) per
the module spec.

DAT-100

* fix(api/insights): include homepage in top pages dimension

Previous fix commit dropped the root path from the ranking to avoid
homepage traffic dominating; that is actually the signal we want to
surface on the cockpit. The empty-string filter is enough.

DAT-100

* feat(dashboard/agent): cycle thinking effort on click instead of dropdown

Replace the thinking-effort dropdown with a single button that cycles
off → low → medium → high on each click. Description text moves to a
hover Tooltip so no info is lost. Drops CaretDownIcon and all
DropdownMenu* imports.

* polish(dashboard/agent): drop brain icon from resolved reasoning header

Once streaming finishes and the reasoning block collapses to "Thought
for Xs", the decorative brain icon added visual noise without conveying
anything the label didn't already say.

* chore(rpc/insights): move org-kpis and org-dimensions to rpc/lib

packages/rpc cannot import from apps/api, so the B4 RPC procedure
needs these helpers in the rpc package. Inlines the two Expressions
constants used by org-dimensions to avoid cross-package coupling on
apps/api's query builder internals.

DAT-100

* feat(rpc/insights): add orgSummary procedure with KPI cache

sessionProcedure + org guard + 5-min cacheable wrapper around
fetchOrgKpis. Lists org websites via Drizzle, returns zero-filled
summaries when the org has no sites. Site health is stubbed to
'healthy' with a follow-up TODO.

DAT-100

* feat(rpc/insights): add orgDimensions procedure

Mirrors orgSummary's shape: sessionProcedure + org guard + 5-min
cacheable. Calls fetchOrgDimensions in parallel for country / page /
referrer kinds. Limit defaults to 5, capped at 20.

Also extracts listOrgWebsites as a top-level helper so both
orgSummary and orgDimensions share the same DB query.

DAT-100

* chore(rpc): export org-kpis helpers for apps/api consumption

The new Elysia narrative route in apps/api needs fetchOrgKpis.

DAT-100

* feat(api/insights): add /v1/insights/org-narrative Elysia route

GET returns a 2-3 sentence AI-generated executive summary of an
organization's state over 7d/30d/90d, built from fetchOrgKpis +
top insights. 1h Redis cache per (org, range). Rate limited 30/h
per (org, user). Uses models.triage for the LLM call.

DAT-100

* feat(dashboard/insights): add org summary / dimensions / narrative hooks

Three TanStack Query hooks wrapping the new rpc procedures
(orgSummary, orgDimensions) and the Elysia narrative endpoint.
fetchInsightsOrgNarrative added to the shared insight-api lib.

DAT-100

* feat(dashboard/insights): add time range atom + selector

7d/30d/90d segmented selector persisted via atomWithStorage.
Unifies the TimeRange type across the org hooks.

DAT-100

* feat(dashboard/insights): add cockpit narrative component

2-3 sentence AI summary at the top of /insights. Wires
useOrgNarrative, renders loading / error / content states, and
shows a relative "Updated N minutes ago" footer.

DAT-100

* feat(dashboard/insights): add KPI row with 5 StatCards

Visitors, Sessions, Bounce, Errors, LCP p75 wired to useOrgSummary.
Reuses the existing StatCard component with sparklines and
invertTrend for the three "lower is better" metrics.

DAT-100

* fix(dashboard/agent): avoid duplicate React keys from empty message ids

Some streaming states produce messages with an empty id, and
renderMessagePart composes inner keys as `${messageId}-${partIndex}`.
When two empty-id messages exist, their parts collide on "-0", "-1",
etc. and React throws the duplicate-key warning. Hoist a
``messageKey = message.id || `msg-${index}` `` once per message and use
it for both the outer <Message key> and the inner part-key base.
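
A minimal sketch of the key derivation, outside React. The `ChatMessage` shape and the helper names are hypothetical; the fallback expression is the one quoted in this commit message.

```typescript
interface ChatMessage {
  id: string; // may be empty during some streaming states
  parts: string[];
}

// Falls back to the list position when the message has no id yet.
function messageKeyFor(message: ChatMessage, index: number): string {
  return message.id || `msg-${index}`;
}

// Builds the inner part keys from the hoisted message key.
function partKeysFor(messages: ChatMessage[]): string[] {
  const keys: string[] = [];
  messages.forEach((message, index) => {
    const messageKey = messageKeyFor(message, index);
    message.parts.forEach((_part, partIndex) => {
      keys.push(`${messageKey}-${partIndex}`);
    });
  });
  return keys;
}
```

Two empty-id messages with two parts each now yield `msg-0-0`, `msg-0-1`, `msg-1-0`, `msg-1-1` instead of colliding on `-0` and `-1`.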

* feat(dashboard/agent): restore slash command menu as prompt shortcuts

Bring back the /analyze, /sources, /funnel, /pages, /live, /anomalies,
/compare, /report command palette. When the input starts with "/", a
Popover anchored to the textarea width lists matching commands with
icon + title + description. Selecting one fills the textarea with the
command's full prompt template so the user can edit or send it.

Purely cosmetic shortcut: no tool hints, no toolChoice wiring, every
command still goes through the normal sendMessage path. Keyboard nav
(arrow keys / enter / tab / escape) is driven from the textarea handler
so focus never leaves the input. Clicking a row uses mousedown
preventDefault to avoid stealing focus either.

* feat(dashboard/insights): add site health grid

Per-site tiles with favicon, name, health badge (healthy /
attention / degraded), and a one-line reason from the most severe
insight. deriveSiteHealth in lib/site-health.ts is a pure function
operating on the existing insights feed.

DAT-100

* refactor(dashboard/insights): extract CockpitSignals from page content

Pure extraction to prepare for the cockpit compose step. Filter bar,
sort, dismiss/feedback state, and signals list now live in
cockpit-signals.tsx. InsightsPageContent keeps the header and
clear-all dialog wiring. No behavior change.

DAT-100

* feat(dashboard/insights): add generic DimensionTile

Shows a title, icon, and up to N rows with value + signed change
chip. Used by the cockpit for Countries / Pages / Referrers.

DAT-100

* feat(dashboard/insights): compose cockpit layout

Stacks narrative, KPI row, site health grid, signals feed, and
dimension tiles. TimeRangeSelector lives in the PageHeader right
slot. Empty-org state renders an add-website CTA in place of the
cockpit when no sites exist.

DAT-100

* polish(dashboard/insights): a11y, error states, phosphor imports

- aria-busy on cockpit scroll container
- error states + retry in SiteHealthGrid and DimensionTile
- aria-label on CockpitSignals section
- focus-visible rings on SegmentedControl, filter buttons, retry links
- phosphor icons use /dist/ssr subpath per project convention
- skeleton count in SiteHealthGrid matches real site count

DAT-100

* fix(billing): bump cache-write credit cost to match 1h cache rate

The agent pins ANTHROPIC_CACHE_1H on user turns, which bills cache
writes at $6/M, not the $3.75/M 5-minute rate the config assumed.
Under-charged by ~32% on cached turns.

Verified against 133 live Sonnet 4.6 gateway rows
(gateway-inference-requests.csv): predicted spend with the new rate
matches billed spend to within rounding ($5.8122 vs $5.81224).

Plan allocations unchanged — free still gets 500, hobby 2500, pro
25k. Only the per-token credit conversion is corrected.

- cache_write: 0.000_75 → 0.001_2 credits/token
- agent-cost-probe.ts mirror updated to match
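
The corrected rate can be reproduced from the per-token prices quoted above. This is a minimal sketch, assuming the at-cost value of $0.005 per credit that the later markup commit mentions; the helper and constant names are illustrative and are not taken from autumn.config.ts.

```typescript
// Assumption: one credit is worth $0.005 of LLM compute at cost
// (the figure quoted in the 20% markup commit message).
const USD_PER_CREDIT_AT_COST = 0.005;

// Hypothetical helper: converts a provider price in USD per million tokens
// into credits per token for a given dollar value of one credit.
function creditsPerToken(usdPerMillionTokens: number, usdPerCredit: number): number {
  const usdPerToken = usdPerMillionTokens / 1_000_000;
  return usdPerToken / usdPerCredit;
}

// 5-minute cache-write rate the config previously assumed ($3.75/M).
const fiveMinuteRate = creditsPerToken(3.75, USD_PER_CREDIT_AT_COST);

// 1-hour cache-write rate the agent is actually billed at ($6/M).
const oneHourRate = creditsPerToken(6, USD_PER_CREDIT_AT_COST);
```

Under these assumptions the two results come out at roughly 0.00075 and 0.0012 credits per token, matching the before and after values in the bullet above.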

* fix(insights): wire errors+LCP data, narrative fallback, redesign tiles

- fetchOrgKpis now queries analytics.error_spans and
  analytics.web_vitals_spans in parallel (removes stubs)
- org-narrative falls back to a deterministic summary when the LLM
  returns empty and switches model from triage to analytics for
  reliability
- SiteHealthGrid: 3-col grid, bigger tiles, shows total views + trend
  + live users + health pill with reason
- DimensionTile: table-style rows with progress-bar backgrounds,
  country tile renders flag emojis from country codes

DAT-100

* feat(rpc/insights): add social referrals query and reshape (DAT-100)

* fix(billing): apply 20% markup to agent credit rates

Every creditCost multiplied by 1.20 so 1 sold credit represents
$0.004167 of LLM compute instead of $0.005 at cost. Plan allocations
(free 500, hobby 2500, pro 25k, scale 25k) unchanged — users silently
burn credits ~20% faster, which moves hobby and pro toward breakeven
on cached turns and makes the pro overage tiers ($0.0012, $0.001,
$0.0008) no longer catastrophic on a $/$ basis.

- input:       0.0006   → 0.000_72
- output:      0.003    → 0.003_6
- cache_read:  0.000_06 → 0.000_072
- cache_write: 0.001_2  → 0.001_44

Narrative comment block dropped — the numbers speak for themselves.
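
The markup arithmetic can be checked with a short sketch. The type and function names are hypothetical; only the numeric rates and the 1.20 multiplier come from the commit message.

```typescript
type CreditRates = {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
};

const MARKUP = 1.2;

// Pre-markup credits per token, as listed on the left-hand side above.
const atCost: CreditRates = {
  input: 0.0006,
  output: 0.003,
  cacheRead: 0.00006,
  cacheWrite: 0.0012,
};

// Multiplies every per-token credit cost by the same markup factor.
function applyMarkup(rates: CreditRates, markup: number): CreditRates {
  return {
    input: rates.input * markup,
    output: rates.output * markup,
    cacheRead: rates.cacheRead * markup,
    cacheWrite: rates.cacheWrite * markup,
  };
}

const markedUp = applyMarkup(atCost, MARKUP);

// Charging 20% more credits for the same compute means each sold credit
// covers 1/1.2 of its former dollar value: 0.005 / 1.2, about 0.004167.
const usdPerCreditAfterMarkup = 0.005 / MARKUP;
```

Because the results are floating-point products, comparing them against the listed values with a small tolerance is safer than strict equality.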

* feat(billing): meter web_search tool usage as agent credits

Perplexity sonar-pro calls from the web_search tool were completely
invisible to the credit system — a single search costs us ~$0.0145
(verified from gateway-inference-requests.csv) but burned zero user
credits. Spamming /search via the agent was effectively free.

Add a new agent_web_search_calls metered feature charged at 5 credits
per call (~$0.021 at post-markup $0.00417/credit, covers the observed
cost with headroom for variance). billingCustomerId now flows through
AgentContext → AppContext → experimental_context so the tool can
autumn.track on success. Fire-and-forget with error logging so a
tracking failure never blocks the agent's response.

Title generation (gpt-oss-120b, ~$0.0002/call via cerebras) stays
untracked — the code complexity isn't worth the pennies.
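
The fire-and-forget pattern described above might look roughly like the sketch below. The `track` callback stands in for the Autumn client call, and its argument shape is an assumption, as are the `meterWebSearch` name and the logger signature; only the feature id and the log message come from this PR.

```typescript
// Assumed shape of the tracking call; the real Autumn client may differ.
type TrackFn = (args: {
  customerId: string;
  featureId: string;
  value: number;
}) => Promise<void>;

type LogErrorFn = (message: string, meta: Record<string, unknown>) => void;

// Hypothetical helper: records one web_search call without ever blocking
// or failing the caller. The per-call credit price (5 credits) is assumed
// to live in the billing config, so only a count of 1 is sent here.
function meterWebSearch(
  track: TrackFn,
  billingCustomerId: string | undefined,
  logError: LogErrorFn
): void {
  if (!billingCustomerId) {
    return; // nothing to bill against
  }
  // Deliberately not awaited. Starting from a resolved promise also routes
  // synchronous throws from `track` into the same catch handler.
  void Promise.resolve()
    .then(() =>
      track({
        customerId: billingCustomerId,
        featureId: "agent_web_search_calls",
        value: 1,
      })
    )
    .catch((trackError: unknown) => {
      logError("Failed to track web search usage", {
        error: trackError instanceof Error ? trackError.message : String(trackError),
      });
    });
}
```

The trade-off is the one the commit message names: a billing outage cannot block the agent's response, at the cost of searches going unmetered while tracking fails.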

* feat(rpc/insights): expose orgSocialReferrals procedure (DAT-100)

* feat(dashboard/insights): add useSocialReferrals hook (DAT-100)

* feat(dashboard/insights): add SocialPlatformRow component (DAT-100)

* feat(dashboard/insights): add SocialReferrals section (DAT-100)

* feat(dashboard/insights): mount SocialReferrals on insights page (DAT-100)

* refactor(insights): rewrite cockpit to use existing query pipeline

Swaps the custom parallel stack for the same useBatchDynamicQuery +
StatCard + DataTable primitives the per-site overview tab uses.

- insights-page-content.tsx: fetches summary_metrics, events_by_date,
  top_pages, top_referrers, country via useBatchDynamicQuery scoped
  to a focus site (picker in the header for multi-site orgs).
  Renders StatCards with sparklines and DataTables with existing
  createPageColumns / createReferrerColumns / createGeoColumns.
- routers/insights.ts: remove orgSummary and orgDimensions RPC
  procedures (and their helpers). orgSocialReferrals stays.
- index.ts: drop fetchOrgKpis / aggregateKpiSeries exports.
- routes/insights.ts: narrative route no longer depends on
  fetchOrgKpis. It summarises the top 5 stored insights directly
  and falls back to a deterministic headline when the LLM returns
  empty.
- lib/insight-api.ts: OrgNarrativeResponse drops deltas field.

DAT-100

* revert(insights): remove social referrals feature (DAT-100)

Reverts the SocialReferrals section and its backend plumbing. The
design direction for /insights is moving toward the overview-tab
card layout, where a bespoke collapsible social-platform section
does not fit.

- Delete SocialReferrals + SocialPlatformRow components
- Delete useSocialReferrals hook
- Delete social-referrals backend lib + tests
- Drop orgSocialReferrals RPC procedure
- Unmount from insights-page-content

* refactor(insights): unify cockpit layout with overview-tab pattern (DAT-100)

The cockpit children (CockpitNarrative, CockpitSignals, InsightCard)
were designed for a full-width page with no outer padding, each
section owning its own 'border-b px-4 sm:px-6'. After the recent
rewrite the page uses the overview-tab pattern (outer padded
container + card-shaped children), so the cockpit children were
double-padded and stranded borders floated in the middle of the
flex container.

- Outer container: 'flex flex-col gap-3 p-3 sm:gap-4 sm:p-4 lg:p-6'
  -> 'space-y-3 p-4 sm:space-y-4' (matches overview-tab)
- CockpitNarrative: bordered section w/ gradient -> card with
  'overflow-hidden rounded border bg-card' + TableToolbar-style
  header (sparkle + title + 'Updated ago' meta on the right)
- CockpitSignals: wrap in card shell + add header row matching
  SmartInsightsSection ('Signals' + visible/total count). Drop
  sm:px-6 from filter bar and InsightsFetchStatusRow so content
  aligns with the card insets.
- InsightCard: drop sm:px-6 so insight rows align with the new card
  header padding. Add 'last:border-b-0' so the final row visually
  merges with the card's outer border instead of doubling it.

* Update navigation-config.tsx
Reconciles staging with the squashed Release #383 on main so the next
staging → main PR starts from a clean base. All 17 conflicts (13 content,
2 add/add, 2 modify/delete) resolved by taking staging's version — main
had no post-release edits, only the stale squash snapshot.
@coderabbitai
Contributor

coderabbitai bot commented Apr 8, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3f054617-1225-4fe4-a069-e4e69cd302de

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@vercel

vercel bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| dashboard (staging) | Skipped | Skipped | Apr 8, 2026 6:24pm |
| documentation (staging) | Skipped | Skipped | Apr 8, 2026 6:24pm |

@izadoesdev izadoesdev merged commit 06045da into main Apr 8, 2026
18 of 19 checks passed
Member Author

@izadoesdev izadoesdev left a comment


Automated First-Pass Review

What this PR does: Release PR merging 150 staging commits into main. Major features include the new /insights org cockpit with AI-generated narratives, an agent billing overhaul (cost-aware streaming, Autumn credit enforcement, web_search metering with 20% markup), tracker reliability fixes, and a significant frontend refactor that consolidates the insights UI into reusable components (InsightCard with compact/full variants, CockpitSignals, CockpitNarrative).

PR description: Thorough and well-structured. The squash-merge warning is especially helpful given the Release #383 conflict history.


Issues found

🟡 Narrative DB query ignores range parameter — The /org-narrative endpoint accepts a 7d/30d/90d range and tells the LLM to summarize "the last X," but the underlying query fetches top-5 insights by priority with no date filter. A "this week" narrative could surface month-old insights. See inline comment.

🟡 Cost probe lost its cross-reference comment — The agent-cost-probe.ts script had a comment linking it to autumn.config.ts. Removing it makes it easier for the two configs to drift out of sync. See inline comment.

🔵 Fire-and-forget billing tracking — Web search metering .catch()es silently. Fine for availability, but adding a tracing breadcrumb would help detect billing drift. See inline comment.

🔵 Unsafe type casts in cockpit page: insights-page-content.tsx uses several as unknown as ColumnDef<...> casts. Not urgent, but worth cleaning up to catch type regressions early.

Flags

High-impact: Billing & metering changes — Agent credit schema, web_search metering, Autumn enforcement. Verify production autumn.config.ts matches the updated rates here.

High-impact: Analytics ingestion path — The narrative endpoint runs LLM inference in the request path (cached, rate-limited, with deterministic fallback — all good). Just confirm the cache key correctly incorporates both organizationId and range.


Overall: Needs manual review

The architecture is solid — good use of caching, rate limiting, auth checks, and graceful LLM fallbacks. The frontend refactor is clean DRY work. The narrative date-range issue is the main thing worth fixing before merge. Everything else is minor. Reminder: merge commit, don't squash.

.select({
title: analyticsInsights.title,
description: analyticsInsights.description,
severity: analyticsInsights.severity,

🟡 Warning: DB query doesn't filter by range, so the narrative may be misleading.

generateNarrativeCached receives the range parameter (7d/30d/90d) and passes it to the LLM prompt ("summarize the last 7d"), but the actual DB query pulls the top 5 insights by priority for the org with no date filter. This means a "this week" narrative could surface stale insights from months ago.

Consider adding a createdAt filter to scope the query to the requested range:

const cutoff = dayjs().subtract(
  range === "7d" ? 7 : range === "30d" ? 30 : 90,
  "day"
).toDate();

// ...
.where(
  and(
    eq(analyticsInsights.organizationId, organizationId),
    gte(analyticsInsights.createdAt, cutoff)
  )
)
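
The cutoff computation can also be written as a lookup table rather than a nested ternary. This is a dependency-free sketch, not the project's code: `rangeToCutoff` and `RANGE_DAYS` are hypothetical names, and it subtracts fixed 24-hour days, whereas dayjs's `subtract(n, "day")` works on local calendar days and can differ by an hour across DST changes.

```typescript
type NarrativeRange = "7d" | "30d" | "90d";

// Number of days covered by each selectable range.
const RANGE_DAYS: Record<NarrativeRange, number> = {
  "7d": 7,
  "30d": 30,
  "90d": 90,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the earliest createdAt timestamp that still falls inside the range.
// `now` is injectable so the function stays pure and easy to test.
function rangeToCutoff(range: NarrativeRange, now: Date = new Date()): Date {
  return new Date(now.getTime() - RANGE_DAYS[range] * MS_PER_DAY);
}

// Example: with a fixed clock, the 7d cutoff is exactly one week earlier.
const exampleNow = new Date("2026-04-08T00:00:00.000Z");
const weekCutoff = rangeToCutoff("7d", exampleNow); // 2026-04-01T00:00:00.000Z
```

The returned Date could then be passed to the `gte(analyticsInsights.createdAt, cutoff)` filter suggested above.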

logger.error("Failed to track web search usage", {
error:
trackError instanceof Error
? trackError.message

🔵 Suggestion: Fire-and-forget billing could silently under-count usage.

If the Autumn billing service has a prolonged outage, web searches continue unmetered since .catch() only logs. This is probably the right tradeoff for availability, but consider adding a counter/metric for failed tracking calls so you can detect drift between actual usage and billed usage. Even a simple mergeWideEvent({ web_search_billing_track_failed: true }) would help.

output: 0.003,
cacheRead: 0.000_06,
cacheWrite: 0.000_75,
input: 0.000_72,

🟡 Warning: CURRENT_SCHEMA comment referencing autumn.config.ts was removed.

The old code had a helpful comment (// Matches creditSchema in apps/dashboard/autumn.config.ts) that tied this probe script to the production pricing config. With the 20% markup applied here, it'd be worth keeping a similar breadcrumb so future edits stay in sync. If someone updates autumn.config.ts later but forgets this file (or vice versa), the probe results will be wrong.
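
Beyond a comment, one way to catch drift between the two files is a small check that compares the rate tables. This sketch is purely illustrative: `findSchemaDrift` and the two object literals are hypothetical, and in practice the values would be imported from autumn.config.ts and the probe script rather than duplicated.

```typescript
type CreditSchema = Record<string, number>;

// Returns the keys whose rates differ between the two schemas, including
// keys that exist in only one of them.
function findSchemaDrift(
  config: CreditSchema,
  probe: CreditSchema,
  tolerance = 1e-12
): string[] {
  const keys = Object.keys(config).concat(Object.keys(probe));
  const uniqueKeys = keys.filter((key, index) => keys.indexOf(key) === index);
  return uniqueKeys.filter((key) => {
    const a = config[key];
    const b = probe[key];
    if (a === undefined || b === undefined) {
      return true; // present on one side only
    }
    return Math.abs(a - b) > tolerance;
  });
}

// Post-markup rates from this PR, used here only as sample data.
const configRates: CreditSchema = {
  input: 0.00072,
  output: 0.0036,
  cacheRead: 0.000072,
  cacheWrite: 0.00144,
};

// A probe copy that missed the markup on cacheWrite.
const staleProbeRates: CreditSchema = { ...configRates, cacheWrite: 0.0012 };

const drift = findSchemaDrift(configRates, staleProbeRates); // ["cacheWrite"]
```

Run as part of a test suite, a non-empty result would flag that one file was updated without the other.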

@greptile-apps
Contributor

greptile-apps bot commented Apr 8, 2026

Greptile Summary

This is a large staging→main release covering the new /insights org cockpit (narrative + signals panels, stat cards, focused-site picker), agent overhaul (slash-command menu, thinking-effort cycle button, unicode spinner), billing updates (20% credit markup, agent_web_search_calls metered via Autumn), and tracker/security fixes.

  • P1 — generateNarrativeCached ignores range in the DB query: the WHERE clause only filters by organizationId; no createdAt ≥ cutoff guard is applied. All three range selections (7d / 30d / 90d) feed the LLM the same all-time top-5 signals and only differ in the time-label injected into the prompt, producing misleading "this week" vs "this quarter" narratives from identical data.

Confidence Score: 4/5

Safe to merge after fixing the narrative date-range filter; all other changes are clean.

One P1 correctness bug in generateNarrativeCached — the range selector silently has no effect on which insights the LLM summarises, producing misleading output for the 30d and 90d views. All other changes (billing, agent UX, cockpit components) are well-structured with no additional blockers.

apps/api/src/routes/insights.ts — generateNarrativeCached DB query missing range-based createdAt filter.

Vulnerabilities

No security concerns identified. The new /org-narrative endpoint validates org membership before serving data, applies per-user rate limiting, and caches results server-side. Web-search billing tracking uses fire-and-forget with no sensitive data leakage.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/api/src/routes/insights.ts | New /org-narrative endpoint with rate-limiting and caching; P1 bug: generateNarrativeCached DB query never filters by the range parameter, so narratives for all three ranges are built from the same all-time top-priority data. |
| apps/api/src/ai/tools/web-search.ts | Adds Autumn metered tracking for agent_web_search_calls after successful searches; fire-and-forget pattern with error logging is appropriate. |
| apps/dashboard/autumn.config.ts | Credit costs updated to 20% markup; agent_web_search_calls metered feature added at 5 credits per call; cost-probe script updated to match. |
| apps/dashboard/app/(main)/insights/_components/cockpit-narrative.tsx | New component rendering the org-level AI narrative; P2 issue: aria-label is hardcoded "Weekly summary" regardless of selected range. |
| apps/dashboard/app/(main)/insights/_components/cockpit-signals.tsx | New signals panel extracted from the old page content; filter/sort/dismiss/pagination logic is clean and well-separated from the parent layout. |
| apps/dashboard/app/(main)/insights/_components/insights-page-content.tsx | Major redesign into a cockpit layout with stat cards, mini-charts, and focused-site selector; P2: queries useMemo dep only on granularity (always "daily"), not the full dateRange. |
| apps/dashboard/app/(main)/websites/[id]/agent/_components/agent-input.tsx | Replaces thinking dropdown with a single-click cycle button; adds slash-command menu wired to keyboard navigation; layout unchanged. |
| apps/dashboard/lib/insight-api.ts | Adds fetchInsightsOrgNarrative fetch helper and OrgNarrativeResponse discriminated union; correct use of credentials: "include" and AbortSignal.timeout. |

Sequence Diagram

sequenceDiagram
    participant UI as Dashboard (CockpitNarrative)
    participant Hook as useOrgNarrative
    participant API as GET /v1/insights/org-narrative
    participant RL as Rate Limiter
    participant Cache as Redis Cache
    participant DB as PostgreSQL (analyticsInsights)
    participant LLM as LLM (generateText)

    UI->>Hook: range changes (7d/30d/90d)
    Hook->>API: fetch ?organizationId=X&range=7d
    API->>API: verify org membership
    API->>RL: rateLimit(org:user, 30/hr)
    RL-->>API: allowed
    API->>Cache: get insights-narrative:X:7d
    alt Cache HIT
        Cache-->>API: cached narrative
    else Cache MISS
        API->>DB: SELECT top-5 by priority WHERE orgId=X
        Note over DB: Missing gte(createdAt, cutoff)
        DB-->>API: same all-time top-5 regardless of range
        API->>LLM: prompt with time label + insights
        LLM-->>API: narrative text
        API->>Cache: set TTL 1h
    end
    API-->>Hook: success narrative generatedAt
    Hook-->>UI: render narrative

Reviews (1): Last reviewed commit: "chore: merge main into staging" | Re-trigger Greptile

Comment on lines +966 to +979

const generateNarrativeCached = cacheable(
async function generateNarrativeCached(
organizationId: string,
range: "7d" | "30d" | "90d"
): Promise<{ narrative: string }> {
const topInsights = await db
.select({
title: analyticsInsights.title,
description: analyticsInsights.description,
severity: analyticsInsights.severity,
changePercent: analyticsInsights.changePercent,
websiteName: websites.name,
})

P1 Range parameter not applied to DB query

generateNarrativeCached accepts range and injects it into the LLM prompt's time context ("over the last 7d/30d/90d"), but the Drizzle query has no gte(analyticsInsights.createdAt, cutoff) filter — it always fetches the top-5 by priority from all-time data. Every range selection returns the same underlying insights, producing narratives that say "this quarter" vs "this week" about the exact same signals. Every other date-scoped query in this file (e.g. the main feed at line 862, the recency check at line 233) adds a cutoff guard; this one is missing it.

// Compute the cutoff before building the query:
const days = range === "7d" ? 7 : range === "30d" ? 30 : 90;
const cutoff = dayjs().subtract(days, "day").toDate();

// ...then, in the query chain before the `.limit(...)` call:
.where(
  and(
    eq(analyticsInsights.organizationId, organizationId),
    gte(analyticsInsights.createdAt, cutoff),
  )
)

return (
<section
aria-label="Weekly summary"
className="overflow-hidden rounded border bg-card"

P2 aria-label hardcoded to "Weekly summary"

The section's accessible label is "Weekly summary" regardless of the selected range — users on a 30d or 90d view will hear "Weekly summary" from screen readers. Use the same rangeLabel helper already defined in this file.
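
A range-aware label could be derived along these lines. The review only says a rangeLabel helper exists in the file, so the signature, the label wording, and the `summaryAriaLabel` wrapper below are all assumptions made for illustration.

```typescript
type NarrativeRange = "7d" | "30d" | "90d";

// Hypothetical stand-in for the file's rangeLabel helper; the real helper's
// signature and output text are not shown in this PR.
const RANGE_LABELS: Record<NarrativeRange, string> = {
  "7d": "last 7 days",
  "30d": "last 30 days",
  "90d": "last 90 days",
};

function rangeLabel(range: NarrativeRange): string {
  return RANGE_LABELS[range];
}

// Builds the accessible name for the narrative section from the selected range.
function summaryAriaLabel(range: NarrativeRange): string {
  return `Summary for the ${rangeLabel(range)}`;
}
```

The section would then pass `summaryAriaLabel(range)` to aria-label instead of the fixed "Weekly summary" string.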

Comment on lines +183 to +213
if (focusSiteId && websites.some((w) => w.id === focusSiteId)) {
return focusSiteId;
}
return websites[0].id;
}, [focusSiteId, websites]);

const dateRange = useMemo(() => rangeToDateRange(range), [range]);

const queries = useMemo(
() => [
{
id: "cockpit-summary",
parameters: ["summary_metrics", "events_by_date"],
limit: 100,
granularity: dateRange.granularity,
},
{
id: "cockpit-pages",
parameters: ["top_pages"],
limit: 8,
granularity: dateRange.granularity,
},
{
id: "cockpit-referrers",
parameters: ["top_referrers"],
limit: 8,
granularity: dateRange.granularity,
},
{
id: "cockpit-geo",
parameters: ["country"],

P2 queries memo only tracks granularity, not the full date range

The queries array is memoized on dateRange.granularity alone — but granularity is always "daily" regardless of the selected range. Adding dateRange to the deps avoids the subtle mismatch.
