Skip to content

[P3] BaseDataService: sanctioned extension point for read-only QueryClient observability #8552

@MajorLift

Description

@MajorLift

Parent Epic: None (standalone)
File: packages/base-data-service/src/BaseDataService.ts
Size: S | Hours: 2–4


Problem

Option A's proxy model (TanStack Query QueryClient in the background process, proxy queryFn in the UI) intentionally hides the cache, dedupe, retry, fetch, and sync chain from UI-perceived latency tracing. The distributed-tracing wrappers in metamask-extension#39891 give us the call boundaries (useQuery → submitRequestToBackground → messenger → BaseDataService action → response). Inside the opaque region, every interesting decision is invisible:

  • Cache hit vs. miss. Most data service calls don't make a network request. Auto-instrumented fetch spans see zero. The query looks "slow" with no attribution.
  • In-flight dedupe. N concurrent observers collapse to one fetch. The fetch span doesn't know N.
  • Cockatiel retry sequences. A single logical "fetch this" can produce 0, 1, or many HTTP requests. Auto-spans see them as unrelated. Backoff intervals are invisible.
  • Circuit breaker state. When open, no fetch happens. The query fails or returns cached data with no auto-span at all.
  • Cache sync to UI. setData → cacheUpdated event → Port.postMessage → UI hydrate is the post-fetch synchronization cost. None of it surfaces as a span.
  • Trigger attribution. Poll interval, user invalidation, observer subscribe, and WebSocket invalidation are all identical from the network's view.

Three Option A tradeoffs documented in decisions#131 currently depend on observability that doesn't exist:

  • Per-screen staleTime. Validating that the service-side workarounds hold in production requires distinguishing fetches from cache serves.
  • refetchOnMount / refetchOnWindowFocus collapse. Measuring whether concurrent queries amplify messenger volume or are absorbed by background staleTime.
  • Split-process staleness indirection. Correlating UI proxy calls with background fetches.

Without this visibility, the tradeoffs are theoretical. With it, they're verifiable.


Solution

Add a sanctioned extension point to BaseDataService that exposes the internal QueryClient for read-only observability subscriptions. Two variants; preference is maintainer's call:

  1. Public getter. get queryClient(): QueryClient
  2. Constructor callback. onCreate?: (qc: QueryClient) => void

Either is ~3 lines in core. Downstream clients attach their own instrumentation externally (e.g. qc.getQueryCache().subscribe(emitSpan) plus Cockatiel onRetry / onBreak handlers in extension's Sentry init, ~50 lines). The full span chain becomes stitchable end-to-end:

useQuery → submitRequestToBackground (span) → messenger (span) →
  [BaseDataService.fetchQuery — cache miss, observerAdded → fetching →
   Cockatiel attempt 1 → fetch (auto, child of rpc.handler) → 503 →
   backoff 200ms → fetch (auto) → 200 → setData → success] →
cacheUpdated → port → UI hydrate

Why this is orthogonal to #8530

The ongoing discussion about blocking QueryClient write APIs in the UI (setQueryDefaults, setQueryData, ensureQueryData) is on a different surface:

  • Read-only. queryCache.subscribe(callback) observes lifecycle events (added / removed, updated for pending → fetching → success/error/idle, observerAdded / observerRemoved as the dedupe signal, observerResultsUpdated post-structural-sharing). It cannot mutate state, override queryFn, change defaults, or do any of the things Block QueryClient API surface in UI #8530 is concerned with. Same surface TanStack Query DevTools uses.
  • Background-only. Instrumentation runs in the service worker where the QueryClient lives. UI consumers are never exposed to the client or to the cache subscription.
  • No overlap with the #8531 carve-outs. Optimistic updates (setQueryData) and route loaders (ensureQueryData) are write-paths. Cache subscription is read-only and doesn't touch them.

Fallback

If this is rejected, decisions#131 already includes an "Observability" item in Option A's Cons flagging the visibility gap, so future reviewers don't rediscover it. No code change required in that case.


Acceptance Criteria

  • BaseDataService exposes either a public queryClient getter or an onCreate constructor hook (maintainer's choice).
  • Downstream queryCache.subscribe(...) wiring composes cleanly from a client-side Sentry init file. A prototype in the extension repo demonstrates the full stack trace from useQuery through retry through success.
  • No new UI-facing surface.
  • No change to OmitKeyof type erasure.
  • No change to messenger contract.
  • No change to proxy queryFn semantics.

Labels

area-data-service, area-observability


Dependencies

Related: #8530, #8531 (orthogonal — same epic scope but different surface; not blocking).
Depends on: none.
Dependents: none.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions