You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Parent Epic: None (standalone) File:packages/base-data-service/src/BaseDataService.ts Size: S | Hours: 2–4
Problem
Option A's proxy model (TanStack Query QueryClient in the background process, proxy queryFn in the UI) intentionally hides the cache, dedupe, retry, fetch, and sync chain from UI-perceived latency tracing. The distributed-tracing wrappers in metamask-extension#39891 give us the call boundaries (useQuery → submitRequestToBackground → messenger → BaseDataService action → response). Inside the opaque region, every interesting decision is invisible:
Cache hit vs. miss. Most data service calls don't make a network request. Auto-instrumented fetch spans see zero. The query looks "slow" with no attribution.
In-flight dedupe. N concurrent observers collapse to one fetch. The fetch span doesn't know N.
Cockatiel retry sequences. A single logical "fetch this" can produce 0, 1, or many HTTP requests. Auto-spans see them as unrelated. Backoff intervals are invisible.
Circuit breaker state. When open, no fetch happens. The query fails or returns cached data with no auto-span at all.
Cache sync to UI.setData → cacheUpdated event → Port.postMessage → UI hydrate is the post-fetch synchronization cost. None of it surfaces as a span.
Trigger attribution. Poll interval, user invalidation, observer subscribe, and WebSocket invalidation are all identical from the network's view.
Three Option A tradeoffs documented in decisions#131 currently depend on observability that doesn't exist:
Per-screen staleTime. Validating that the service-side workarounds hold in production requires distinguishing fetches from cache serves.
refetchOnMount / refetchOnWindowFocus collapse. Measuring whether concurrent queries amplify messenger volume or are absorbed by background staleTime.
Split-process staleness indirection. Correlating UI proxy calls with background fetches.
Without this visibility, the tradeoffs are theoretical. With it, they're verifiable.
Solution
Add a sanctioned extension point to BaseDataService that exposes the internal QueryClient for read-only observability subscriptions. Two variants; preference is maintainer's call:
Either is ~3 lines in core. Downstream clients attach their own instrumentation externally (e.g. qc.getQueryCache().subscribe(emitSpan) plus Cockatiel onRetry / onBreak handlers in extension's Sentry init, ~50 lines). The full span chain becomes stitchable end-to-end:
The ongoing discussion about blocking QueryClient write APIs in the UI (setQueryDefaults, setQueryData, ensureQueryData) is on a different surface:
Read-only.queryCache.subscribe(callback) observes lifecycle events (added / removed, updated for pending → fetching → success/error/idle, observerAdded / observerRemoved as the dedupe signal, observerResultsUpdated post-structural-sharing). It cannot mutate state, override queryFn, change defaults, or do any of the things Block QueryClient API surface in UI #8530 is concerned with. Same surface TanStack Query DevTools uses.
Background-only. Instrumentation runs in the service worker where the QueryClient lives. UI consumers are never exposed to the client or to the cache subscription.
No overlap with the #8531 carve-outs. Optimistic updates (setQueryData) and route loaders (ensureQueryData) are write-paths. Cache subscription is read-only and doesn't touch them.
Fallback
If this is rejected, decisions#131 already includes an "Observability" item in Option A's Cons flagging the visibility gap, so future reviewers don't rediscover it. No code change required in that case.
Acceptance Criteria
BaseDataService exposes either a public queryClient getter or an onCreate constructor hook (maintainer's choice).
Downstream queryCache.subscribe(...) wiring composes cleanly from a client-side Sentry init file. A prototype in the extension repo demonstrates the full stack trace from useQuery through retry through success.
No new UI-facing surface.
No change to OmitKeyof type erasure.
No change to messenger contract.
No change to proxy queryFn semantics.
Labels
area-data-service, area-observability
Dependencies
Related:#8530, #8531 (orthogonal — same epic scope but different surface; not blocking). Depends on: none. Dependents: none.
Parent Epic: None (standalone)
File:
packages/base-data-service/src/BaseDataService.tsSize: S | Hours: 2–4
Problem
Option A's proxy model (TanStack Query
QueryClientin the background process, proxyqueryFnin the UI) intentionally hides the cache, dedupe, retry, fetch, and sync chain from UI-perceived latency tracing. The distributed-tracing wrappers in metamask-extension#39891 give us the call boundaries (useQuery → submitRequestToBackground → messenger → BaseDataService action → response). Inside the opaque region, every interesting decision is invisible:setData → cacheUpdated event → Port.postMessage → UI hydrateis the post-fetch synchronization cost. None of it surfaces as a span.Three Option A tradeoffs documented in decisions#131 currently depend on observability that doesn't exist:
staleTime. Validating that the service-side workarounds hold in production requires distinguishing fetches from cache serves.refetchOnMount/refetchOnWindowFocuscollapse. Measuring whether concurrent queries amplify messenger volume or are absorbed by backgroundstaleTime.Without this visibility, the tradeoffs are theoretical. With it, they're verifiable.
Solution
Add a sanctioned extension point to
BaseDataServicethat exposes the internalQueryClientfor read-only observability subscriptions. Two variants; preference is maintainer's call:get queryClient(): QueryClientonCreate?: (qc: QueryClient) => voidEither is ~3 lines in core. Downstream clients attach their own instrumentation externally (e.g.
qc.getQueryCache().subscribe(emitSpan)plus CockatielonRetry/onBreakhandlers in extension's Sentry init, ~50 lines). The full span chain becomes stitchable end-to-end:Why this is orthogonal to #8530
The ongoing discussion about blocking
QueryClientwrite APIs in the UI (setQueryDefaults,setQueryData,ensureQueryData) is on a different surface:queryCache.subscribe(callback)observes lifecycle events (added/removed,updatedforpending → fetching → success/error/idle,observerAdded/observerRemovedas the dedupe signal,observerResultsUpdatedpost-structural-sharing). It cannot mutate state, overridequeryFn, change defaults, or do any of the things BlockQueryClientAPI surface in UI #8530 is concerned with. Same surface TanStack Query DevTools uses.QueryClientlives. UI consumers are never exposed to the client or to the cache subscription.setQueryData) and route loaders (ensureQueryData) are write-paths. Cache subscription is read-only and doesn't touch them.Fallback
If this is rejected, decisions#131 already includes an "Observability" item in Option A's Cons flagging the visibility gap, so future reviewers don't rediscover it. No code change required in that case.
Acceptance Criteria
BaseDataServiceexposes either a publicqueryClientgetter or anonCreateconstructor hook (maintainer's choice).queryCache.subscribe(...)wiring composes cleanly from a client-side Sentry init file. A prototype in the extension repo demonstrates the full stack trace fromuseQuerythrough retry through success.OmitKeyoftype erasure.queryFnsemantics.Labels
area-data-service,area-observabilityDependencies
Related: #8530, #8531 (orthogonal — same epic scope but different surface; not blocking).
Depends on: none.
Dependents: none.