core, opentelemetry: Implement Attempt-Level RPC Delay Observability (Proposal A121)#12807

Open
AgraVator wants to merge 16 commits into
grpc:masterfrom
AgraVator:lb-policy-delay

@AgraVator AgraVator commented May 13, 2026

Contributor

This PR implements attempt-level RPC delay observability across the core channel transport, built-in load balancers, xDS policies, and the OpenTelemetry plugin, in line with gRPC Proposal A121.

(Note: call-level delay observability, covering name resolution and channel initialization, is handled in a dedicated companion PR.)


What Changed

  • api (ClientStreamTracer): Added attempt-scoped queuing delay observability hooks:
    • recordAttemptDelayStart(String delayType, String delayReason)
    • recordAttemptDelayReasonChanged(String delayReason)
    • recordAttemptDelayEnd()
  • api (LoadBalancer.PickResult): Added canonical low-cardinality delayType and high-cardinality diagnostic delayReason fields, along with factory method withNoResult(String delayType, String delayReason).
  • core (DelayedClientTransport): Updated PendingStream to track active delay categorization and diagnostic strings. Implemented updateDelay(...) enforcing Active Span Retention: when delayType remains identical across LB state transitions (e.g., priority policy failover), the active segment continues without stopwatch reset or span re-creation.
  • opentelemetry (OpenTelemetry*Module): Implemented duration recording and tracing:
    • Exported attempt queuing duration to the grpc.client.attempt.delay.duration histogram (seconds).
    • Created "Attempt Delay" child tracing spans carrying grpc.delay_type.
    • Recorded granular runtime diagnostics as structured "Delay state transition" span events carrying explicit grpc.delay_type and grpc.delay_reason attributes.
    • Guarded all telemetry and channel hooks under the GRPC_EXPERIMENTAL_ENABLE_DELAY_OBSERVABILITY feature flag (default false).
  • core / util / xds LB Policies:
    • PickFirst / RoundRobin: Emit the canonical "connecting" delay type and diagnostic reasons when buffering picks.
    • RingHash / RLS / CDS: Emit explicit control-plane and resolution delay attributes ("rls_lookup_pending", "cds_dynamic_discovery").
    • PriorityLoadBalancer: Composes child delay reasons with pX: prefixes (p0:connecting, p1:connecting) to clearly trace priority failover transitions.
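
The retention rule described for DelayedClientTransport can be sketched as follows. This is a minimal model, not the PR's code: AttemptDelayTracer, PendingDelaySketch, and RecordingTracer are hypothetical names, the three hook signatures are copied from the list above, and the behavior when delayType changes (end the current segment, then start a new one) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for the three attempt-delay hooks listed in the description. */
interface AttemptDelayTracer {
  void recordAttemptDelayStart(String delayType, String delayReason);
  void recordAttemptDelayReasonChanged(String delayReason);
  void recordAttemptDelayEnd();
}

/** Hypothetical model of a buffered stream; not the real PendingStream. */
final class PendingDelaySketch {
  private final AttemptDelayTracer tracer;
  private String delayType;   // null while no delay segment is active
  private String delayReason;

  PendingDelaySketch(AttemptDelayTracer tracer) {
    this.tracer = Objects.requireNonNull(tracer);
  }

  /** Called on each LB state transition while the stream is still buffered. */
  void updateDelay(String newType, String newReason) {
    Objects.requireNonNull(newType);
    if (delayType == null) {
      // No active segment yet: open one.
      tracer.recordAttemptDelayStart(newType, newReason);
    } else if (!delayType.equals(newType)) {
      // Assumption: a different type closes the segment and opens a new one.
      tracer.recordAttemptDelayEnd();
      tracer.recordAttemptDelayStart(newType, newReason);
    } else if (!Objects.equals(delayReason, newReason)) {
      // Same type: keep the segment and its stopwatch, only report the new reason.
      tracer.recordAttemptDelayReasonChanged(newReason);
    }
    delayType = newType;
    delayReason = newReason;
  }

  /** Called once the stream leaves the buffer. */
  void endDelay() {
    if (delayType != null) {
      tracer.recordAttemptDelayEnd();
      delayType = null;
      delayReason = null;
    }
  }
}

/** Records hook calls as strings so the call sequence can be inspected. */
final class RecordingTracer implements AttemptDelayTracer {
  final List<String> events = new ArrayList<>();

  @Override
  public void recordAttemptDelayStart(String delayType, String delayReason) {
    events.add("start:" + delayType + ":" + delayReason);
  }

  @Override
  public void recordAttemptDelayReasonChanged(String delayReason) {
    events.add("reason:" + delayReason);
  }

  @Override
  public void recordAttemptDelayEnd() {
    events.add("end");
  }
}
```

In this model, a priority failover from p0:connecting to p1:connecting under the same "connecting" type produces one start, one reason change, and one end, rather than two separate segments.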

AgraVator added 2 commits May 13, 2026 22:15
This commit implements the plumbing required to propagate delay reason tokens from load balancing policies up to the transport layer and tracers, as specified in the LB policy delay design.
connectivityState = newState;
picker = newPicker;
if (newState == CONNECTING || newState == IDLE) {
  picker = new PriorityPicker(newPicker, priority);

@AgraVator AgraVator May 19, 2026

Contributor Author

Appends a "pX:" prefix to the child delay_type.

@AgraVator AgraVator marked this pull request as ready for review June 8, 2026 11:13
@AgraVator AgraVator requested review from ejona86 and shivaspeaks June 8, 2026 11:14

@shivaspeaks shivaspeaks left a comment

Member

Quick review, LGTM overall. I'll take a deeper look alongside the implementation doc when I'm back in the office.

This change looks like something that should be consistent across all languages. Is there a gRFC in progress for it? If so, could you link that PR in the description?

PickResult childResult = delegate.pickSubchannel(args);
if (!childResult.hasResult() && childResult.getDelayReasonToken() != null) {
  return PickResult.withNoResult(
      "priority_" + priority + ":" + childResult.getDelayReasonToken());

Member

A question, to understand this better from a performance perspective.
This string concatenation happens on the hot path for every buffered RPC. If the priority tree is deep or policies are nested, it may lead to:
  • Allocation overhead: repeated String and PickResult allocations on every pickSubchannel call.
  • Metric cardinality: these nested tokens (e.g., priority_p0:priority_p1:ring_hash:connecting) are used as metric labels, so highly nested tokens can cause a cardinality explosion.

Is there a way to cache the concatenated PickResult in the PriorityPicker (if the child's result is also cached or static) to avoid per-pick allocations? I assume we need the new childResult's DelayReasonToken, but I'm not sure whether it stays static or is dynamic. If it stays static, we can move the concatenation out of the pick path; otherwise, we should at least build the "priority_" + priority + ":" prefix once, statically.
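
The two options raised in this comment can be sketched together. This is a hedged illustration only: DelayResult, ChildPicker, and PrefixingPicker are hypothetical simplified types, not gRPC classes, and the cache helps only under the assumption that the child picker returns the same result instance across picks.

```java
/** Hypothetical, simplified stand-in for a no-result pick carrying a delay token. */
final class DelayResult {
  final String token;

  DelayResult(String token) {
    this.token = token;
  }
}

/** Hypothetical child picker; the real picker API takes pick arguments. */
interface ChildPicker {
  DelayResult pick();
}

/** Wraps a child picker and prefixes its delay token with the priority name. */
final class PrefixingPicker {
  /** Immutable pair, so a single volatile write publishes both fields together. */
  private static final class Entry {
    final DelayResult child;
    final DelayResult wrapped;

    Entry(DelayResult child, DelayResult wrapped) {
      this.child = child;
      this.wrapped = wrapped;
    }
  }

  private final ChildPicker delegate;
  private final String prefix;   // built once per picker, not once per pick
  private volatile Entry cache;  // single-entry cache keyed by child identity

  PrefixingPicker(ChildPicker delegate, String priority) {
    this.delegate = delegate;
    this.prefix = "priority_" + priority + ":";
  }

  DelayResult pick() {
    DelayResult child = delegate.pick();
    if (child == null || child.token == null) {
      return child; // nothing to prefix
    }
    Entry e = cache;
    if (e == null || e.child != child) {
      // Cache miss: allocate once for this child result instance.
      // Benign race: two threads may each build an equivalent entry.
      e = new Entry(child, new DelayResult(prefix + child.token));
      cache = e;
    }
    return e.wrapped;
  }
}
```

If the child produces a fresh result on every pick, the identity check never hits and only the precomputed prefix saves work, which corresponds to the fallback suggested at the end of the comment.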

Contributor Author

The set of "delayType" values will be very limited: only priorities will be appended, which adds some cardinality but not much.
The delayReason will be dynamic.
The gRFC was raised today; I have linked it in the description.
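
The cardinality split described in this reply can be illustrated with a small model. This is a sketch under stated assumptions, not the OpenTelemetry module from the PR: DelayTelemetrySketch and its methods are hypothetical, and it assumes the duration metric is labeled only by delayType while delayReason is kept as event-style diagnostics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model: bounded metric labels, unbounded diagnostic events. */
final class DelayTelemetrySketch {
  /** Seconds recorded per delayType; the key set models the metric's label values. */
  final Map<String, List<Double>> durationsByType = new TreeMap<>();

  /** Free-form reasons kept as event-like entries, never used as metric labels. */
  final List<String> reasonEvents = new ArrayList<>();

  /** Stores a dynamic reason as a diagnostic event only. */
  void recordReason(String delayType, String delayReason) {
    reasonEvents.add(delayType + " -> " + delayReason);
  }

  /** Converts nanoseconds to seconds and files the sample under its delayType. */
  void recordDuration(String delayType, long elapsedNanos) {
    double seconds = elapsedNanos / 1_000_000_000.0;
    durationsByType.computeIfAbsent(delayType, k -> new ArrayList<>()).add(seconds);
  }
}
```

However many distinct reasons are recorded, the number of keys in durationsByType grows only with the number of distinct delayType values.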

…dence invariants

- Refactor ClientStreamTracer to expose delayTypeStarted(String) and delayReasonAttached(String)
- Enhance PickResult with separate delayType and delayReason diagnostic fields
- Implement Mark Roth's hybrid telemetry cadence model in DelayedClientTransport.PendingStream
- Support channel fallback delay states (client_channel_init, subchannel_state_mismatch, wait_for_ready_failed)
- Simplify leaf and container LB policies to emit canonical unified connecting metric labels
# Conflicts:
#	core/src/main/java/io/grpc/internal/PickFirstLeafLoadBalancer.java
@AgraVator AgraVator changed the title core: Implement load balancing policy delay plumbing core, opentelemetry: Implement Attempt-Level RPC Delay Observability (Proposal A121) Jun 23, 2026