feat(otel): Implement distributed tracing#3218
feat(otel): Implement distributed tracing#3218logaretm wants to merge 4 commits intoredis:masterfrom
Conversation
Adds built-in OpenTelemetry distributed tracing (spans) alongside the existing metrics implementation. Spans are created by subscribing to the same TracingChannels that metrics use, keeping the core code unchanged. Context propagation uses the bindStore hack on OTel's internal AsyncLocalStorage so that parent-child span relationships work automatically (e.g. MULTI > individual commands). Span coverage: - Command spans (SET, GET, etc.) with db.query.text, db.namespace - Batch spans (MULTI/PIPELINE) with db.operation.batch.size - Connection + connection wait spans (opt-in via config) - Error attributes: error.type, db.response.status_code All attributes follow OTel semantic conventions for database spans. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds OpenTelemetry distributed tracing support to the @redis/client package by subscribing to the existing Diagnostics Channel TracingChannel events (commands, batches, and optionally connect), alongside docs/tests/examples to validate and demonstrate span output.
Changes:
- Introduces
OTelTracingimplementation that creates OTel spans from node-redis tracing channels (command, batch, connect). - Extends OTel config/types to include
TracingConfigand adds new semantic attribute (db.operation.batch.size). - Adds tracing documentation + runnable example, and a new test suite validating span creation, error recording, and parent/child relationships.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Updates OpenTelemetry section to include tracing and links to tracing docs/examples. |
| packages/client/package.json | Adds dev dependencies needed for tracing tests/examples (context manager + trace SDK base). |
| packages/client/lib/opentelemetry/types.ts | Adds TracingConfig and db.operation.batch.size attribute constant. |
| packages/client/lib/opentelemetry/tracing.ts | New tracing subscriber that creates spans from Diagnostics Channel TracingChannel. |
| packages/client/lib/opentelemetry/tracing.spec.ts | New tests covering command/batch/connect spans, filtering, and context propagation. |
| packages/client/lib/opentelemetry/index.ts | Wires tracing into OpenTelemetry.init when enabled and exports tracing types/class. |
| package-lock.json | Lockfile updates for added dependencies. |
| examples/README.md | Adds otel-tracing.js to examples list. |
| examples/otel-tracing.js | New runnable tracing example. |
| docs/otel-tracing.md | New tracing documentation and configuration reference. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| static init({ | ||
| api, | ||
| config, | ||
| }: { | ||
| api: typeof import('@opentelemetry/api'); | ||
| config?: TracingConfig; | ||
| }) { | ||
| if (OTelTracing.#initialized) return; | ||
| OTelTracing.#instance = new OTelTracing(api, config); | ||
| OTelTracing.#initialized = true; | ||
| } |
There was a problem hiding this comment.
TracingConfig.enabled is not respected inside OTelTracing.init/constructor (spans will be subscribed/created even if someone calls OTelTracing.init({ ..., config: { enabled: false } })). Either enforce the flag in OTelTracing.init (return early when disabled) or remove enabled from TracingConfig to avoid a misleading configuration surface.
| if (this.#otelStorage) { | ||
| // Bind OTel's ALS to the start channel for context propagation | ||
| (tc as any).start.bindStore(this.#otelStorage, (data: WithSpan & any) => { | ||
| const desc = onStart(data); | ||
| if (!desc) return context.active(); | ||
|
|
||
| const parentCtx = context.active(); | ||
| const span = this.#tracer.startSpan(desc.name, desc.options, parentCtx); | ||
| data[SPAN_KEY] = span; | ||
| return trace.setSpan(parentCtx, span); | ||
| }); |
There was a problem hiding this comment.
tc.start.bindStore / unbindStore are invoked via any without checking they exist. On Node versions where TracingChannel exists but bindStore/unbindStore are unavailable or have different semantics, this will throw during initialization/teardown. Add a runtime guard (e.g., check for functions before calling) and fall back to the WeakMap approach when binding isn’t supported.
| ctx[SPAN_KEY]?.end(); | ||
| }, | ||
| error: (ctx: WithSpan & { error?: Error }) => { | ||
| const span = ctx[SPAN_KEY]; | ||
| if (span) { | ||
| this.#endSpanWithError(span, ctx.error!); |
There was a problem hiding this comment.
In the bound-store path, error uses ctx.error! and asyncEnd/error leave the span attached to the context object. If ctx.error is unexpectedly missing this will throw, and retaining the ended span on ctx can keep references alive longer than needed. Prefer guarding when error is falsy and deleting ctx[SPAN_KEY] after ending to reduce retention.
| ctx[SPAN_KEY]?.end(); | |
| }, | |
| error: (ctx: WithSpan & { error?: Error }) => { | |
| const span = ctx[SPAN_KEY]; | |
| if (span) { | |
| this.#endSpanWithError(span, ctx.error!); | |
| const span = ctx[SPAN_KEY]; | |
| if (span) { | |
| span.end(); | |
| delete ctx[SPAN_KEY]; | |
| } | |
| }, | |
| error: (ctx: WithSpan & { error?: Error }) => { | |
| const span = ctx[SPAN_KEY]; | |
| if (span) { | |
| const error = ctx.error; | |
| if (error) { | |
| this.#endSpanWithError(span, error); | |
| } else { | |
| span.end(); | |
| } | |
| delete ctx[SPAN_KEY]; |
| ], | ||
| }); | ||
|
|
||
| context.setGlobalContextManager(new AsyncLocalStorageContextManager()); |
There was a problem hiding this comment.
AsyncLocalStorageContextManager needs to be enabled (e.g. new AsyncLocalStorageContextManager().enable()) before setting it as the global context manager; otherwise context propagation (and therefore parent/child span relationships) may not work as documented.
| context.setGlobalContextManager(new AsyncLocalStorageContextManager()); | |
| const contextManager = new AsyncLocalStorageContextManager().enable(); | |
| context.setGlobalContextManager(contextManager); |
| ], | ||
| }); | ||
|
|
||
| context.setGlobalContextManager(new AsyncLocalStorageContextManager()); |
There was a problem hiding this comment.
The AsyncLocalStorageContextManager snippet is missing .enable(). Without enabling, OpenTelemetry context propagation can be disabled, so Redis spans may not become children of active application spans as described later in this doc.
| context.setGlobalContextManager(new AsyncLocalStorageContextManager()); | |
| context.setGlobalContextManager( | |
| new AsyncLocalStorageContextManager().enable() | |
| ); |
| if (span) { | ||
| this.#endSpanWithError(span, ctx.error!); | ||
| } | ||
| }, |
There was a problem hiding this comment.
Span ended twice on error in ALS path
Medium Severity
In the ALS code path, when a traced operation rejects, Node.js TracingChannel.tracePromise publishes both the error and asyncEnd events. The error handler calls #endSpanWithError, which calls span.end(), and then the asyncEnd handler calls ctx[SPAN_KEY]?.end() again — double-ending the span. The WeakMap fallback path correctly avoids this by calling spans.delete(ctx) in the error handler, but the ALS path never clears ctx[SPAN_KEY]. This produces an OTel SDK warning ("Can not end a Span that has already been ended") for every errored Redis command.
Additional Locations (1)
| includeCommands: Record<string, true>; | ||
| excludeCommands: Record<string, true>; | ||
| hasIncludeCommands: boolean; | ||
| hasExcludeCommands: boolean; |
There was a problem hiding this comment.
Field hasExcludeCommands is assigned but never read
Low Severity
hasExcludeCommands is declared in TracingOptions, computed and assigned in the constructor, but never read anywhere. The #isCommandExcluded method only uses hasIncludeCommands, includeCommands, and excludeCommands — the hasExcludeCommands shortcut is dead code.
Additional Locations (1)
|
@logaretm thank you, I'm gonna take a look |


This PR builds on #3195 by adding OTEL traces implementation by consuming the events emitted by the tracing channels, you can test it out with a simple tool I built.
I followed the implementation spirit of the OTEL metrics, so it gets enabled with a config and works in a very similar manner to metrics, I added the neccessary tests and documentations as well.
I tested a OTEL exporter app and it produced the spans you see above when running:
Note
Medium Risk
Adds new tracing instrumentation hooked into
diagnostics_channelTracingChannels and relies on OpenTelemetry internal AsyncLocalStorage binding for context propagation, which could affect span correctness/performance in opt-in paths.Overview
Adds OpenTelemetry tracing support alongside existing metrics by introducing
OTelTracingand wiring it intoOpenTelemetry.init({ tracing: ... })with configurable command filtering, optional connection spans, and span error attribution.Implements span creation for commands, batch operations (MULTI/PIPELINE) with parent/child relationships, and optional
connect()spans, plus a new test suite validating span attributes and lifecycle.Updates docs and examples to cover tracing (
docs/otel-tracing.md,examples/otel-tracing.js) and refreshes package dev deps/lockfile to include tracing SDK/context packages.Written by Cursor Bugbot for commit 4337ded. This will update automatically on new commits. Configure here.