fix(observability): propagate component span into spawned tasks #25521
Open
gwenaskell wants to merge 2 commits into
Several components spawn background `tokio` tasks without carrying the current tracing span, so internal metrics/logs emitted from that work lose their component tags (component_id, component_kind, component_type).

Wrap spawned futures with `in_current_span()` (or, for the Datadog logs and metrics sinks driving `concurrent_map`, instrument the mapped future inside the closure) so the owning component's span is preserved. `ConcurrentMap` itself no longer instruments the spawned future; that is now the caller's responsibility, documented inline.

The fanout detached send is intentionally left un-instrumented (it drains a sink unrelated to the upstream component that owns the fanout), with a comment explaining why.

Signed-off-by: Yoenn Burban <yoenn.burban@datadoghq.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
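To illustrate the failure mode, here is a minimal std-only sketch. It is not Vector's code and not how `tracing` is implemented: it models the current span as a thread-local, shows that a freshly spawned thread starts without it, and shows that capturing it at the spawn site (the role `in_current_span()` plays for futures) restores the tag. `CURRENT_COMPONENT`, `with_component`, `emit`, and `in_current_component` are hypothetical names invented for this example.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Stand-in for the "current span": the component that owns the running work.
    static CURRENT_COMPONENT: RefCell<Option<String>> = RefCell::new(None);
}

/// Runs `f` with `id` as the current component, then restores the previous value.
fn with_component<R>(id: Option<String>, f: impl FnOnce() -> R) -> R {
    let previous = CURRENT_COMPONENT.with(|c| c.replace(id));
    let result = f();
    CURRENT_COMPONENT.with(|c| *c.borrow_mut() = previous);
    result
}

/// Stand-in for emitting an internal metric: returns the name plus its component tag.
fn emit(metric: &str) -> String {
    let tag = CURRENT_COMPONENT.with(|c| c.borrow().clone());
    match tag {
        Some(id) => format!("{metric}{{component_id={id}}}"),
        None => format!("{metric}{{}}"),
    }
}

/// Hypothetical analogue of `in_current_span()`: captures the component at the
/// call site and re-enters it wherever the returned closure eventually runs.
fn in_current_component<R, F>(f: F) -> impl FnOnce() -> R + Send + 'static
where
    F: FnOnce() -> R + Send + 'static,
{
    let captured = CURRENT_COMPONENT.with(|c| c.borrow().clone());
    move || with_component(captured, f)
}

/// Spawns the same work twice from inside a component: once bare, once wrapped.
fn demo() -> (String, String) {
    with_component(Some("my_sink".to_string()), || {
        let bare = thread::spawn(|| emit("component_errors_total"));
        let wrapped =
            thread::spawn(in_current_component(|| emit("component_errors_total")));
        (bare.join().unwrap(), wrapped.join().unwrap())
    })
}

fn main() {
    let (bare, wrapped) = demo();
    println!("bare:    {bare}");
    println!("wrapped: {wrapped}");
}
```

Under this model the bare spawn reports `component_errors_total{}`, while the wrapped one reports `component_errors_total{component_id=my_sink}`.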
9b0dedf to
7eeacdb
bruceg
approved these changes
May 29, 2026
error_source = ?e.source(),
bind_addr = %actual_addr,
);
tokio::spawn(
Member
Would it be helpful to have a `spawn_in_current_task` wrapper for this common operation? It's a relatively trivial amount of code duplication, but it would simplify searching for spawns that are missing spans and potentially even block calling `tokio::spawn` directly without proper annotations.
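A wrapper along these lines could look roughly as follows. This is a std-only model so that it runs without `tokio` or `tracing`: a thread stands in for the spawned task, a thread-local stands in for the current span, and `WithComponent` re-enters the captured value around every poll, which is assumed to mirror how an instrumented future behaves. All names (`spawn_in_current_component`, `WithComponent`, `block_on`, `CURRENT_COMPONENT`) are hypothetical; in Vector the body would presumably reduce to `tokio::spawn(fut.in_current_span())`.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, JoinHandle};

thread_local! {
    // Stand-in for the current tracing span: the component that owns the work.
    static CURRENT_COMPONENT: RefCell<Option<String>> = RefCell::new(None);
}

fn current_component() -> Option<String> {
    CURRENT_COMPONENT.with(|c| c.borrow().clone())
}

/// Future wrapper that re-enters the captured component around every poll.
struct WithComponent<F> {
    component: Option<String>,
    inner: Pin<Box<F>>,
}

impl<F: Future> Future for WithComponent<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let component = self.component.clone();
        let previous = CURRENT_COMPONENT.with(|c| c.replace(component));
        let result = self.inner.as_mut().poll(cx);
        CURRENT_COMPONENT.with(|c| *c.borrow_mut() = previous);
        result
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        thread::yield_now();
    }
}

/// Hypothetical single entry point for spawning: it always captures the
/// caller's component before handing the future to another thread.
fn spawn_in_current_component<F>(fut: F) -> JoinHandle<F::Output>
where
    F: Future + Send + 'static,
    F::Output: Send + 'static,
{
    let wrapped = WithComponent {
        component: current_component(),
        inner: Box::pin(fut),
    };
    thread::spawn(move || block_on(wrapped))
}

/// Spawns from inside a component and returns what the spawned work observed.
fn demo() -> Option<String> {
    CURRENT_COMPONENT.with(|c| *c.borrow_mut() = Some("my_source".to_string()));
    let handle = spawn_in_current_component(async { current_component() });
    handle.join().unwrap()
}

fn main() {
    println!("{:?}", demo());
}
```

Routing every spawn through one function like this is what would make un-instrumented spawns easy to search for, or to forbid with a lint, as the comment suggests.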
Summary
Several components spawn background `tokio` tasks (or run work on detached tasks via `concurrent_map`) without carrying the current tracing span. Vector's internal metrics/logs derive their component tags (`component_id`, `component_kind`, `component_type`) from that span, so any telemetry emitted from those tasks was missing its component tags.

This PR preserves the owning component's span across the spawn boundary by wrapping spawned futures with `in_current_span()` (and, for the Datadog logs/metrics sinks that drive `concurrent_map` directly, instrumenting the mapped future inside the closure where the sink span is current).

Notes:
- `ConcurrentMap` and `spawn_named` do not instrument any spawned future themselves. More generally, the responsibility of instrumenting a future should lie with the code that declares the future, not the one that spawns it.
- The `fanout` detached-send task is intentionally left un-instrumented, with a comment: it drains a sink that was just detached from the topology, so it is unrelated to the upstream component that owns the fanout; instrumenting it would mis-tag the work with the wrong component.

Examples of internal telemetry that was previously mis-tagged and is now correct
- `datadog_logs` sink: `component_discarded_events_total` (via `ComponentEventsDropped`, emitted when an event is too large to encode) is produced inside the request-builder future spawned by `concurrent_map`. Previously it was emitted with no component tags; now it carries the sink's `component_id`/`component_kind`/`component_type`.
- `gcp_pubsub` source: each pull stream runs on a per-stream task spawned in `start_one`. Errors emitted there (e.g. `StreamClosedError` → `component_errors_total` + `component_discarded_events_total`, and connection/streaming errors → `component_errors_total`) were untagged and are now attributed to the source component.
- `splunk_hec` sinks: the indexer-acknowledgement handling task emits `SplunkResponseParseError`/`SplunkIndexerAcknowledgementUnavailableError` (`component_errors_total`); these were untagged and are now correctly attributed to the sink.

Vector configuration
N/A — no configuration changes.
How did you test this PR?
- `cargo check -p vector --lib` passes.
- [...] `fanout` case that should not be instrumented).

Change Type
Is this a breaking change?
Does this PR include user facing changes?
`no-changelog` label to this PR.