fix(observability): propagate component span into spawned tasks #25521

Open
gwenaskell wants to merge 2 commits into master from yoenn.burban/no-spawn-without-instrument

Contributor

gwenaskell commented May 29, 2026

Summary

Several components spawn background tokio tasks (or run work on detached tasks via concurrent_map) without carrying the current tracing span. Vector's internal metrics/logs derive their component tags (component_id, component_kind, component_type) from that span, so any telemetry emitted from those tasks was missing its component tags.

This PR preserves the owning component's span across the spawn boundary by wrapping spawned futures with in_current_span(). For the Datadog logs/metrics sinks, which drive concurrent_map directly, it instead instruments the mapped future inside the closure, where the sink span is current.

Notes:

  • ConcurrentMap and spawn_named do not instrument spawned futures themselves. More generally, the responsibility for instrumenting a future should lie with the code that declares the future, not the code that spawns it.
  • The fanout detached-send task is intentionally left un-instrumented, with a comment: it drains a sink that was just detached from the topology, so it is unrelated to the upstream component that owns the fanout; instrumenting it would mis-tag the work with the wrong component.
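
To illustrate why the span has to be captured where the work is declared, here is a minimal standard-library sketch. It is only a model, not Vector's code: a thread-local `CURRENT_COMPONENT` stands in for the tracing span, `std::thread::spawn` stands in for `tokio::spawn`, and the hypothetical `in_current_component` adapter plays the role of `in_current_span()`. Every name in the block is an assumption made for illustration.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Stand-in for the tracing span: the component that owns the current work.
    static CURRENT_COMPONENT: RefCell<Option<String>> = RefCell::new(None);
}

/// Run `f` with `component` as the ambient component, then restore the
/// previous value. (Sketch only: no restoration if `f` panics.)
fn with_component<T>(component: Option<String>, f: impl FnOnce() -> T) -> T {
    let previous = CURRENT_COMPONENT.with(|c| c.replace(component));
    let out = f();
    CURRENT_COMPONENT.with(|c| *c.borrow_mut() = previous);
    out
}

/// Stand-in for emitting a metric: the tag is read from the ambient component.
fn emit(metric: &str) -> String {
    match CURRENT_COMPONENT.with(|c| c.borrow().clone()) {
        Some(id) => format!("{metric}{{component_id=\"{id}\"}}"),
        None => format!("{metric}{{}}"),
    }
}

/// Hypothetical analogue of `in_current_span()`: capture the ambient component
/// at the point where the work is declared, and re-enter it when the work runs.
fn in_current_component<T, F>(work: F) -> impl FnOnce() -> T + Send + 'static
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let captured = CURRENT_COMPONENT.with(|c| c.borrow().clone());
    move || with_component(captured, work)
}

/// Returns the metric emitted by a plain spawn and by a wrapped spawn.
fn demo() -> (String, String) {
    with_component(Some("my_sink".to_string()), || {
        // Plain spawn: the new thread starts with no ambient component.
        let untagged = thread::spawn(|| emit("component_errors_total"))
            .join()
            .unwrap();
        // Wrapped at the declaration site: the component travels with the work.
        let tagged = thread::spawn(in_current_component(|| emit("component_errors_total")))
            .join()
            .unwrap();
        (untagged, tagged)
    })
}

fn main() {
    let (untagged, tagged) = demo();
    println!("{untagged}");
    println!("{tagged}");
}
```

In this model the plain spawn yields `component_errors_total{}` while the wrapped one yields `component_errors_total{component_id="my_sink"}`, mirroring the untagged and tagged telemetry described in the summary. The spawning function never needs to know about the component, which is the division of responsibility the first note argues for.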

Examples of internal telemetry that was previously mis-tagged and is now correct

  • datadog_logs sink: component_discarded_events_total (via ComponentEventsDropped, emitted when an event is too large to encode) is produced inside the request-builder future spawned by concurrent_map. Previously it was emitted with no component tags; now it carries the sink's component_id/component_kind/component_type.
  • gcp_pubsub source: each pull stream runs on a per-stream task spawned in start_one. Errors emitted there (e.g. StreamClosedError → component_errors_total + component_discarded_events_total, and connection/streaming errors → component_errors_total) were untagged and are now attributed to the source component.
  • splunk_hec sinks: the indexer-acknowledgement handling task emits SplunkResponseParseError / SplunkIndexerAcknowledgementUnavailableError (component_errors_total); these were untagged and are now correctly attributed to the sink.

Vector configuration

N/A — no configuration changes.

How did you test this PR?

  • cargo check -p vector --lib passes.
  • Manual audit of each spawn site to confirm the captured span belongs to the spawned work (and to identify the fanout case that should not be instrumented).

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

github-actions bot added the domain: sources, domain: transforms, domain: sinks, and domain: core labels May 29, 2026
Comment thread changelog.d/spawned_task_component_tags.fix.md Fixed
Several components spawn background `tokio` tasks without carrying the
current tracing span, so internal metrics/logs emitted from that work
lost their component tags (component_id, component_kind, component_type).

Wrap spawned futures with `in_current_span()` (or, for the Datadog logs
and metrics sinks driving `concurrent_map`, instrument the mapped future
inside the closure) so the owning component's span is preserved.

`ConcurrentMap` itself no longer instruments the spawned future; that is
now the caller's responsibility, documented inline. The fanout detached
send is intentionally left un-instrumented (it drains a sink unrelated to
the upstream component that owns the fanout), with a comment explaining
why.

Signed-off-by: Yoenn Burban <yoenn.burban@datadoghq.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
gwenaskell force-pushed the yoenn.burban/no-spawn-without-instrument branch from 9b0dedf to 7eeacdb May 29, 2026 13:34
gwenaskell marked this pull request as ready for review May 29, 2026 14:07
gwenaskell requested a review from a team as a code owner May 29, 2026 14:07
gwenaskell requested a review from bruceg May 29, 2026 14:32
bruceg added the type: bug and domain: observability labels May 29, 2026
Member

bruceg left a comment

LGTM

Comment thread src/api/grpc_server.rs
error_source = ?e.source(),
bind_addr = %actual_addr,
);
tokio::spawn(
Member

Would it be helpful to have a spawn_in_current_task wrapper for this common operation? It's a relatively trivial amount of code duplication, but it would simplify searching for spawns that are missing spans and potentially even block calling tokio::spawn directly without proper annotations.
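
One possible shape for such a wrapper, sketched with the standard library only so that it runs without extra crates. `spawn_in_current_component` is a hypothetical name, threads stand in for tokio tasks, and a thread-local stands in for the tracing span; a real helper would presumably wrap `tokio::spawn` around a future instrumented with `in_current_span()` instead.

```rust
use std::cell::RefCell;
use std::thread::{self, JoinHandle};

thread_local! {
    // Stand-in for the current tracing span (assumption for this sketch).
    static CURRENT_COMPONENT: RefCell<Option<&'static str>> = RefCell::new(None);
}

/// Read the ambient component of the calling thread.
fn current_component() -> Option<&'static str> {
    CURRENT_COMPONENT.with(|c| *c.borrow())
}

/// Hypothetical single entry point for spawning: captures the caller's
/// component and installs it in the spawned thread before running `work`.
fn spawn_in_current_component<T, F>(work: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let captured = current_component();
    thread::spawn(move || {
        CURRENT_COMPONENT.with(|c| *c.borrow_mut() = captured);
        work()
    })
}

/// Sets a component on the calling thread, spawns through the wrapper, and
/// returns what the spawned work observed as its ambient component.
fn run_demo() -> Option<&'static str> {
    CURRENT_COMPONENT.with(|c| *c.borrow_mut() = Some("splunk_hec_sink"));
    spawn_in_current_component(current_component)
        .join()
        .unwrap()
}

fn main() {
    println!("{:?}", run_demo());
}
```

With a single entry point like this, spawn sites that bypass it become easier to find by searching for direct spawn calls. Clippy's `disallowed_methods` lint might be one way to block direct `tokio::spawn` calls, though nothing in this PR does so.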

pront removed the type: bug label May 29, 2026
Labels

domain: core, domain: observability, domain: sinks, domain: sources, domain: transforms

4 participants