Skip to content

CAMEL-23225: Remove ThreadLocal context fallback from camel-opentelemetry2#22255

Closed
gnodet wants to merge 2 commits intomainfrom
CAMEL-23225-remove-threadlocal-context-fallback
Closed

CAMEL-23225: Remove ThreadLocal context fallback from camel-opentelemetry2#22255
gnodet wants to merge 2 commits intomainfrom
CAMEL-23225-remove-threadlocal-context-fallback

Conversation

@gnodet
Copy link
Contributor

@gnodet gnodet commented Mar 25, 2026

Summary

Per the tracing design doc (proposals/tracing.adoc), context propagation should be handled exclusively via Exchange headers (W3C traceparent), not ThreadLocal. This PR removes the ThreadLocal-based context propagation mechanisms from camel-opentelemetry2:

  • OpenTelemetryTracer.create(): When no Camel parent span exists, always use Context.root() as the base for extract(). Removed the BAGGAGE_CAMEL_FLAG / "dirty context" detection that fell back to Context.current() — the extract() call on Exchange headers is sufficient to pick up upstream traces.
  • OpenTelemetrySpanAdapter: Removed BAGGAGE_CAMEL_FLAG constant, baggage flag injection in constructor, and baggageScope management. Span scope (makeCurrent()) is kept as it's needed by inject().
  • Deleted OpenTelemetryInstrumentedThreadPoolFactory and OpenTelemetryInstrumentedThreadFactoryListener: These wrapped Camel thread pools with Context.taskWrapping() to propagate ThreadLocal context across threads — contradicting the design principle that context flows via Exchange headers.

This is an alternative approach to the Spring Boot TaskDecorator in apache/camel-spring-boot#1730 — rather than adding more ThreadLocal propagation infrastructure, this follows the design doc's principle of Exchange-header-only propagation. If this approach is accepted, apache/camel-spring-boot#1730 should be closed as not needed.

Test plan

  • All 12 existing tests pass (mvn test -B -pl components/camel-opentelemetry2) — they use header-based propagation
  • Code formatted (mvn formatter:format impsort:sort)

Claude Code on behalf of Guillaume Nodet

🤖 Generated with Claude Code

…etry2

- Remove BAGGAGE_CAMEL_FLAG / dirty context detection in create() method,
  always use Context.root() as base for extract()
- Remove baggage flag logic and baggageScope from OpenTelemetrySpanAdapter
- Delete OpenTelemetryInstrumentedThreadPoolFactory and
  OpenTelemetryInstrumentedThreadFactoryListener (Context.taskWrapping)
- Remove corresponding META-INF service files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@gnodet gnodet requested a review from squakez March 25, 2026 09:53
@gnodet gnodet marked this pull request as draft March 25, 2026 09:54
@gnodet
Copy link
Contributor Author

gnodet commented Mar 25, 2026

Analysis of removed code — historical context

Checked git history to verify we're not removing something added for a good reason. Here's what I found:

Key commits

  1. CAMEL-21202 (Oct 2024, Adriano Machado) — Originally added OpenTelemetryInstrumentedThreadPoolFactory and ThreadFactoryListener to the old camel-opentelemetry module (now deprecated) to propagate OTel context across thread boundaries via Context.taskWrapping().

  2. CAMEL-22648 (Jan 2026, Pasquale Congiusti, 3a751cf) — Added the BAGGAGE_CAMEL_FLAG / dirty context detection to camel-opentelemetry2. Commit message: "We have introduced a flag which we use to control whether a span was generated from Camel or not. This is useful when we reuse 'dirty' threads which were previously asynchronously setting otel context via thread locals. By checking the flag we can decide to continue propagation of a trace from an existing span (for example, when vertx or otel agents are injecting a span before the Camel process kicks in)."

  3. 5810a742 (today) — Ported thread pool instrumentation from deprecated module to camel-opentelemetry2.

  4. f60675cb (today, Aurélien Pupier) — Reverted that port.

Risk assessment

The BAGGAGE_CAMEL_FLAG was added deliberately to handle a real scenario: when a third-party component (like Vert.x or an OTel Java agent) sets a span in Context.current() before Camel starts processing. Without the flag, Camel would ignore that external root span.

By always using Context.root(), we ignore any span that an OTel Java agent or Vert.x sets in Context.current() before Camel processes a message. Those external spans would not become parents of the Camel trace — Camel would start a disconnected new trace instead.

However, per the design doc (proposals/tracing.adoc), the intended way to propagate that context is through Exchange headers (traceparent), not ThreadLocal. If Vert.x or an OTel agent wants to link to Camel, they should inject the traceparent header into the Exchange, which extract() will pick up from the headers.

Next step: Investigating whether the callers (e.g., Camel Vert.x HTTP component, platform-http) already inject traceparent headers into the Exchange, or if they only rely on Context.current(). If the latter, those callers would need to be updated to inject headers for this approach to work end-to-end.

Claude Code on behalf of Guillaume Nodet

@squakez
Copy link
Contributor

squakez commented Mar 25, 2026

By always using Context.root(), we ignore any span that an OTel Java agent or Vert.x sets in Context.current() before Camel processes a message. Those external spans would not become parents of the Camel trace — Camel would start a disconnected new trace instead.

This is not true: #20720 has fixed that problem.

- Rename SpanInjection → SpanInjectionTest so surefire picks it up
  (was silently skipped since it didn't match *Test naming convention)
- Use traceparent Exchange header instead of ThreadLocal Context.current()
  to simulate external parent spans, per the design doc
- Use setNoParent() when creating test spans to avoid inheriting stale
  Context.current() state from other test classes (pre-existing issue)
- Clear exporter between test methods to prevent inter-test leakage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gnodet
Copy link
Contributor Author

gnodet commented Mar 25, 2026

Investigation results

Investigated the callers to verify Exchange-header-based propagation works end-to-end:

How context flows in production

  1. HTTP request arrives with traceparent header → Platform HTTP Vertx (VertxPlatformHttpSupport.populateCamelHeaders()) / Netty HTTP (DefaultNettyHttpBinding.populateCamelHeaders()) copy ALL HTTP headers into Exchange message headers
  2. Camel Tracer creates first span via beginEventSpan()spanStorageManager.peek(exchange) returns null for new exchange → parent == null branch → extract(Context.root(), exchange.getIn().getHeaders()) picks up traceparent from headers
  3. Subsequent spans within the route use the Camel-managed span stack (Exchange-based), not ThreadLocal

This confirms the design doc's approach works: context propagation via Exchange headers is already the path used by all HTTP components.

SpanInjection test fix

The SpanInjection test was:

  1. Never actually running — class name didn't match surefire's *Test convention
  2. Already broken before our changes — when run after AsyncCXFTest, it failed even with the original BAGGAGE_CAMEL_FLAG code
  3. Testing the wrong thing — used span.makeCurrent() (ThreadLocal) instead of traceparent header to simulate external parent spans

Root cause of the inter-test failure: AsyncCXFTest leaks a span into Context.current() on the test thread. The test's spanBuilder("mySpan").startSpan() then inherits this stale trace ID (OTel default behavior), causing all 10 exchanges to share the same trace.

Fix applied:

  • Renamed to SpanInjectionTest (now runs in CI — adds 2 tests to the suite)
  • Uses traceparent Exchange header instead of ThreadLocal
  • Uses setNoParent() on test span builders to avoid stale Context.current() contamination
  • All 14 tests pass including after async tests

Claude Code on behalf of Guillaume Nodet

@gnodet
Copy link
Contributor Author

gnodet commented Mar 25, 2026

Closing this PR

After investigation and feedback from @squakez (the camel-opentelemetry2 author), this PR is incorrect:

  1. The `BAGGAGE_CAMEL_FLAG` / `Context.current()` fallback is intentional — it handles the case where an external component (Vert.x, OTel Java agent, Quarkus) sets a span in `Context.current()` before Camel processes a message via direct in-process call (no HTTP boundary, no `traceparent` header). Removing it would break trace linkage in this scenario. See PR #20720.

  2. The thread pool wrapping (`Context.taskWrapping()`) is not needed — the design explicitly uses Exchange headers for context propagation, not ThreadLocal. This was already correctly reverted in Revert "CAMEL-23225: Propagate OpenTelemetry context across thread boundaries in camel-opentelemetry2" #22254.

  3. The JIRA (CAMEL-23225) appears invalid — commented on the issue recommending closure. The `camel-opentelemetry2` design handles cross-thread propagation via Exchange headers, and the `BAGGAGE_CAMEL_FLAG` handles the in-process edge case. No Spring Boot `TaskDecorator` is needed either.

The one valid fix from this PR: the `SpanInjection` test was never running (class name didn't match surefire `*Test` convention) and tested ThreadLocal propagation instead of header-based. Will submit that as a separate small PR if desired.

Claude Code on behalf of Guillaume Nodet

@gnodet gnodet closed this Mar 25, 2026
@gnodet gnodet deleted the CAMEL-23225-remove-threadlocal-context-fallback branch March 25, 2026 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants