[Service Bus] Fix trace context not propagated on first sendMessage() (#44958) #49600
ksalazar-91 wants to merge 3 commits into
…Azure#44958) The first call to ServiceBusSenderClient.sendMessage() (and the async client) did not recognize the caller's current OpenTelemetry trace context: the ServiceBus.send span and the outgoing message's traceparent started a new, disconnected trace. This happened because the span was started lazily downstream of the first AMQP connection/link establishment, which runs on a background thread where the caller's thread-local context is not available. The single-message send path now starts the producer message span and the ServiceBus.send span on the subscribing (caller) thread, before the connection thread hop, mirroring the structure already used by the batch send path and Event Hubs. A non-instrumenting overload of sendBatchInternal avoids a duplicate span. Adds a live regression test (sendMessageHasParentSpanOnFirstCall) and a CHANGELOG entry. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Thank you for your contribution @ksalazar-91! We will review the pull request and get back to you soon.
Pull request overview
Fixes Service Bus tracing so the first sendMessage() call correctly parents ServiceBus.send / ServiceBus.message spans (and injected traceparent) to the caller’s current OpenTelemetry context, avoiding a new/disconnected trace on initial link establishment.
Changes:
- Refactored the single-message send path to start producer/message and send instrumentation at subscription time (before the first AMQP thread hop), and to avoid double-instrumentation.
- Added a live tracing regression test ensuring the first `sendMessage()` inherits the caller’s trace id and injects the expected `traceparent`.
- Documented the fix in the Service Bus changelog.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| sdk/servicebus/azure-messaging-servicebus/src/main/java/com/azure/messaging/servicebus/ServiceBusSenderAsyncClient.java | Adjusts single-message send pipeline so tracing spans are created on the subscribing thread and avoids duplicate ServiceBus.send spans. |
| sdk/servicebus/azure-messaging-servicebus/src/test/java/com/azure/messaging/servicebus/TracingIntegrationTests.java | Adds a regression test validating correct parent trace propagation on the first sendMessage() call and traceparent injection. |
| sdk/servicebus/azure-messaging-servicebus/CHANGELOG.md | Records the tracing context propagation bug fix for sendMessage() first-call behavior. |
    List<ReadableSpan> send = findSpans(spans, "ServiceBus.send");
    assertEquals(expectedTraceId, send.get(0).getSpanContext().getTraceId());
    assertEquals(expectedTraceId, send.get(0).getParentSpanContext().getTraceId());

    List<ReadableSpan> messageSpans = findSpans(spans, "ServiceBus.message");
    assertMessageSpan(messageSpans.get(0), message);
    assertEquals(expectedTraceId, messageSpans.get(0).getSpanContext().getTraceId());
Good suggestion - addressed in 3f5fdd7. The test now asserts exactly one ServiceBus.send span and exactly one ServiceBus.message span for a single sendMessage() call, so an accidental double-instrumentation regression would be caught.
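For readers who want to see what an exact-count check of this kind looks like, here is a minimal, JDK-only sketch. It is not the code from the PR: `RecordedSpan`, `assertSingleSpan`, and `traceIdOf` are hypothetical stand-ins (the real test works with OpenTelemetry `ReadableSpan` objects and JUnit assertions), and the `traceparent` value is the sample from the W3C Trace Context specification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanCountAssertionSketch {
    /** Hypothetical stand-in for a recorded span (the real test uses ReadableSpan). */
    static final class RecordedSpan {
        final String name;
        final String traceId;

        RecordedSpan(String name, String traceId) {
            this.name = name;
            this.traceId = traceId;
        }
    }

    /** Returns every recorded span whose name matches exactly. */
    static List<RecordedSpan> findSpans(List<RecordedSpan> spans, String name) {
        List<RecordedSpan> matches = new ArrayList<>();
        for (RecordedSpan span : spans) {
            if (span.name.equals(name)) {
                matches.add(span);
            }
        }
        return matches;
    }

    /** Extracts the trace id from a version-00 W3C traceparent: version-traceid-parentid-flags. */
    static String traceIdOf(String traceparent) {
        String[] fields = traceparent.split("-");
        if (fields.length != 4 || fields[1].length() != 32) {
            throw new IllegalArgumentException("not a traceparent: " + traceparent);
        }
        return fields[1];
    }

    /** Fails unless exactly one span with the given name exists and it carries the expected trace id. */
    static void assertSingleSpan(List<RecordedSpan> spans, String name, String expectedTraceId) {
        List<RecordedSpan> matches = findSpans(spans, name);
        if (matches.size() != 1) {
            throw new AssertionError("expected 1 " + name + " span but found " + matches.size());
        }
        if (!expectedTraceId.equals(matches.get(0).traceId)) {
            throw new AssertionError(name + " span belongs to trace " + matches.get(0).traceId);
        }
    }

    public static void main(String[] args) {
        String traceparent = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
        String expectedTraceId = traceIdOf(traceparent);
        List<RecordedSpan> spans = Arrays.asList(
            new RecordedSpan("ServiceBus.message", expectedTraceId),
            new RecordedSpan("ServiceBus.send", expectedTraceId));

        assertSingleSpan(spans, "ServiceBus.send", expectedTraceId);
        assertSingleSpan(spans, "ServiceBus.message", expectedTraceId);
        System.out.println("one send span and one message span, both in the caller's trace");
    }
}
```

Checking the count before indexing into the list is what turns a silent duplicate span into a test failure: `get(0)` alone would pass even if two `ServiceBus.send` spans had been recorded.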
…cebus-tracing-44958 # Conflicts: # sdk/servicebus/azure-messaging-servicebus/CHANGELOG.md
Add exact-count assertions (one ServiceBus.send and one ServiceBus.message span per single sendMessage()) so an accidental double-instrumentation regression is caught, per PR review feedback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
The first call to `ServiceBusSenderClient.sendMessage()` (and `ServiceBusSenderAsyncClient.sendMessage()`) did not recognize the caller's current OpenTelemetry trace context. The `ServiceBus.send` span and the outgoing message's `traceparent` started a new, disconnected trace instead of being a child of the caller's active span. Subsequent sends were correct.
Root cause
The send span was started lazily (inside `Mono.defer`) downstream of the first AMQP connection/link establishment. On the first send that work runs on a background AMQP thread, where the caller's thread-local OpenTelemetry context is not available, so the span fell back to `Context.current()` (empty) and began a new trace. Once the link was cached, later sends started the span on the caller thread and parented correctly.
Fix (Service Bus only)
The single-message send path (`sendFluxInternal`) now starts the producer message span and the `ServiceBus.send` span on the subscribing (caller) thread, before the connection thread hop, mirroring the structure already used by the batch send path and by Event Hubs. A non-instrumenting overload of `sendBatchInternal` (instrument=false) is used by this path to avoid a duplicate span.
- No `azure-core` / `azure-core-tracing-opentelemetry` changes.
- … (`sendMessages`) are unchanged.
Validation
- Added `TracingIntegrationTests.sendMessageHasParentSpanOnFirstCall`, which asserts that the first `sendMessage()` inherits the caller's trace id (span + injected `traceparent`).
Fixes #44958
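The thread-hop behaviour behind this bug can be reproduced without any SDK. The sketch below is an illustration under stated assumptions, not the Service Bus implementation: `CURRENT_TRACE` is a plain `ThreadLocal` standing in for the OpenTelemetry context, `startSpan` is a hypothetical helper that only records which trace it saw, and a single-thread executor plays the role of the AMQP connection thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadHopContextSketch {
    // Stand-in for the caller's thread-local trace context (hypothetical, not the OpenTelemetry API).
    static final ThreadLocal<String> CURRENT_TRACE = ThreadLocal.withInitial(() -> "none");

    // Stand-in for starting a span: it captures whatever trace is current on the calling thread.
    static String startSpan(String name) {
        return name + " parent=" + CURRENT_TRACE.get();
    }

    // Lazy variant: the span is started only after hopping to the connection thread,
    // so it sees that thread's empty context.
    static CompletableFuture<String> sendLazy(ExecutorService connectionThread) {
        return CompletableFuture.supplyAsync(() -> startSpan("ServiceBus.send"), connectionThread);
    }

    // Eager variant: the span is started on the caller thread first, then the work hops threads.
    static CompletableFuture<String> sendEager(ExecutorService connectionThread) {
        String span = startSpan("ServiceBus.send");
        return CompletableFuture.supplyAsync(() -> span, connectionThread);
    }

    public static void main(String[] args) {
        ExecutorService connectionThread = Executors.newSingleThreadExecutor();
        try {
            CURRENT_TRACE.set("trace-abc"); // the caller has an active trace
            System.out.println(sendLazy(connectionThread).join());  // ServiceBus.send parent=none
            System.out.println(sendEager(connectionThread).join()); // ServiceBus.send parent=trace-abc
        } finally {
            connectionThread.shutdown();
        }
    }
}
```

The only difference between the two variants is where `startSpan` runs relative to the thread hop, which is the same ordering change the fix applies to the single-message send path.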