Skip to content

fix: Fixes for OTLP tracing#2268

Open
kensimon wants to merge 2 commits into
NVIDIA:mainfrom
kensimon:tracing-fixes
Open

fix: Fixes for OTLP tracing#2268
kensimon wants to merge 2 commits into
NVIDIA:mainfrom
kensimon:tracing-fixes

Conversation

@kensimon
Copy link
Copy Markdown
Contributor

@kensimon kensimon commented Jun 5, 2026

Description

Recent updates to the opentelemetry crates have broken OTLP tracing, such that all spans are being filtered out. This is because opentelemetry no longer emits code.namespace metadata for spans, meaning CarbideSpanSampler would reject all traces.

Fix this by explicitly setting an attribute carbide.trace_root = true for toplevel spans we want to emit, and have CarbideSpanSampler check for that attribute instead of the code module, which is error-prone, especially when we do any refactoring. Then we rely on ParentBased sampling to include all child spans of that root.

Also, fix an issue where we'd begin a span from within StateController::Builder once at startup and never close it, leading to memory leaks as spans never got closed. Move the span setup to StateProcessor::run_single_iteration instead, so that the span closes at a natural time. Spans all seem to be properly closing themselves now.

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

#2267

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

Recent updates to the opentelemetry crates have broken OTLP tracing,
such that all spans are being filtered out. This is because
opentelemetry no longer emits `code.namespace` metadata for spans,
meaning CarbideSpanSampler would reject all traces.

Fix this by explicitly setting an attribute `carbide.trace_root = true`
for toplevel spans we want to emit, and have CarbideSpanSampler check
for that attribute instead of the code module, which is error-prone,
especially when we do any refactoring. Then we rely on ParentBased
sampling to include all child spans of that root.

Also, fix an issue where we'd begin a span from within
StateController::Builder once at startup and never close it, leading to
memory leaks as spans never got closed. Move the span setup to
StateProcessor::run_single_iteration instead, so that the span closes at
a natural time. Spans all seem to be properly closing themselves now.
@kensimon kensimon requested a review from a team as a code owner June 5, 2026 21:17
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 5, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f2c1c283-3f1e-4ccb-bf79-eca7b6303479

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

physical_slot_number: None,
revision_id: None,
topology_id: None,
remediation_error: None,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, this was an error on the current main branch, had to include this so it all compiles. (Other PR's are probably having the same issues.)

@kensimon kensimon requested a review from Matthias247 June 5, 2026 21:45
@kensimon
Copy link
Copy Markdown
Contributor Author

kensimon commented Jun 5, 2026

@Matthias247 to have a look at the changes to where the root controller iteration span is emitted.

(We probably need to overhaul the spans so that the individual object processors do their own spans, since a lot of refactoring has happened since these were set up.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant