Skip to content

Add _dd.p.ksr propagated tag for Knuth sampling rate#3701

Merged
bm1549 merged 7 commits intomasterfrom
brian.marks/add-ksr-tag
Mar 17, 2026
Merged

Add _dd.p.ksr propagated tag for Knuth sampling rate#3701
bm1549 merged 7 commits intomasterfrom
brian.marks/add-ksr-tag

Conversation

@bm1549
Copy link
Contributor

@bm1549 bm1549 commented Mar 11, 2026

Description

Adds _dd.p.ksr (Knuth Sampling Rate) as a propagated tag set when agent-based or rule-based sampling decisions are made. The tag is stored in span meta (string type) with up to 6 significant digits and no trailing zeros.

To enable consistent sampling across tracers and backend retention filters, the backend needs to know the sampling rate applied by the tracer. Without transmitting the tracer's rate via _dd.p.ksr, backend resampling cannot correctly compute effective rates in multi-stage sampling scenarios.

See RFC: "Transmit Knuth sampling rate to backend"

Key files changed:

  • ext/priority_sampling/priority_sampling.c — Added dd_update_knuth_sampling_rate_tag() function; deletes _dd.p.ksr in all code paths that don't set it (manual, propagated-mode early return, default mechanism)
  • 4 phpt test files for rule sampling, default sampling, manual exclusion, and formatting

Related PRs across tracers:

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@datadog-official
Copy link

datadog-official bot commented Mar 11, 2026

⚠️ Tests

Fix all issues with BitsAI or with Cursor

⚠️ Warnings

❄️ 8 New flaky tests detected

tmp/build_extension/tests/ext/priority_sampling/025-ksr-tag-rule-sampling.phpt (_dd.p.ksr propagated tag is set for rule-based sampling) from PHP.tmp.build_extension.tests.ext.priority_sampling (Datadog) (Fix with Cursor)
002+ 
004+ 
006+ NULL
007+ _dd.p.ksr = -
001- Rule OK
002- _dd.p.ksr = 0.3
tmp/build_extension/tests/ext/priority_sampling/026-ksr-tag-default-sampling.phpt (_dd.p.ksr propagated tag is NOT set for default sampling (only for explicit agent rates)) from PHP.tmp.build_extension.tests.ext.priority_sampling (Datadog) (Fix with Cursor)
002+ 
004+ Agent PSR missing
001- Agent PSR OK
003- _dd.p.dm = -0
006+ 
008+ 
010+ _dd.p.dm =
tmp/build_extension/tests/ext/priority_sampling/027-ksr-tag-not-set-manual.phpt (_dd.p.ksr propagated tag is NOT set for manual sampling) from PHP.tmp.build_extension.tests.ext.priority_sampling (Datadog) (Fix with Cursor)
003- _dd.p.dm = -4
004+ 
006+ _dd.p.dm =
View all

🧪 13 Tests failed

testSearchPhpBinaries from integration.DDTrace\Tests\Integration\PHPInstallerTest (Datadog) (Fix with Cursor)
Risky Test
phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:52
testSimplePushAndProcess from laravel-58-test.DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest (Datadog) (Fix with Cursor)
DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69b8acaa000000004cc97e44a0aab238
tid: 69b8acaa00000000
hexProcessTraceId: 4cc97e44a0aab238
hexProcessSpanId: 8722a7fac3959b7d
processTraceId: 5533092450419782200
processSpanId: 9737530039793458045

phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:106
testSimplePushAndProcess from laravel-8x-test.DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest (Datadog) (Fix with Cursor)
DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69b8ae0100000000471e325712764674
tid: 69b8ae0100000000
hexProcessTraceId: 471e325712764674
hexProcessSpanId: 9e5fc536b42de82f
processTraceId: 5124588775547487860
processSpanId: 11412056819521939503

phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:106
View all
This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 96e7dc0 | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback!

The default sampling mechanism (DD_MECHANISM_DEFAULT) is not an agent
rate. Per the RFC, _dd.p.ksr should only be set for "Agent Sampling
rate or Trace Sampling rules". This prevents _dd.p.ksr from appearing
on every span when no agent rates are configured.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov-commenter
Copy link

codecov-commenter commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.35%. Comparing base (42d0acc) to head (96e7dc0).
⚠️ Report is 13 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3701      +/-   ##
==========================================
- Coverage   62.40%   62.35%   -0.06%     
==========================================
  Files         142      142              
  Lines       13586    13586              
  Branches     1775     1775              
==========================================
- Hits         8479     8471       -8     
- Misses       4301     4308       +7     
- Partials      806      807       +1     

see 3 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 42d0acc...96e7dc0. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

bm1549 and others added 2 commits March 10, 2026 22:35
The inferred proxy sampling rules test uses explicit DD_TRACE_SAMPLING_RULES
which triggers rule-based sampling. The ksr tag now appears in span meta.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two fixes for remaining CI failures after the initial KSR tag implementation:

1. serializer.c: Copy _dd.p.ksr from the root span to the inferred proxy
   span during serialization (using false to keep it on the root span too),
   matching the behavior expected by sampling_rules.phpt.

2. SpanChecker.php: Auto-ignore _dd.p.ksr in withExactTags() assertions
   unless the test explicitly tests for it, consistent with how _dd.p.dm
   and _dd.p.tid are handled. This fixes ResponseStatusCodeTest and other
   integration tests that don't test KSR behavior directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pr-commenter
Copy link

pr-commenter bot commented Mar 11, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-03-17 02:32:16

Comparing candidate commit 96e7dc0 in PR branch brian.marks/add-ksr-tag with baseline commit 42d0acc in branch master.

Found 0 performance improvements and 23 performance regressions! Performance is the same for 170 metrics, 1 unstable metrics.

scenario:ComposerTelemetryBench/benchTelemetryParsing-opcache

  • 🟥 execution_time [+297.667ns; +1502.333ns] or [+2.401%; +12.116%]

scenario:ContextPropagationBench/benchInject64Bit

  • 🟥 execution_time [+249.030ns; +334.970ns] or [+12.870%; +17.311%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

  • 🟥 execution_time [+2.945µs; +4.095µs] or [+2.799%; +3.893%]

scenario:SamplingRuleMatchingBench/benchGlobMatching1

  • 🟥 execution_time [+133.293ns; +245.107ns] or [+6.027%; +11.082%]

scenario:SamplingRuleMatchingBench/benchGlobMatching1-opcache

  • 🟥 execution_time [+133.148ns; +266.052ns] or [+5.709%; +11.407%]

scenario:SamplingRuleMatchingBench/benchGlobMatching2

  • 🟥 execution_time [+172.123ns; +302.677ns] or [+7.524%; +13.230%]

scenario:SamplingRuleMatchingBench/benchGlobMatching2-opcache

  • 🟥 execution_time [+179.902ns; +279.498ns] or [+7.524%; +11.690%]

scenario:SamplingRuleMatchingBench/benchGlobMatching3

  • 🟥 execution_time [+186.449ns; +287.751ns] or [+7.961%; +12.287%]

scenario:SamplingRuleMatchingBench/benchGlobMatching3-opcache

  • 🟥 execution_time [+171.258ns; +281.942ns] or [+6.940%; +11.425%]

scenario:SamplingRuleMatchingBench/benchGlobMatching4

  • 🟥 execution_time [+164.433ns; +295.567ns] or [+7.009%; +12.599%]

scenario:SamplingRuleMatchingBench/benchGlobMatching4-opcache

  • 🟥 execution_time [+189.305ns; +273.295ns] or [+7.662%; +11.061%]

scenario:SamplingRuleMatchingBench/benchRegexMatching1

  • 🟥 execution_time [+178.741ns; +236.259ns] or [+14.245%; +18.828%]

scenario:SamplingRuleMatchingBench/benchRegexMatching1-opcache

  • 🟥 execution_time [+164.432ns; +251.368ns] or [+11.879%; +18.160%]

scenario:SamplingRuleMatchingBench/benchRegexMatching2

  • 🟥 execution_time [+162.462ns; +216.938ns] or [+12.942%; +17.282%]

scenario:SamplingRuleMatchingBench/benchRegexMatching2-opcache

  • 🟥 execution_time [+167.580ns; +231.620ns] or [+12.094%; +16.715%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3

  • 🟥 execution_time [+167.398ns; +222.202ns] or [+13.183%; +17.499%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3-opcache

  • 🟥 execution_time [+159.645ns; +227.155ns] or [+11.334%; +16.126%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4

  • 🟥 execution_time [+166.791ns; +215.809ns] or [+13.289%; +17.195%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4-opcache

  • 🟥 execution_time [+146.550ns; +237.050ns] or [+10.471%; +16.937%]

scenario:SpanBench/benchOpenTelemetryAPI

  • 🟥 mem_peak [+4.495MB; +4.495MB] or [+10.399%; +10.399%]

scenario:SpanBench/benchOpenTelemetryAPI-opcache

  • 🟥 mem_peak [+4.493MB; +4.493MB] or [+11.174%; +11.174%]

scenario:SpanBench/benchOpenTelemetryInteroperability

  • 🟥 execution_time [+7.052µs; +9.410µs] or [+3.887%; +5.186%]

scenario:SpanBench/benchOpenTelemetryInteroperability-opcache

  • 🟥 execution_time [+4.711µs; +7.683µs] or [+2.799%; +4.565%]

In the hot sampling path, avoid re-allocating strings and updating
hash tables when _dd.p.ksr already contains the same value. This
optimization recovers most of the performance regression seen in
SamplingRuleMatchingBench when sampling is re-evaluated with the
same rate (the common case for long-running spans).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
zend_hash_str_del(metrics, ZEND_STRL("_dd.rule_psr"));
} else {
zend_hash_str_update(metrics, ZEND_STRL("_dd.rule_psr"), &sample_rate_zv);
dd_update_knuth_sampling_rate_tag(span, sample_rate);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that the _dd.p.ksr is not deleted where the psr is deleted?

#include <json/json.h>

#include "../configuration.h"
#include "../tracer_tag_propagation/tracer_tag_propagation.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant header inclusion.

bm1549 and others added 2 commits March 13, 2026 15:03
- Remove redundant tracer_tag_propagation.h include (already available via span.h)
- Delete _dd.p.ksr from meta in manual mechanism path to prevent stale tags
- Add static to KNUTH_FACTOR and MAX_TRACE_ID constants (file-local only)
- Use --EXPECT-- instead of --EXPECTREGEX-- in test 028 (deterministic output)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Delete the ksr tag in the propagated-mode early return and the
DD_MECHANISM_DEFAULT branch, not just DD_MECHANISM_MANUAL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bm1549 bm1549 marked this pull request as ready for review March 17, 2026 01:20
@bm1549 bm1549 requested a review from a team as a code owner March 17, 2026 01:20
@bm1549 bm1549 requested a review from bwoebi March 17, 2026 01:20
Copy link
Collaborator

@bwoebi bwoebi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now, Brian, thanks!

@bm1549 bm1549 merged commit 8d49177 into master Mar 17, 2026
2060 of 2062 checks passed
@bm1549 bm1549 deleted the brian.marks/add-ksr-tag branch March 17, 2026 13:43
@github-actions github-actions bot added this to the 1.17.0 milestone Mar 17, 2026
bm1549 added a commit to DataDog/dd-trace-rs that referenced this pull request Mar 19, 2026
# What does this PR do?

Adds `_dd.p.ksr` (Knuth Sampling Rate) as a propagated tag set when
agent-based or rule-based sampling decisions are made. The tag is stored
in span `meta` (string type) with up to 6 significant digits and no
trailing zeros.

`format_sampling_rate` now returns `Option<String>` and guards against
invalid inputs (negative, >1.0, NaN, infinity), returning `None` instead
of producing garbage output.

# Motivation

To enable consistent sampling across tracers and backend retention
filters, the backend needs to know the sampling rate applied by the
tracer. Without transmitting the tracer's rate via `_dd.p.ksr`, backend
resampling cannot correctly compute effective rates in multi-stage
sampling scenarios.

See RFC: "Transmit Knuth sampling rate to backend"

# Additional Notes

Key files changed:
- `datadog-opentelemetry/src/core/constants.rs` — Added
`SAMPLING_KNUTH_RATE_TAG_KEY` constant
- `datadog-opentelemetry/src/sampling/datadog_sampler.rs` — Added
`format_sampling_rate()` helper (returns `Option<String>`, defensive
against invalid rates) and set ksr in agent/rule sampling paths
- Updated 2 snapshot JSON files

Related PRs across tracers:
- Java: DataDog/dd-trace-java#10802
- .NET: DataDog/dd-trace-dotnet#8287
- Ruby: DataDog/dd-trace-rb#5436
- Node.js: DataDog/dd-trace-js#7741
- PHP: DataDog/dd-trace-php#3701
- C++: DataDog/dd-trace-cpp#288
- System tests: DataDog/system-tests#6466

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
bm1549 added a commit to DataDog/dd-trace-go that referenced this pull request Mar 19, 2026
### What does this PR do?

Fixes `_dd.p.ksr` (Knuth Sampling Rate) to only be set on spans when the
agent has provided sampling rates via `readRatesJSON()`. Previously, ksr
was unconditionally set in `prioritySampler.apply()`, including when the
rate was the initial client-side default (1.0) before any agent response
arrived.

Also refactors `prioritySampler` to consolidate lock acquisitions:
extracts `getRateLocked()` so `apply()` acquires `ps.mu.RLock` only once
to read both the rate and `agentRatesLoaded`.

### Motivation

Cross-language consistency: Python, Java, PHP, and other tracers only
set ksr when actual agent rates or sampling rules are applied, not for
the default fallback. This aligns Go with that behavior.

See RFC: "Transmit Knuth sampling rate to backend"

### Additional Notes

- Added `agentRatesLoaded` bool field to `prioritySampler`, set to
`true` in `readRatesJSON()`
- `apply()` now gates ksr behind `agentRatesLoaded` check
- Extracted `getRateLocked()` to avoid double lock acquisition in
`apply()`
- Rule-based sampling path (`applyTraceRuleSampling` in span.go)
unchanged — correctly always sets ksr
- Tests added: `ksr-not-set-without-agent-rates` and
`ksr-set-after-agent-rates-received`

Related PRs across tracers:
- Java: DataDog/dd-trace-java#10802
- .NET: DataDog/dd-trace-dotnet#8287
- Ruby: DataDog/dd-trace-rb#5436
- Node.js: DataDog/dd-trace-js#7741
- PHP: DataDog/dd-trace-php#3701
- Rust: DataDog/dd-trace-rs#180
- C++: DataDog/dd-trace-cpp#288
- System tests: DataDog/system-tests#6466

### Reviewer's Checklist

- [x] Changed code has unit tests for its functionality at or near 100%
coverage.
- [x] [System-Tests](https://github.com/DataDog/system-tests/) covering
this feature have been added and enabled with the va.b.c-dev version
tag.
- [ ] There is a benchmark for any new code, or changes to existing
code.
- [x] If this interacts with the agent in a new way, a system test has
been added.
- [x] New code is free of linting errors. You can check this by running
`make lint` locally.
- [x] New code doesn't break existing tests. You can check this by
running `make test` locally.
- [ ] Add an appropriate team label so this PR gets put in the right
place for the release notes.
- [ ] All generated files are up to date. You can check this by running
`make generate` locally.
- [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed
by @DataDog/dd-trace-go-guild. Make sure all nested modules are up to
date by running `make fix-modules` locally.

Unsure? Have a question? Request a review!

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Dario Castañé <dario.castane@datadoghq.com>
Co-authored-by: Mikayla Toffler <46911781+mtoffl01@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants