Skip to content

fix(remote-config): poll for new config after unblocking SIGVTALRM to fix flaky valgrind test#3717

Open
Leiyks wants to merge 1 commit intomasterfrom
leiyks/fix-debugger_enable_dynamic_config_test
Open

fix(remote-config): poll for new config after unblocking SIGVTALRM to fix flaky valgrind test#3717
Leiyks wants to merge 1 commit intomasterfrom
leiyks/fix-debugger_enable_dynamic_config_test

Conversation

@Leiyks
Copy link
Contributor

@Leiyks Leiyks commented Mar 20, 2026

Summary

debugger_enable_dynamic_config.phpt fails intermittently in the valgrind CI pass: foo() returns the script filename instead of "dd.dynamic.span", meaning the span probe was not active when foo() was called.

Root cause: await_probe_installation() relies on ddog_process_remote_configs() being triggered via SIGVTALRM. The dd_handle_signal wrapper blocks/unblocks SIGVTALRM around calls like usleep(), relying on Linux's immediate pending-signal delivery to fire the config check. Under valgrind, SIGVTALRM delivery is unreliable (valgrind uses this signal internally), so the probe occasionally isn't installed within the 15s timeout.

Fix: Remove the #ifndef __linux__ guard in ext/handlers_signal.c so ddtrace_check_for_new_config_now() is always called after sigprocmask(SIG_UNBLOCK). Each usleep() in await_probe_installation() now actively polls for pending config instead of depending on signal delivery. In the normal case, the !reread_remote_configuration guard inside that function prevents double-processing.

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented Mar 20, 2026

⚠️ Tests

Fix all issues with BitsAI or with Cursor

⚠️ Warnings

🧪 13 Tests failed

testSearchPhpBinaries from integration.DDTrace\Tests\Integration\PHPInstallerTest (Datadog) (Fix with Cursor)
Risky Test
phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:52
testSimplePushAndProcess from laravel-58-test.DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest (Datadog) (Fix with Cursor)
DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69bd79050000000047a1e054712e07f6
tid: 69bd790500000000
hexProcessTraceId: 47a1e054712e07f6
hexProcessSpanId: 87185067766dbbf3
processTraceId: 5161653301224015862
processSpanId: 9734618999860083699

phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:106
testSimplePushAndProcess from laravel-8x-test.DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest (Datadog) (Fix with Cursor)
DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69bd792300000000fbf20c2b44a45359
tid: 69bd792300000000
hexProcessTraceId: fbf20c2b44a45359
hexProcessSpanId: 79b0fe1edf140a59
processTraceId: 18154586427858637657
processSpanId: 8768787883035462233
View all

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: ef083b8 | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback!

@codecov-commenter
Copy link

codecov-commenter commented Mar 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.73%. Comparing base (17c83a1) to head (ef083b8).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3717      +/-   ##
==========================================
- Coverage   68.74%   68.73%   -0.02%     
==========================================
  Files         166      166              
  Lines       19030    19030              
  Branches     1797     1797              
==========================================
- Hits        13082    13080       -2     
- Misses       5132     5135       +3     
+ Partials      816      815       -1     
Flag Coverage Δ
helper-rust-integration 78.82% <ø> (ø)
helper-rust-unit 49.36% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.
see 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 17c83a1...ef083b8. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@bwoebi
Copy link
Collaborator

bwoebi commented Mar 20, 2026

What about

#ifdef __linux__
#ifdef HAVE_VALGRIND
# include <valgrind/valgrind.h>
#else
#define RUNNING_ON_VALGRIND 0
#endif
#define REMOTE_CONFIG_NEEDS_POLLING_AFTER_SIGNAL RUNNING_ON_VALGRIND
#else
#define REMOTE_CONFIG_NEEDS_POLLING_AFTER_SIGNAL 1
#endif

and you can put if (REMOTE_CONFIG_NEEDS_POLLING_AFTER_SIGNAL)

…der valgrind

On Linux, unblocking SIGVTALRM causes immediate delivery if it was pending,
which triggers remote config processing via dd_vm_interrupt. However, valgrind
intercepts SIGVTALRM for its own timing, making signal delivery unreliable and
causing the live debugger probe installation to time out in CI valgrind passes.

Introduce REMOTE_CONFIG_NEEDS_POLLING_AFTER_SIGNAL to explicitly call
ddtrace_check_for_new_config_now() after unblocking only when needed:
- Linux + valgrind: RUNNING_ON_VALGRIND (runtime check, zero overhead otherwise)
- Linux without valgrind headers: always 0 (no overhead)
- Non-Linux: always 1 (signal delivery not guaranteed)

Also add AC_CHECK_HEADERS([valgrind/valgrind.h]) to config.m4 to detect
valgrind headers at build time.
@Leiyks Leiyks force-pushed the leiyks/fix-debugger_enable_dynamic_config_test branch from cca1f64 to ef083b8 Compare March 20, 2026 16:33
@pr-commenter
Copy link

pr-commenter bot commented Mar 20, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-03-20 17:50:50

Comparing candidate commit ef083b8 in PR branch leiyks/fix-debugger_enable_dynamic_config_test with baseline commit 17c83a1 in branch master.

Found 1 performance improvements and 3 performance regressions! Performance is the same for 188 metrics, 2 unstable metrics.

scenario:ComposerTelemetryBench/benchTelemetryParsing

  • 🟩 execution_time [-1288.901ns; -311.099ns] or [-10.923%; -2.636%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization

  • 🟥 execution_time [+2.616µs; +3.644µs] or [+2.546%; +3.546%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

  • 🟥 execution_time [+5.236µs; +7.024µs] or [+5.104%; +6.847%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3

  • 🟥 execution_time [+31.031ns; +120.169ns] or [+2.156%; +8.350%]

@Leiyks Leiyks marked this pull request as ready for review March 20, 2026 16:34
@Leiyks Leiyks requested a review from a team as a code owner March 20, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants