Skip to content

fix(ci): patch run-tests.php getTimer() to return float instead of comma-formatted string#3856

Open
Leiyks wants to merge 5 commits into
masterfrom
leiyks/fix-ci-run-tests-number-format
Open

fix(ci): patch run-tests.php getTimer() to return float instead of comma-formatted string#3856
Leiyks wants to merge 5 commits into
masterfrom
leiyks/fix-ci-run-tests-number-format

Conversation

@Leiyks
Copy link
Copy Markdown
Contributor

@Leiyks Leiyks commented May 5, 2026

What

  • Add a second sed patch to $(BUILD_DIR)/run-tests.php at build time, replacing number_format( with round( inside getTimer()

Why

PHP's run-tests.php getTimer() returns number_format($elapsed, 4), which produces a comma-formatted string (e.g. "1,500.0000") when a test takes ≥1000 seconds. PHP 8.0+ raises E_WARNING: A non-numeric value encountered when that string is used in += arithmetic inside record(). The main test runner process treats this as a fatal worker error and exits with code 2.

This started surfacing ~1 week ago because commit fdbad29 moved DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=0 from ALL_TEST_ENV_OVERRIDE (covers all test targets, including valgrind) into ENV_OVERRIDE (covers only the non-valgrind run). The valgrind run now executes with process tags enabled, making live-debugger and sidecar-related tests 5–20× slower under valgrind's instrumentation, pushing some past the 1000s threshold.

How

We already patch run-tests.php at build time for an unrelated dl() issue. This adds a second patch in the same $(BUILD_DIR)/run-tests.php Makefile target:

sed -i 's/return number_format($$this->rootSuite/return round($$this->rootSuite/' $(BUILD_DIR)/run-tests.php

round($elapsed, 4) returns a float (e.g. 1999.5678) instead of a comma-formatted string, which is safe for += arithmetic. The JUnit XML time='...' attribute also benefits — 1999.5678 is more spec-compliant than 1,999.5678.

Verified the sed pattern matches both PHP 8.4.19 (line 3441) and PHP 8.5.0 (line 3433).

Testing

  • CI: test_extension_ci: [8.4] and test_extension_ci: [8.5] valgrind runs no longer abort with E_WARNING: A non-numeric value encountered in run-tests.php

@datadog-datadog-prod-us1-2
Copy link
Copy Markdown

datadog-datadog-prod-us1-2 Bot commented May 5, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 60.67% (+0.00%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: e63b1f6 | Docs | Datadog PR Page | Give us feedback!

@Leiyks Leiyks force-pushed the leiyks/fix-ci-run-tests-number-format branch 3 times, most recently from a52010c to fe39d11 Compare May 7, 2026 13:55
@Leiyks Leiyks marked this pull request as ready for review May 7, 2026 15:01
@Leiyks Leiyks requested a review from a team as a code owner May 7, 2026 15:01
@pr-commenter
Copy link
Copy Markdown

pr-commenter Bot commented May 7, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-05-07 15:15:27

Comparing candidate commit fe39d11 in PR branch leiyks/fix-ci-run-tests-number-format with baseline commit 8ff00ed in branch master.

Found 3 performance improvements and 5 performance regressions! Performance is the same for 185 metrics, 1 unstable metrics.

scenario:BM_TeaSapiSpindown

  • 🟩 execution_time [-38.108µs; -17.154µs] or [-6.830%; -3.075%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization

  • 🟩 execution_time [-4.141µs; -3.659µs] or [-3.619%; -3.197%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

  • 🟩 execution_time [-5.645µs; -3.575µs] or [-5.150%; -3.261%]

scenario:SamplingRuleMatchingBench/benchRegexMatching1

  • 🟥 execution_time [+36.805ns; +139.395ns] or [+2.493%; +9.440%]

scenario:SamplingRuleMatchingBench/benchRegexMatching2

  • 🟥 execution_time [+45.253ns; +138.147ns] or [+3.056%; +9.329%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3

  • 🟥 execution_time [+69.877ns; +148.123ns] or [+4.804%; +10.183%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4

  • 🟥 execution_time [+30.171ns; +105.429ns] or [+2.025%; +7.074%]

scenario:SpanBench/benchOpenTelemetryInteroperability

  • 🟥 execution_time [+18.966µs; +22.472µs] or [+5.456%; +6.464%]

@Leiyks Leiyks force-pushed the leiyks/fix-ci-run-tests-number-format branch 2 times, most recently from 98fc510 to cb5ca1a Compare May 11, 2026 11:31
Leiyks added 3 commits May 11, 2026 17:22
…= 7.2)

Lists tests slower than 10s with their durations at the end of the run.
The --show-slow flag only exists in PHP 7.2+, so it's gated by a runtime
PHP_VERSION_ID check to avoid breaking PHP 7.0/7.1 jobs.

Used to verify the hypothesis that recent CI failures are caused by tests
exceeding 1000s under valgrind (triggering the number_format E_WARNING
in run-tests.php's record() method).
…lgrind tests

PHP's run-tests.php getTimer() returns number_format($elapsed, 4) which
produces a comma-formatted string (e.g. "1,500.0000") when a test takes
>=1000 seconds. PHP 8.0+ raises E_WARNING when that string is used in +=
arithmetic inside record(), causing the test runner to abort with exit code 2.

Patch getTimer() at build time so it returns a float (via round()) instead.
The CI .base_test before_script runs wait-for-service-ready.sh, but
WAIT_FOR only listed test-agent — request-replayer was never gated on,
even though it is in the agent_httpbin_service() service block used by
test_extension_ci, ASAN test_c, and ASAN test_c with multiple observers.

Add a request-replayer case to detect_service_type (probing /replay, a
read-only endpoint that returns valid JSON when php -S is up), and add
request-replayer:80 to WAIT_FOR for the three affected job blocks.

PHP Language Tests is unchanged: it only declares test-agent in services.
@Leiyks Leiyks force-pushed the leiyks/fix-ci-run-tests-number-format branch from 20b64ea to b356d78 Compare May 11, 2026 15:28
Leiyks added 2 commits May 11, 2026 17:32
curl -sf fails on 4xx, which would falsely report not-ready if /replay
returns 4xx when no dump exists yet (the natural startup state). Drop
-f so any HTTP response proves php -S is up and executing index.php;
connection failure remains the only signal of an unhealthy service.
… to 30s

Two related changes to characterize the remaining test_extension_ci
flake (per-test 300s timeouts on request-replayer-heavy tests):

1. Override DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=0 only for
   the valgrind invocation. fdbad29 removed this override from
   ALL_TEST_ENV_OVERRIDE, so the valgrind run inherited the default
   (enabled) for the first time. If timeouts disappear, the env var
   change is the proximate cause. If they persist, the cause is in
   code paths active regardless of the flag.

2. Bump --show-slow from 10s to 30s so the SLOW TEST SUMMARY stays
   focused on tests that actually approach the 300s per-test timeout,
   trimming noise from the long tail of normally-slow tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant