Fix flaky KafkaClientDataStreamsDisabledForkedTest batch consume test #10797
Kafka / producer-benchmark
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.
Benchmarks
Startup
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 64 metrics, 7 unstable metrics.
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.063 s) : 0, 1062643
Total [baseline] (8.883 s) : 0, 8882623
Agent [candidate] (1.066 s) : 0, 1066388
Total [candidate] (8.897 s) : 0, 8896926
section iast
Agent [baseline] (1.227 s) : 0, 1226852
Total [baseline] (9.571 s) : 0, 9570764
Agent [candidate] (1.226 s) : 0, 1225992
Total [candidate] (9.6 s) : 0, 9599541
gantt
title insecure-bank - break down per module: candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.208 ms) : 0, 1208
crashtracking [candidate] (1.207 ms) : 0, 1207
BytebuddyAgent [baseline] (632.462 ms) : 0, 632462
BytebuddyAgent [candidate] (633.048 ms) : 0, 633048
AgentMeter [baseline] (29.393 ms) : 0, 29393
AgentMeter [candidate] (29.478 ms) : 0, 29478
GlobalTracer [baseline] (258.043 ms) : 0, 258043
GlobalTracer [candidate] (259.457 ms) : 0, 259457
AppSec [baseline] (31.538 ms) : 0, 31538
AppSec [candidate] (31.852 ms) : 0, 31852
Debugger [baseline] (59.444 ms) : 0, 59444
Debugger [candidate] (59.96 ms) : 0, 59960
Remote Config [baseline] (581.363 µs) : 0, 581
Remote Config [candidate] (584.55 µs) : 0, 585
Telemetry [baseline] (8.034 ms) : 0, 8034
Telemetry [candidate] (8.738 ms) : 0, 8738
Flare Poller [baseline] (5.76 ms) : 0, 5760
Flare Poller [candidate] (5.797 ms) : 0, 5797
section iast
crashtracking [baseline] (1.216 ms) : 0, 1216
crashtracking [candidate] (1.186 ms) : 0, 1186
BytebuddyAgent [baseline] (795.929 ms) : 0, 795929
BytebuddyAgent [candidate] (795.283 ms) : 0, 795283
AgentMeter [baseline] (11.341 ms) : 0, 11341
AgentMeter [candidate] (11.274 ms) : 0, 11274
GlobalTracer [baseline] (247.486 ms) : 0, 247486
GlobalTracer [candidate] (247.231 ms) : 0, 247231
AppSec [baseline] (26.488 ms) : 0, 26488
AppSec [candidate] (26.453 ms) : 0, 26453
Debugger [baseline] (69.136 ms) : 0, 69136
Debugger [candidate] (70.052 ms) : 0, 70052
Remote Config [baseline] (528.454 µs) : 0, 528
Remote Config [candidate] (533.945 µs) : 0, 534
Telemetry [baseline] (9.722 ms) : 0, 9722
Telemetry [candidate] (9.212 ms) : 0, 9212
Flare Poller [baseline] (3.494 ms) : 0, 3494
Flare Poller [candidate] (3.414 ms) : 0, 3414
IAST [baseline] (25.316 ms) : 0, 25316
IAST [candidate] (25.299 ms) : 0, 25299
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.062 s) : 0, 1061900
Total [baseline] (11.125 s) : 0, 11125064
Agent [candidate] (1.061 s) : 0, 1061302
Total [candidate] (11.056 s) : 0, 11055821
section appsec
Agent [baseline] (1.249 s) : 0, 1248696
Total [baseline] (11.117 s) : 0, 11116621
Agent [candidate] (1.246 s) : 0, 1246224
Total [candidate] (11.083 s) : 0, 11082961
section iast
Agent [baseline] (1.239 s) : 0, 1239293
Total [baseline] (11.403 s) : 0, 11403024
Agent [candidate] (1.241 s) : 0, 1241336
Total [candidate] (11.316 s) : 0, 11315630
section profiling
Agent [baseline] (1.185 s) : 0, 1184835
Total [baseline] (11.071 s) : 0, 11070964
Agent [candidate] (1.184 s) : 0, 1183520
Total [candidate] (11.029 s) : 0, 11029413
gantt
title petclinic - break down per module: candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.186 ms) : 0, 1186
crashtracking [candidate] (1.207 ms) : 0, 1207
BytebuddyAgent [baseline] (630.262 ms) : 0, 630262
BytebuddyAgent [candidate] (630.836 ms) : 0, 630836
AgentMeter [baseline] (29.425 ms) : 0, 29425
AgentMeter [candidate] (29.319 ms) : 0, 29319
GlobalTracer [baseline] (259.05 ms) : 0, 259050
GlobalTracer [candidate] (257.889 ms) : 0, 257889
AppSec [baseline] (31.991 ms) : 0, 31991
AppSec [candidate] (31.788 ms) : 0, 31788
Debugger [baseline] (60.904 ms) : 0, 60904
Debugger [candidate] (60.57 ms) : 0, 60570
Remote Config [baseline] (593.059 µs) : 0, 593
Remote Config [candidate] (589.611 µs) : 0, 590
Telemetry [baseline] (8.068 ms) : 0, 8068
Telemetry [candidate] (8.801 ms) : 0, 8801
Flare Poller [baseline] (4.288 ms) : 0, 4288
Flare Poller [candidate] (4.243 ms) : 0, 4243
section appsec
crashtracking [baseline] (1.19 ms) : 0, 1190
crashtracking [candidate] (1.178 ms) : 0, 1178
BytebuddyAgent [baseline] (658.448 ms) : 0, 658448
BytebuddyAgent [candidate] (657.174 ms) : 0, 657174
AgentMeter [baseline] (12.012 ms) : 0, 12012
AgentMeter [candidate] (12.014 ms) : 0, 12014
GlobalTracer [baseline] (258.78 ms) : 0, 258780
GlobalTracer [candidate] (258.437 ms) : 0, 258437
AppSec [baseline] (178.35 ms) : 0, 178350
AppSec [candidate] (177.933 ms) : 0, 177933
Debugger [baseline] (66.679 ms) : 0, 66679
Debugger [candidate] (66.384 ms) : 0, 66384
Remote Config [baseline] (616.15 µs) : 0, 616
Remote Config [candidate] (619.901 µs) : 0, 620
Telemetry [baseline] (8.417 ms) : 0, 8417
Telemetry [candidate] (8.356 ms) : 0, 8356
Flare Poller [baseline] (3.628 ms) : 0, 3628
Flare Poller [candidate] (3.652 ms) : 0, 3652
IAST [baseline] (24.251 ms) : 0, 24251
IAST [candidate] (24.166 ms) : 0, 24166
section iast
crashtracking [baseline] (1.228 ms) : 0, 1228
crashtracking [candidate] (1.214 ms) : 0, 1214
BytebuddyAgent [baseline] (803.864 ms) : 0, 803864
BytebuddyAgent [candidate] (805.692 ms) : 0, 805692
AgentMeter [baseline] (11.625 ms) : 0, 11625
AgentMeter [candidate] (11.654 ms) : 0, 11654
GlobalTracer [baseline] (249.51 ms) : 0, 249510
GlobalTracer [candidate] (249.998 ms) : 0, 249998
AppSec [baseline] (26.842 ms) : 0, 26842
AppSec [candidate] (27.071 ms) : 0, 27071
Debugger [baseline] (71.106 ms) : 0, 71106
Debugger [candidate] (70.657 ms) : 0, 70657
Remote Config [baseline] (536.84 µs) : 0, 537
Remote Config [candidate] (525.952 µs) : 0, 526
Telemetry [baseline] (9.245 ms) : 0, 9245
Telemetry [candidate] (9.157 ms) : 0, 9157
Flare Poller [baseline] (3.357 ms) : 0, 3357
Flare Poller [candidate] (3.337 ms) : 0, 3337
IAST [baseline] (25.605 ms) : 0, 25605
IAST [candidate] (25.734 ms) : 0, 25734
section profiling
crashtracking [baseline] (1.173 ms) : 0, 1173
crashtracking [candidate] (1.209 ms) : 0, 1209
BytebuddyAgent [baseline] (684.02 ms) : 0, 684020
BytebuddyAgent [candidate] (683.527 ms) : 0, 683527
AgentMeter [baseline] (8.63 ms) : 0, 8630
AgentMeter [candidate] (8.651 ms) : 0, 8651
GlobalTracer [baseline] (215.584 ms) : 0, 215584
GlobalTracer [candidate] (215.643 ms) : 0, 215643
AppSec [baseline] (32.211 ms) : 0, 32211
AppSec [candidate] (32.197 ms) : 0, 32197
Debugger [baseline] (66.071 ms) : 0, 66071
Debugger [candidate] (65.767 ms) : 0, 65767
Remote Config [baseline] (568.535 µs) : 0, 569
Remote Config [candidate] (559.235 µs) : 0, 559
Telemetry [baseline] (7.742 ms) : 0, 7742
Telemetry [candidate] (7.712 ms) : 0, 7712
Flare Poller [baseline] (3.456 ms) : 0, 3456
Flare Poller [candidate] (3.442 ms) : 0, 3442
ProfilingAgent [baseline] (94.565 ms) : 0, 94565
ProfilingAgent [candidate] (94.048 ms) : 0, 94048
Profiling [baseline] (95.137 ms) : 0, 95137
Profiling [candidate] (94.607 ms) : 0, 94607
Load
Summary
Found 1 performance improvements and 1 performance regressions! Performance is the same for 19 metrics, 15 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section baseline
no_agent (18.426 ms) : 18238, 18614
. : milestone, 18426,
appsec (18.433 ms) : 18248, 18618
. : milestone, 18433,
code_origins (17.957 ms) : 17776, 18138
. : milestone, 17957,
iast (18.179 ms) : 17995, 18362
. : milestone, 18179,
profiling (18.413 ms) : 18228, 18598
. : milestone, 18413,
tracing (17.753 ms) : 17574, 17932
. : milestone, 17753,
section candidate
no_agent (17.225 ms) : 17051, 17399
. : milestone, 17225,
appsec (18.339 ms) : 18156, 18522
. : milestone, 18339,
code_origins (18.003 ms) : 17823, 18183
. : milestone, 18003,
iast (17.954 ms) : 17775, 18132
. : milestone, 17954,
profiling (19.464 ms) : 19273, 19656
. : milestone, 19464,
tracing (18.554 ms) : 18364, 18745
. : milestone, 18554,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section baseline
no_agent (1.17 ms) : 1159, 1181
. : milestone, 1170,
iast (3.053 ms) : 3013, 3092
. : milestone, 3053,
iast_FULL (5.921 ms) : 5861, 5981
. : milestone, 5921,
iast_GLOBAL (3.431 ms) : 3379, 3482
. : milestone, 3431,
profiling (2.187 ms) : 2165, 2209
. : milestone, 2187,
tracing (1.821 ms) : 1804, 1837
. : milestone, 1821,
section candidate
no_agent (1.178 ms) : 1167, 1190
. : milestone, 1178,
iast (3.156 ms) : 3109, 3202
. : milestone, 3156,
iast_FULL (5.863 ms) : 5803, 5923
. : milestone, 5863,
iast_GLOBAL (3.529 ms) : 3471, 3587
. : milestone, 3529,
profiling (2.222 ms) : 2200, 2244
. : milestone, 2222,
tracing (1.785 ms) : 1770, 1800
. : milestone, 1785,
Dacapo
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section baseline
no_agent (14.783 s) : 14783000, 14783000
. : milestone, 14783000,
appsec (14.687 s) : 14687000, 14687000
. : milestone, 14687000,
iast (18.642 s) : 18642000, 18642000
. : milestone, 18642000,
iast_GLOBAL (18.301 s) : 18301000, 18301000
. : milestone, 18301000,
profiling (15.015 s) : 15015000, 15015000
. : milestone, 15015000,
tracing (14.886 s) : 14886000, 14886000
. : milestone, 14886000,
section candidate
no_agent (15.496 s) : 15496000, 15496000
. : milestone, 15496000,
appsec (14.724 s) : 14724000, 14724000
. : milestone, 14724000,
iast (18.384 s) : 18384000, 18384000
. : milestone, 18384000,
iast_GLOBAL (18.027 s) : 18027000, 18027000
. : milestone, 18027000,
profiling (15.165 s) : 15165000, 15165000
. : milestone, 15165000,
tracing (14.779 s) : 14779000, 14779000
. : milestone, 14779000,
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~f10097968b, baseline=1.61.0-SNAPSHOT~9352dfa345
dateFormat X
axisFormat %s
section baseline
no_agent (1.47 ms) : 1459, 1481
. : milestone, 1470,
appsec (3.728 ms) : 3511, 3944
. : milestone, 3728,
iast (2.25 ms) : 2181, 2319
. : milestone, 2250,
iast_GLOBAL (2.29 ms) : 2221, 2359
. : milestone, 2290,
profiling (2.504 ms) : 2277, 2731
. : milestone, 2504,
tracing (2.049 ms) : 1996, 2102
. : milestone, 2049,
section candidate
no_agent (1.475 ms) : 1463, 1486
. : milestone, 1475,
appsec (3.814 ms) : 3594, 4034
. : milestone, 3814,
iast (2.249 ms) : 2180, 2317
. : milestone, 2249,
iast_GLOBAL (2.29 ms) : 2221, 2359
. : milestone, 2290,
profiling (2.486 ms) : 2334, 2638
. : milestone, 2486,
tracing (2.06 ms) : 2006, 2113
. : milestone, 2060,
Kafka / consumer-benchmark
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.
79b8bde to 106ceb0
106ceb0 to dbe89c4
The test mapped consumer traces to producer spans by positional index after SORT_TRACES_BY_ID sorting. Since trace IDs are random, the consumer-to-producer mapping was non-deterministic, causing intermittent `span.parentId == parent.spanId` assertion failures. Fix by dynamically finding each consumer span's actual parent producer span via parentId matching instead of relying on sort order.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use SEQUENTIAL id.generation.strategy in the DSM-disabled Kafka test to force a deterministic sort order for SORT_TRACES_BY_ID. Sequential IDs sort traces in creation order, which differs from the reverse mapping the original positional code assumed. This proves the dynamic parent lookup fix handles any trace ordering.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…test
Remove injectSysConfig("id.generation.strategy", "SEQUENTIAL") which did not
actually trigger the flake. Add KafkaClientDsmDisabledRandomIdsForkedTest that
overrides idGenerationStrategyName() to "RANDOM", matching production behavior.
With RANDOM IDs, SORT_TRACES_BY_ID produces non-deterministic order, causing
the original positional consumer-to-producer mapping to fail ~95% of the time.
Switch the batch consume test to SORT_TRACES_BY_START so the parent trace
(started before any consumer receives messages) is always at index 0. The
dynamic parent lookup fix handles any ordering of the 3 consumer traces.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c6cb1d4 to 45636bc
/**
 * Reproduces the flake in "test spring kafka template produce and batch consume"
 * by using RANDOM IDs (instead of the default SEQUENTIAL used in tests).
 *
 * Root cause: The test's assertTraces(4, SORT_TRACES_BY_ID) sorts traces by
 * localRootSpan.spanId, then hardcodes positional mappings between consumer and
 * producer traces. With SEQUENTIAL IDs (the test default), both the producer span
 * finish order within trace(0) and the consumer trace sort order are driven by the
 * same Kafka internal ordering, so the mapping happens to be consistent.
 *
 * With RANDOM IDs (as used in production), the sort order becomes non-deterministic.
 * There are 3! = 6 possible orderings for the 3 consumer traces, and only 1 matches
 * the hardcoded mapping. The dynamic parent lookup fix handles any ordering.
 */
class KafkaClientDsmDisabledRandomIdsForkedTest extends KafkaClientDataStreamsDisabledForkedTest {
  @Override
  protected String idGenerationStrategyName() {
    return "RANDOM"
  }
}
I'd suggest leaving it in since it provides a more representative test case, but I'm happy to remove it if folks think otherwise.
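The counting argument in the class comment (3 consumer traces, 3! = 6 orderings, only 1 matching a hardcoded mapping) can be checked with a small standalone sketch. It is written in Java rather than the test's Groovy, and `OrderingCountSketch` with its method names is hypothetical, not part of the test suite. It only counts orderings; it does not estimate how often each ordering occurs in practice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: counts how many orderings agree with one fixed positional mapping. */
final class OrderingCountSketch {
    /** Returns every permutation of the given items. */
    static List<List<Integer>> permutations(List<Integer> items) {
        List<List<Integer>> result = new ArrayList<>();
        if (items.isEmpty()) {
            result.add(new ArrayList<Integer>());
            return result;
        }
        for (int i = 0; i < items.size(); i++) {
            List<Integer> rest = new ArrayList<>(items);
            Integer head = rest.remove(i); // remove by index, i is a primitive int
            for (List<Integer> tail : permutations(rest)) {
                List<Integer> perm = new ArrayList<>();
                perm.add(head);
                perm.addAll(tail);
                result.add(perm);
            }
        }
        return result;
    }

    /** Counts the permutations that equal the order a positional mapping assumes. */
    static long countMatching(List<Integer> expectedOrder) {
        long matches = 0;
        for (List<Integer> perm : permutations(expectedOrder)) {
            if (perm.equals(expectedOrder)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Indices of the 3 consumer traces in the order the hardcoded mapping expects.
        List<Integer> expected = Arrays.asList(0, 1, 2);
        System.out.println("orderings: " + permutations(expected).size());
        System.out.println("matching:  " + countMatching(expected));
    }
}
```

A lookup keyed on `parentId` does not depend on which of these orderings the sort happens to produce.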
Update 6 more test methods that use SORT_TRACES_BY_ID with hardcoded positional trace references to use SORT_TRACES_BY_START, so they work with the KafkaClientDsmDisabledRandomIdsForkedTest that uses RANDOM span IDs. For the backwards iteration test with 9 traces, also use dynamic parent matching since consumer trace ordering may vary.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
What Does This Do
Fixes flaky Kafka tests in `KafkaClientDataStreamsDisabledForkedTest` by replacing `SORT_TRACES_BY_ID` with `SORT_TRACES_BY_START` and using dynamic parent lookup instead of hardcoded positional trace references.
Motivation
Multiple test methods use `SORT_TRACES_BY_ID`, which sorts traces by their root span's `spanId`. With `SEQUENTIAL` ID generation (the test default), this happens to match creation order. With `RANDOM` IDs (production behavior), the sort order is non-deterministic, breaking hardcoded consumer-to-producer trace mappings.
Root Cause
`SORT_TRACES_BY_ID` sorts by span ID, which is only deterministic with sequential IDs. The test hardcodes positional mappings like trace(1)→trace(0)[6] that assume a specific sort order. With RANDOM IDs, these mappings break.
Fix
- `SORT_TRACES_BY_START`: Producer traces always start before consumer traces, giving deterministic ordering by start time regardless of ID strategy.
- Dynamic parent matching: each consumer span is matched by `parentId` to the correct producer span instead of assuming positional indices.
- `KafkaClientDsmDisabledRandomIdsForkedTest`: New test class that uses RANDOM IDs to reproduce the issue reliably.
Affected Test Methods
- `test spring kafka template produce and batch consume`: dynamic parent matching
- `test spring kafka template produce and consume`: SORT_TRACES_BY_START
- `test pass through tombstone`: SORT_TRACES_BY_START
- `test records(TopicPartition) kafka consume`: SORT_TRACES_BY_START
- `test records(TopicPartition).subList kafka consume`: SORT_TRACES_BY_START
- `test records(TopicPartition).forEach kafka consume`: SORT_TRACES_BY_START
- `test iteration backwards over ConsumerRecords`: SORT_TRACES_BY_START + dynamic parent matching
Additional Notes
- The `!hasQueueSpan()` branches use dynamic matching (the `hasQueueSpan()` branches are unchanged).
Jira ticket: N/A
🤖 Generated with Claude Code
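The two parts of the fix described above (ordering by start time, then resolving each consumer span's parent through `parentId`) can be sketched in isolation. This is a minimal illustration in Java, not the Groovy test code from this PR: `SpanSketch`, `ParentLookupSketch`, `sortByStart`, and `findParent` are hypothetical names, and the span IDs and start times are made-up values standing in for RANDOM ID generation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a finished span; not the tracer's real span type. */
final class SpanSketch {
    final String name;
    final long spanId;
    final long parentId;
    final long startNanos;

    SpanSketch(String name, long spanId, long parentId, long startNanos) {
        this.name = name;
        this.spanId = spanId;
        this.parentId = parentId;
        this.startNanos = startNanos;
    }
}

final class ParentLookupSketch {
    /** Returns a copy ordered by start time, which does not depend on how IDs are generated. */
    static List<SpanSketch> sortByStart(List<SpanSketch> spans) {
        List<SpanSketch> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingLong((SpanSketch s) -> s.startNanos));
        return sorted;
    }

    /** Finds the producer span whose spanId equals the consumer's parentId. */
    static SpanSketch findParent(SpanSketch consumer, List<SpanSketch> producers) {
        for (SpanSketch producer : producers) {
            if (producer.spanId == consumer.parentId) {
                return producer;
            }
        }
        throw new AssertionError("no producer span with spanId " + consumer.parentId);
    }

    public static void main(String[] args) {
        // Producer spans with arbitrary (non-sequential) IDs under a common parent span 1.
        List<SpanSketch> producers = Arrays.asList(
            new SpanSketch("produce-0", 907L, 1L, 10L),
            new SpanSketch("produce-1", 12L, 1L, 11L),
            new SpanSketch("produce-2", 455L, 1L, 12L));
        // Consumer spans listed in an arbitrary order; each points at its producer via parentId.
        List<SpanSketch> consumers = Arrays.asList(
            new SpanSketch("consume-2", 31L, 455L, 22L),
            new SpanSketch("consume-0", 800L, 907L, 20L),
            new SpanSketch("consume-1", 64L, 12L, 21L));
        for (SpanSketch consumer : sortByStart(consumers)) {
            System.out.println(consumer.name + " <- " + findParent(consumer, producers).name);
        }
    }
}
```

Because the match is made on IDs rather than list positions, the pairing stays the same however the consumer list is ordered.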