fix(llmobs): openai-java payload mapping for responses, tool metadata, and prompt tracking by ygree · Pull Request #10644 · DataDog/dd-trace-java

ygree · 2026-02-19T21:45:14Z

What Does This Do

Aligns OpenAI Java LLMObs span payloads with the expected intake/system-test schema by:

- Adding/filling missing LLMObs tags:
  - _ml_obs_tag.integration
  - _ml_obs_tag.source
  - _ml_obs_tag.ddtrace.version
  - _ml_obs_tag.error
  - _ml_obs_tag.error_type
- Ensuring model_name (and a stable placeholder output where applicable) is set on error paths for chat/completions/embeddings/responses.
- Expanding Responses instrumentation:
  - prompt tracking (input.prompt, variables, chat_template)
  - tool definition extraction (tool_definitions)
  - tool call/result extraction across function/custom/MCP outputs
  - metadata normalization (stream, tool_choice, text.verbosity, etc.)
- Refactoring JSON conversion via shared JsonValueUtils.
- Updating LLMObs mapper payload shape:
  - writes _dd map with span/trace ids
  - nests error fields under meta.error
  - supports map-based LLM input serialization (messages + prompt)
  - remaps tool_definitions into meta.
- Updating tests to add value-level assertions for the above behavior.
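
As a rough illustration of the mapper payload shape listed above, the sketch below builds a span event as nested maps. It is not the mapper code from this PR: the class name, the method name, the `span_id`/`trace_id` keys inside `_dd`, and the `message`/`type` keys under `meta.error` are assumptions made for the example. Only the `_dd` map, the nesting of error fields under `meta.error`, and the placement of `tool_definitions` under `meta` come from the description.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the span-event shape; not the actual LLMObs mapper.
final class LlmObsPayloadSketch {

  static Map<String, Object> spanEvent(
      String spanId,
      String traceId,
      String errorMessage,
      String errorType,
      List<Map<String, Object>> toolDefinitions) {
    Map<String, Object> event = new LinkedHashMap<>();
    event.put("span_id", spanId);
    event.put("trace_id", traceId);

    // "_dd" map carrying the span/trace ids (key names assumed).
    Map<String, Object> dd = new LinkedHashMap<>();
    dd.put("span_id", spanId);
    dd.put("trace_id", traceId);
    event.put("_dd", dd);

    Map<String, Object> meta = new LinkedHashMap<>();
    if (errorMessage != null) {
      // Error fields live under meta.error, not at the top level of the event.
      Map<String, Object> error = new LinkedHashMap<>();
      error.put("message", errorMessage);
      error.put("type", errorType);
      meta.put("error", error);
    }
    if (toolDefinitions != null && !toolDefinitions.isEmpty()) {
      // tool_definitions is placed inside meta.
      meta.put("tool_definitions", toolDefinitions);
    }
    event.put("meta", meta);
    return event;
  }

  public static void main(String[] args) {
    Map<String, Object> tool =
        Collections.<String, Object>singletonMap("name", "get_weather");
    System.out.println(
        spanEvent("1", "2", "boom", "java.io.IOException", Collections.singletonList(tool)));
  }
}
```

A test for this kind of shape would assert on the nested values (for example that the top-level event has no `error` key), which is the style of value-level assertion the last list item refers to.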

Motivation

OpenAI/LLMObs system tests exposed schema and tag mismatches in Java payloads (especially response spans, tool metadata, error mapping, and prompt tracking structure). This change brings Java output in line with the expected LLMObs intake contract and behavior.

Additional Notes

openai-java-3.0 min version updated from 3.0.0 to 3.0.1.

DataDog/dd-apm-test-agent#280
DataDog/system-tests#6364

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and (comp: or inst:) labels in addition to any other useful labels
Avoid using close, fix, or any linking keywords when referencing an issue
Use solves instead, and assign the PR milestone to the issue
Update the CODEOWNERS file on source file addition, migration, or deletion
Update public documentation with any new configuration flags or behaviors

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

pr-commenter · 2026-02-19T22:33:17Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	ygree/llmobs-systest-fixes
git_commit_date	1773939812	1773967007
git_commit_sha	`5580c61`	`661ea70`
release_version	1.61.0-SNAPSHOT~5580c61ac4	1.60.0-SNAPSHOT~661ea70f3f

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1773968886	1773968886
ci_job_id	1524138149	1524138149
ci_pipeline_id	103644030	103644030
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-ghsa1y67 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-ghsa1y67 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 62 metrics, 9 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.062 s) : 0, 1062146
Total [baseline] (8.946 s) : 0, 8945766
Agent [candidate] (1.063 s) : 0, 1063058
Total [candidate] (8.897 s) : 0, 8897056
section iast
Agent [baseline] (1.231 s) : 0, 1231042
Total [baseline] (9.577 s) : 0, 9577256
Agent [candidate] (1.228 s) : 0, 1227961
Total [candidate] (9.564 s) : 0, 9563849

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.062 s	-
Agent	iast	1.231 s	168.897 ms (15.9%)
Total	tracing	8.946 s	-
Total	iast	9.577 s	631.49 ms (7.1%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.063 s	-
Agent	iast	1.228 s	164.903 ms (15.5%)
Total	tracing	8.897 s	-
Total	iast	9.564 s	666.793 ms (7.5%)

gantt
    title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.208 ms) : 0, 1208
crashtracking [candidate] (1.212 ms) : 0, 1212
BytebuddyAgent [baseline] (632.573 ms) : 0, 632573
BytebuddyAgent [candidate] (633.335 ms) : 0, 633335
AgentMeter [baseline] (29.625 ms) : 0, 29625
AgentMeter [candidate] (29.57 ms) : 0, 29570
GlobalTracer [baseline] (258.037 ms) : 0, 258037
GlobalTracer [candidate] (258.161 ms) : 0, 258161
AppSec [baseline] (31.807 ms) : 0, 31807
AppSec [candidate] (31.969 ms) : 0, 31969
Debugger [baseline] (59.747 ms) : 0, 59747
Debugger [candidate] (59.697 ms) : 0, 59697
Remote Config [baseline] (602.128 µs) : 0, 602
Remote Config [candidate] (583.05 µs) : 0, 583
Telemetry [baseline] (8.081 ms) : 0, 8081
Telemetry [candidate] (8.845 ms) : 0, 8845
Flare Poller [baseline] (4.257 ms) : 0, 4257
Flare Poller [candidate] (3.518 ms) : 0, 3518
section iast
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (798.555 ms) : 0, 798555
BytebuddyAgent [candidate] (797.664 ms) : 0, 797664
AgentMeter [baseline] (11.383 ms) : 0, 11383
AgentMeter [candidate] (11.405 ms) : 0, 11405
GlobalTracer [baseline] (248.238 ms) : 0, 248238
GlobalTracer [candidate] (247.671 ms) : 0, 247671
IAST [baseline] (25.407 ms) : 0, 25407
IAST [candidate] (25.413 ms) : 0, 25413
AppSec [baseline] (26.553 ms) : 0, 26553
AppSec [candidate] (26.439 ms) : 0, 26439
Debugger [baseline] (69.053 ms) : 0, 69053
Debugger [candidate] (67.291 ms) : 0, 67291
Remote Config [baseline] (531.369 µs) : 0, 531
Remote Config [candidate] (519.34 µs) : 0, 519
Telemetry [baseline] (10.261 ms) : 0, 10261
Telemetry [candidate] (10.542 ms) : 0, 10542
Flare Poller [baseline] (3.69 ms) : 0, 3690
Flare Poller [candidate] (3.754 ms) : 0, 3754

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.056 s) : 0, 1055849
Total [baseline] (11.13 s) : 0, 11129717
Agent [candidate] (1.073 s) : 0, 1072978
Total [candidate] (11.337 s) : 0, 11337301
section appsec
Agent [baseline] (1.248 s) : 0, 1247733
Total [baseline] (11.188 s) : 0, 11187973
Agent [candidate] (1.248 s) : 0, 1248251
Total [candidate] (11.131 s) : 0, 11131266
section iast
Agent [baseline] (1.229 s) : 0, 1228838
Total [baseline] (11.4 s) : 0, 11399785
Agent [candidate] (1.228 s) : 0, 1228338
Total [candidate] (11.36 s) : 0, 11359730
section profiling
Agent [baseline] (1.192 s) : 0, 1191915
Total [baseline] (11.009 s) : 0, 11008741
Agent [candidate] (1.184 s) : 0, 1184053
Total [candidate] (11.081 s) : 0, 11080559

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.056 s	-
Agent	appsec	1.248 s	191.884 ms (18.2%)
Agent	iast	1.229 s	172.989 ms (16.4%)
Agent	profiling	1.192 s	136.066 ms (12.9%)
Total	tracing	11.13 s	-
Total	appsec	11.188 s	58.256 ms (0.5%)
Total	iast	11.4 s	270.068 ms (2.4%)
Total	profiling	11.009 s	-120.976 ms (-1.1%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.073 s	-
Agent	appsec	1.248 s	175.273 ms (16.3%)
Agent	iast	1.228 s	155.359 ms (14.5%)
Agent	profiling	1.184 s	111.075 ms (10.4%)
Total	tracing	11.337 s	-
Total	appsec	11.131 s	-206.036 ms (-1.8%)
Total	iast	11.36 s	22.428 ms (0.2%)
Total	profiling	11.081 s	-256.743 ms (-2.3%)

gantt
    title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.206 ms) : 0, 1206
crashtracking [candidate] (1.23 ms) : 0, 1230
BytebuddyAgent [baseline] (628.457 ms) : 0, 628457
BytebuddyAgent [candidate] (638.698 ms) : 0, 638698
AgentMeter [baseline] (29.438 ms) : 0, 29438
AgentMeter [candidate] (29.967 ms) : 0, 29967
GlobalTracer [baseline] (256.757 ms) : 0, 256757
GlobalTracer [candidate] (260.161 ms) : 0, 260161
AppSec [baseline] (31.65 ms) : 0, 31650
AppSec [candidate] (32.308 ms) : 0, 32308
Debugger [baseline] (60.267 ms) : 0, 60267
Debugger [candidate] (61.198 ms) : 0, 61198
Remote Config [baseline] (582.959 µs) : 0, 583
Remote Config [candidate] (595.351 µs) : 0, 595
Telemetry [baseline] (8.0 ms) : 0, 8000
Telemetry [candidate] (8.181 ms) : 0, 8181
Flare Poller [baseline] (3.471 ms) : 0, 3471
Flare Poller [candidate] (4.301 ms) : 0, 4301
section appsec
crashtracking [baseline] (1.217 ms) : 0, 1217
crashtracking [candidate] (1.195 ms) : 0, 1195
BytebuddyAgent [baseline] (658.46 ms) : 0, 658460
BytebuddyAgent [candidate] (658.919 ms) : 0, 658919
AgentMeter [baseline] (12.166 ms) : 0, 12166
AgentMeter [candidate] (12.095 ms) : 0, 12095
GlobalTracer [baseline] (258.464 ms) : 0, 258464
GlobalTracer [candidate] (258.347 ms) : 0, 258347
AppSec [baseline] (177.991 ms) : 0, 177991
AppSec [candidate] (178.556 ms) : 0, 178556
Debugger [baseline] (66.216 ms) : 0, 66216
Debugger [candidate] (66.058 ms) : 0, 66058
Remote Config [baseline] (638.592 µs) : 0, 639
Remote Config [candidate] (627.869 µs) : 0, 628
Telemetry [baseline] (8.389 ms) : 0, 8389
Telemetry [candidate] (8.292 ms) : 0, 8292
Flare Poller [baseline] (3.582 ms) : 0, 3582
Flare Poller [candidate] (3.581 ms) : 0, 3581
IAST [baseline] (24.211 ms) : 0, 24211
IAST [candidate] (24.187 ms) : 0, 24187
section iast
crashtracking [baseline] (1.182 ms) : 0, 1182
crashtracking [candidate] (1.186 ms) : 0, 1186
BytebuddyAgent [baseline] (796.816 ms) : 0, 796816
BytebuddyAgent [candidate] (796.903 ms) : 0, 796903
AgentMeter [baseline] (11.375 ms) : 0, 11375
AgentMeter [candidate] (11.408 ms) : 0, 11408
GlobalTracer [baseline] (247.492 ms) : 0, 247492
GlobalTracer [candidate] (247.512 ms) : 0, 247512
AppSec [baseline] (27.257 ms) : 0, 27257
AppSec [candidate] (26.423 ms) : 0, 26423
Debugger [baseline] (69.491 ms) : 0, 69491
Debugger [candidate] (70.449 ms) : 0, 70449
Remote Config [baseline] (533.328 µs) : 0, 533
Remote Config [candidate] (523.227 µs) : 0, 523
Telemetry [baseline] (9.727 ms) : 0, 9727
Telemetry [candidate] (9.119 ms) : 0, 9119
Flare Poller [baseline] (3.541 ms) : 0, 3541
Flare Poller [candidate] (3.311 ms) : 0, 3311
IAST [baseline] (25.364 ms) : 0, 25364
IAST [candidate] (25.36 ms) : 0, 25360
section profiling
ProfilingAgent [baseline] (94.247 ms) : 0, 94247
ProfilingAgent [candidate] (94.238 ms) : 0, 94238
crashtracking [baseline] (1.177 ms) : 0, 1177
crashtracking [candidate] (1.157 ms) : 0, 1157
BytebuddyAgent [baseline] (688.585 ms) : 0, 688585
BytebuddyAgent [candidate] (683.463 ms) : 0, 683463
AgentMeter [baseline] (9.087 ms) : 0, 9087
AgentMeter [candidate] (9.078 ms) : 0, 9078
GlobalTracer [baseline] (217.102 ms) : 0, 217102
GlobalTracer [candidate] (215.69 ms) : 0, 215690
AppSec [baseline] (32.268 ms) : 0, 32268
AppSec [candidate] (32.142 ms) : 0, 32142
Debugger [baseline] (65.681 ms) : 0, 65681
Debugger [candidate] (64.979 ms) : 0, 64979
Remote Config [baseline] (572.119 µs) : 0, 572
Remote Config [candidate] (562.643 µs) : 0, 563
Telemetry [baseline] (7.755 ms) : 0, 7755
Telemetry [candidate] (7.664 ms) : 0, 7664
Flare Poller [baseline] (4.204 ms) : 0, 4204
Flare Poller [candidate] (4.225 ms) : 0, 4225
Profiling [baseline] (94.805 ms) : 0, 94805
Profiling [candidate] (94.817 ms) : 0, 94817

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	ygree/llmobs-systest-fixes
git_commit_date	1773939812	1773967007
git_commit_sha	`5580c61`	`661ea70`
release_version	1.61.0-SNAPSHOT~5580c61ac4	1.60.0-SNAPSHOT~661ea70f3f

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1773969501	1773969501
ci_job_id	1524138150	1524138150
ci_pipeline_id	103644030	103644030
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-ivq6z8wy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-ivq6z8wy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvements and 2 performance regressions! Performance is the same for 18 metrics, 15 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:insecure-bank:profiling:high_load	better [-258.871µs; -99.257µs] or [-14.096%; -5.405%]	unstable [-1347.503µs; -351.261µs] or [-23.946%; -6.242%]	unstable [+58.219op/s; +565.594op/s] or [+3.066%; +29.782%]	1.657ms	4.778ms	2211.000op/s	1.836ms	5.627ms	1899.094op/s
scenario:load:insecure-bank:iast:high_load	worse [+100.869µs; +173.831µs] or [+4.297%; +7.404%]	worse [+177.296µs; +601.339µs] or [+2.551%; +8.653%]	unstable [-255.490op/s; +71.928op/s] or [-16.911%; +4.761%]	2.485ms	7.339ms	1419.000op/s	2.348ms	6.950ms	1510.781op/s

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.178 ms) : 1167, 1189
.   : milestone, 1178,
iast (3.024 ms) : 2984, 3064
.   : milestone, 3024,
iast_FULL (5.861 ms) : 5803, 5920
.   : milestone, 5861,
iast_GLOBAL (3.635 ms) : 3572, 3698
.   : milestone, 3635,
profiling (2.39 ms) : 2367, 2412
.   : milestone, 2390,
tracing (1.778 ms) : 1762, 1793
.   : milestone, 1778,
section candidate
no_agent (1.212 ms) : 1200, 1224
.   : milestone, 1212,
iast (3.224 ms) : 3179, 3269
.   : milestone, 3224,
iast_FULL (5.934 ms) : 5874, 5994
.   : milestone, 5934,
iast_GLOBAL (3.566 ms) : 3501, 3630
.   : milestone, 3566,
profiling (2.042 ms) : 2024, 2061
.   : milestone, 2042,
tracing (1.792 ms) : 1777, 1807
.   : milestone, 1792,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.178 ms [1.167 ms, 1.189 ms]	-
iast	3.024 ms [2.984 ms, 3.064 ms]	1.846 ms (156.7%)
iast_FULL	5.861 ms [5.803 ms, 5.92 ms]	4.683 ms (397.6%)
iast_GLOBAL	3.635 ms [3.572 ms, 3.698 ms]	2.457 ms (208.6%)
profiling	2.39 ms [2.367 ms, 2.412 ms]	1.212 ms (102.9%)
tracing	1.778 ms [1.762 ms, 1.793 ms]	599.953 µs (50.9%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.212 ms [1.2 ms, 1.224 ms]	-
iast	3.224 ms [3.179 ms, 3.269 ms]	2.011 ms (165.9%)
iast_FULL	5.934 ms [5.874 ms, 5.994 ms]	4.722 ms (389.5%)
iast_GLOBAL	3.566 ms [3.501 ms, 3.63 ms]	2.354 ms (194.1%)
profiling	2.042 ms [2.024 ms, 2.061 ms]	830.007 µs (68.5%)
tracing	1.792 ms [1.777 ms, 1.807 ms]	579.913 µs (47.8%)

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (16.795 ms) : 16635, 16955
.   : milestone, 16795,
appsec (18.917 ms) : 18726, 19108
.   : milestone, 18917,
code_origins (18.247 ms) : 18064, 18430
.   : milestone, 18247,
iast (17.787 ms) : 17609, 17965
.   : milestone, 17787,
profiling (18.689 ms) : 18503, 18875
.   : milestone, 18689,
tracing (18.129 ms) : 17947, 18310
.   : milestone, 18129,
section candidate
no_agent (17.098 ms) : 16927, 17269
.   : milestone, 17098,
appsec (18.665 ms) : 18474, 18856
.   : milestone, 18665,
code_origins (17.847 ms) : 17670, 18025
.   : milestone, 17847,
iast (17.525 ms) : 17353, 17698
.   : milestone, 17525,
profiling (18.391 ms) : 18204, 18578
.   : milestone, 18391,
tracing (18.746 ms) : 18557, 18935
.   : milestone, 18746,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	16.795 ms [16.635 ms, 16.955 ms]	-
appsec	18.917 ms [18.726 ms, 19.108 ms]	2.122 ms (12.6%)
code_origins	18.247 ms [18.064 ms, 18.43 ms]	1.451 ms (8.6%)
iast	17.787 ms [17.609 ms, 17.965 ms]	992.025 µs (5.9%)
profiling	18.689 ms [18.503 ms, 18.875 ms]	1.894 ms (11.3%)
tracing	18.129 ms [17.947 ms, 18.31 ms]	1.333 ms (7.9%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	17.098 ms [16.927 ms, 17.269 ms]	-
appsec	18.665 ms [18.474 ms, 18.856 ms]	1.567 ms (9.2%)
code_origins	17.847 ms [17.67 ms, 18.025 ms]	749.055 µs (4.4%)
iast	17.525 ms [17.353 ms, 17.698 ms]	427.059 µs (2.5%)
profiling	18.391 ms [18.204 ms, 18.578 ms]	1.292 ms (7.6%)
tracing	18.746 ms [18.557 ms, 18.935 ms]	1.648 ms (9.6%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	ygree/llmobs-systest-fixes
git_commit_date	1773939812	1773967007
git_commit_sha	`5580c61`	`661ea70`
release_version	1.61.0-SNAPSHOT~5580c61ac4	1.60.0-SNAPSHOT~661ea70f3f

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1773969210	1773969210
ci_job_id	1524138151	1524138151
ci_pipeline_id	103644030	103644030
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-2-mtrznoae 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-2-mtrznoae 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.808 s) : 14808000, 14808000
.   : milestone, 14808000,
appsec (14.711 s) : 14711000, 14711000
.   : milestone, 14711000,
iast (18.493 s) : 18493000, 18493000
.   : milestone, 18493000,
iast_GLOBAL (18.016 s) : 18016000, 18016000
.   : milestone, 18016000,
profiling (15.504 s) : 15504000, 15504000
.   : milestone, 15504000,
tracing (14.872 s) : 14872000, 14872000
.   : milestone, 14872000,
section candidate
no_agent (15.496 s) : 15496000, 15496000
.   : milestone, 15496000,
appsec (14.996 s) : 14996000, 14996000
.   : milestone, 14996000,
iast (18.438 s) : 18438000, 18438000
.   : milestone, 18438000,
iast_GLOBAL (17.796 s) : 17796000, 17796000
.   : milestone, 17796000,
profiling (14.991 s) : 14991000, 14991000
.   : milestone, 14991000,
tracing (15.121 s) : 15121000, 15121000
.   : milestone, 15121000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	14.808 s [14.808 s, 14.808 s]	-
appsec	14.711 s [14.711 s, 14.711 s]	-97.0 ms (-0.7%)
iast	18.493 s [18.493 s, 18.493 s]	3.685 s (24.9%)
iast_GLOBAL	18.016 s [18.016 s, 18.016 s]	3.208 s (21.7%)
profiling	15.504 s [15.504 s, 15.504 s]	696.0 ms (4.7%)
tracing	14.872 s [14.872 s, 14.872 s]	64.0 ms (0.4%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.496 s [15.496 s, 15.496 s]	-
appsec	14.996 s [14.996 s, 14.996 s]	-500.0 ms (-3.2%)
iast	18.438 s [18.438 s, 18.438 s]	2.942 s (19.0%)
iast_GLOBAL	17.796 s [17.796 s, 17.796 s]	2.3 s (14.8%)
profiling	14.991 s [14.991 s, 14.991 s]	-505.0 ms (-3.3%)
tracing	15.121 s [15.121 s, 15.121 s]	-375.0 ms (-2.4%)

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.487 ms) : 1475, 1499
.   : milestone, 1487,
appsec (2.526 ms) : 2472, 2580
.   : milestone, 2526,
iast (2.261 ms) : 2193, 2330
.   : milestone, 2261,
iast_GLOBAL (2.318 ms) : 2249, 2388
.   : milestone, 2318,
profiling (2.091 ms) : 2037, 2146
.   : milestone, 2091,
tracing (2.067 ms) : 2014, 2120
.   : milestone, 2067,
section candidate
no_agent (1.481 ms) : 1470, 1493
.   : milestone, 1481,
appsec (3.828 ms) : 3602, 4055
.   : milestone, 3828,
iast (2.277 ms) : 2208, 2346
.   : milestone, 2277,
iast_GLOBAL (2.314 ms) : 2245, 2383
.   : milestone, 2314,
profiling (2.547 ms) : 2330, 2763
.   : milestone, 2547,
tracing (2.068 ms) : 2015, 2121
.   : milestone, 2068,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.487 ms [1.475 ms, 1.499 ms]	-
appsec	2.526 ms [2.472 ms, 2.58 ms]	1.039 ms (69.9%)
iast	2.261 ms [2.193 ms, 2.33 ms]	774.534 µs (52.1%)
iast_GLOBAL	2.318 ms [2.249 ms, 2.388 ms]	831.54 µs (55.9%)
profiling	2.091 ms [2.037 ms, 2.146 ms]	604.365 µs (40.6%)
tracing	2.067 ms [2.014 ms, 2.12 ms]	580.44 µs (39.0%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.481 ms [1.47 ms, 1.493 ms]	-
appsec	3.828 ms [3.602 ms, 4.055 ms]	2.347 ms (158.4%)
iast	2.277 ms [2.208 ms, 2.346 ms]	796.012 µs (53.7%)
iast_GLOBAL	2.314 ms [2.245 ms, 2.383 ms]	832.58 µs (56.2%)
profiling	2.547 ms [2.33 ms, 2.763 ms]	1.065 ms (71.9%)
tracing	2.068 ms [2.015 ms, 2.121 ms]	586.444 µs (39.6%)

… with variables + chat_template, longest-first overlap handling) and support map-based LLM input serialization (messages + prompt) in LLMObs mapper. Also filter empty instruction messages to match system-test expectations.
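
The longest-first overlap handling mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the code from this PR: the class and method names are hypothetical, and the `{{name}}` placeholder syntax is assumed. The idea is to rebuild a chat_template from rendered text by substituting variable values with placeholders, handling longer values first so that a short value contained in a longer one does not split it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and placeholder syntax are assumptions.
final class ChatTemplateSketch {

  static String toTemplate(String rendered, Map<String, String> variables) {
    List<Map.Entry<String, String>> entries = new ArrayList<>(variables.entrySet());
    // Sort by value length, longest first, so overlapping values are resolved
    // in favor of the longer match.
    entries.sort(
        Comparator.comparingInt((Map.Entry<String, String> e) -> e.getValue().length())
            .reversed());

    String template = rendered;
    for (Map.Entry<String, String> e : entries) {
      if (!e.getValue().isEmpty()) {
        // Replace every literal occurrence of the value with its placeholder.
        template = template.replace(e.getValue(), "{{" + e.getKey() + "}}");
      }
    }
    return template;
  }

  public static void main(String[] args) {
    Map<String, String> vars = new LinkedHashMap<>();
    vars.put("name", "Alice");
    vars.put("user", "Alice Smith");
    System.out.println(toTemplate("Hello Alice Smith, aka Alice", vars));
  }
}
```

With the values above, replacing "Alice" first would turn "Alice Smith" into "{{name}} Smith"; sorting by length avoids that. The sketch does not guard against a value that happens to match part of an already inserted placeholder.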

ygree · 2026-03-17T22:11:55Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c879ba692

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

...enai-java-3.0/src/main/java/datadog/trace/instrumentation/openai_java/ResponseDecorator.java

ygree self-assigned this Feb 19, 2026

ygree added labels comp: mlobs (ML Observability (LLMObs)) and type: bug (Bug report and fix) Feb 19, 2026

llmobs: set model tag even when llmobs disabled

cbd6226

ygree force-pushed the ygree/llmobs-systest-fixes branch from 5cd257e to cbd6226 Compare February 24, 2026 09:31

ygree changed the title ~~llmobs: set model tag even when llmobs disabled~~ fix(llmobs): set model tag even when llmobs disabled Mar 2, 2026

ygree added 23 commits March 2, 2026 13:30

Set metadata.stream tag no matter it's true or false

4f27673

Set chat/completion CACHE_READ_INPUT_TOKENS tag

d128d6b

Set error and error_type tags

3fc5ceb

Use "" instead of null for the role in CompletionDecorator to comply …

021a9d1

…wthTestOpenAiLlmInteractions::test_completion

Use "" instead of null for the content to comply with TestOpenAiLlmIn…

0637931

…teractions::test_chat_completion_tool_call

Add missing metadata.tool_choice

0cb41e1

Add missing tool_definitions

a42f8aa

Add source:integration tag

6e10255

Add missing _dd attribute to the llmobs span event

34f3a07

Add missing error tags

a0c1139

Remove error from the llmobs span event. It must be part of meta block

effc343

Add missing meta.text.verbosity

c0e3876

Add summaryText and encrypted_content

b000770

Add missing tool_calls and tool_results for responses

53471a2

Always set stream param to produce the same request body to be aligne…

2207c46

…d with python openai instrumentation and system-tests

Fix OpenAI Responses prompt tracking to use response instructions fir…

7d683b6

…st and return [image] (not empty) when stripped input_image URLs are missing, aligning mixed-input chat_template output with expected behavior.

Set LLMObs error-path defaults in Java to always emit model_name and …

2c17ddc

…output.messages from request params so existing error-span tests pass.

Add OpenAI Responses tool definition extraction to populate LLMObs to…

ad3b782

…ol_definitions tags

Fix ChatCompletionServiceTest

1810327

Extract JsonValueUtils

46221e4

Refactor OpenAI responses instrumentation to reuse ToolCallExtractor …

61ad667

…JSON argument parsing and remove duplicate manual parsing logic from ResponseDecorator.

Fix test assertions

f0957b7

ygree added 5 commits March 6, 2026 10:35

Add integration tag

f3f1f75

Add ddtrace.version

668e955

Improve test assertions

d57402e

Merge branch 'master' into ygree/llmobs-systest-fixes

a3051e3

Fix format

0c879ba

ygree changed the title ~~fix(llmobs): set model tag even when llmobs disabled~~ fix(llmobs): openai-java payload mapping for responses, tool metadata, and prompt tracking Mar 6, 2026

ygree added labels tag: ai generated (Largely based on code generated by an AI or LLM) and tag: no release notes (Changes to exclude from release notes) Mar 6, 2026

ygree marked this pull request as ready for review March 6, 2026 13:46

ygree requested review from a team as code owners March 6, 2026 13:46

chatgpt-codex-connector bot reviewed Mar 17, 2026

...enai-java-3.0/src/main/java/datadog/trace/instrumentation/openai_java/ResponseDecorator.java (outdated)

...enai-java-3.0/src/main/java/datadog/trace/instrumentation/openai_java/ResponseDecorator.java (outdated)

ygree added 5 commits March 17, 2026 17:35

Include input messages when instructions are present in prompt tracking

f4e3a8b

Fix instructions role to system in prompt tracking

028d64f

Merge branch 'master' into ygree/llmobs-systest-fixes

82f4303

fix LLMObsSpanMapperTest

717a8f0

repoint to fixed system tests

661ea70

ygree requested a review from a team as a code owner March 20, 2026 00:37

ygree requested review from amarziali and removed request for a team March 20, 2026 00:37

ygree wants to merge 34 commits into master from ygree/llmobs-systest-fixes
