
fix(llmobs): openai-java payload mapping for responses, tool metadata, and prompt tracking#10644

Open
ygree wants to merge 34 commits into master from
ygree/llmobs-systest-fixes

@ygree ygree commented Feb 19, 2026

What Does This Do

Aligns OpenAI Java LLMObs span payloads with the expected intake/system-test schema by:

  • Adding/filling missing LLMObs tags:
    • _ml_obs_tag.integration
    • _ml_obs_tag.source
    • _ml_obs_tag.ddtrace.version
    • _ml_obs_tag.error
    • _ml_obs_tag.error_type
  • Ensuring model_name (and stable placeholder output where applicable) is set on error paths for
    chat/completions/embeddings/responses.
  • Expanding Responses instrumentation:
    • prompt tracking (input.prompt, variables, chat_template)
    • tool definition extraction (tool_definitions)
    • tool call/result extraction across function/custom/MCP outputs
    • metadata normalization (stream, tool_choice, text.verbosity, etc.)
  • Refactoring JSON conversion via shared JsonValueUtils.
  • Updating LLMObs mapper payload shape:
    • writes _dd map with span/trace ids
    • nests error fields under meta.error
    • supports map-based LLM input serialization (messages + prompt)
    • remaps tool_definitions into meta.
  • Updating tests to add value-level assertions for the above behavior.
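
For reviewers, the mapper payload shape listed above can be pictured roughly as follows. This is a minimal sketch, not the mapper code from this PR. The class `LlmObsPayloadSketch`, the method `buildSpanPayload`, the `span_id`/`trace_id` key names inside `_dd`, the `type`/`message` keys under `meta.error`, and the placement of `model_name` under `meta` are assumptions made for illustration. Only the `_dd` map, the nesting under `meta.error`, the map-based input (messages + prompt), and `tool_definitions` under `meta` come from the description.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical stand-in for the LLMObs span mapper. */
final class LlmObsPayloadSketch {

  /** Builds a nested map that mirrors the payload shape described in the PR. */
  static Map<String, Object> buildSpanPayload(
      String spanId,
      String traceId,
      String modelName,
      List<Map<String, Object>> messages,
      Map<String, Object> prompt,
      List<Map<String, Object>> toolDefinitions,
      Throwable error) {
    // "_dd" carries the span/trace ids (key names are assumed).
    Map<String, Object> dd = new LinkedHashMap<>();
    dd.put("span_id", spanId);
    dd.put("trace_id", traceId);

    // Map-based LLM input: messages plus an optional prompt.
    Map<String, Object> input = new LinkedHashMap<>();
    input.put("messages", messages);
    if (prompt != null) {
      input.put("prompt", prompt);
    }

    Map<String, Object> meta = new LinkedHashMap<>();
    meta.put("model_name", modelName); // assumed location; set on error paths too
    meta.put("input", input);
    if (toolDefinitions != null && !toolDefinitions.isEmpty()) {
      meta.put("tool_definitions", toolDefinitions);
    }
    if (error != null) {
      // Error fields are nested under meta.error (field names are assumed).
      Map<String, Object> err = new LinkedHashMap<>();
      err.put("type", error.getClass().getName());
      err.put("message", String.valueOf(error.getMessage()));
      meta.put("error", err);
    }

    Map<String, Object> span = new LinkedHashMap<>();
    span.put("_dd", dd);
    span.put("meta", meta);
    return span;
  }

  public static void main(String[] args) {
    Map<String, Object> message = new LinkedHashMap<>();
    message.put("role", "user");
    message.put("content", "hello");
    System.out.println(
        buildSpanPayload(
            "1", "2", "some-model",
            Collections.singletonList(message),
            null, null,
            new IllegalStateException("boom")));
  }
}
```

The sketch omits `tool_definitions` and `prompt` when they are absent rather than writing empty values. That is a design choice of the sketch, not a statement about what the intake contract requires.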

Motivation

OpenAI/LLMObs system tests exposed schema and tag mismatches in Java payloads (especially response spans, tool metadata, error mapping, and prompt tracking structure). This change brings Java output in line with the expected LLMObs intake contract and behavior.

Additional Notes

  • openai-java-3.0 min version updated from 3.0.0 to 3.0.1.

DataDog/dd-apm-test-agent#280
DataDog/system-tests#6364

Contributor Checklist

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

@ygree ygree self-assigned this Feb 19, 2026
@ygree ygree added the "comp: mlobs" (ML Observability (LLMObs)) and "type: bug" (Bug report and fix) labels Feb 19, 2026
pr-commenter bot commented Feb 19, 2026

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master ygree/llmobs-systest-fixes
git_commit_date 1773939812 1773967007
git_commit_sha 5580c61 661ea70
release_version 1.61.0-SNAPSHOT~5580c61ac4 1.60.0-SNAPSHOT~661ea70f3f
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1773968886 1773968886
ci_job_id 1524138149 1524138149
ci_pipeline_id 103644030 103644030
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-ghsa1y67 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-ghsa1y67 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 62 metrics, 9 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.062 s) : 0, 1062146
Total [baseline] (8.946 s) : 0, 8945766
Agent [candidate] (1.063 s) : 0, 1063058
Total [candidate] (8.897 s) : 0, 8897056
section iast
Agent [baseline] (1.231 s) : 0, 1231042
Total [baseline] (9.577 s) : 0, 9577256
Agent [candidate] (1.228 s) : 0, 1227961
Total [candidate] (9.564 s) : 0, 9563849
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.062 s -
Agent iast 1.231 s 168.897 ms (15.9%)
Total tracing 8.946 s -
Total iast 9.577 s 631.49 ms (7.1%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.063 s -
Agent iast 1.228 s 164.903 ms (15.5%)
Total tracing 8.897 s -
Total iast 9.564 s 666.793 ms (7.5%)
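
The "Δ tracing" column in the tables above appears to be the variant duration minus the tracing duration, with the percentage taken relative to tracing. A minimal sketch of that arithmetic, using the microsecond values from the baseline gantt data (the class and method names are hypothetical, and the formula is inferred from the numbers rather than taken from the benchmark tooling):

```java
import java.util.Locale;

/** Illustrative only: reproduces the apparent "Δ tracing" arithmetic. */
final class StartupOverheadSketch {

  /** Absolute overhead of a variant over the tracing-only run, in microseconds. */
  static long overheadMicros(long variantMicros, long tracingMicros) {
    return variantMicros - tracingMicros;
  }

  /** Relative overhead of a variant over the tracing-only run, in percent. */
  static double overheadPercent(long variantMicros, long tracingMicros) {
    return 100.0 * (variantMicros - tracingMicros) / tracingMicros;
  }

  public static void main(String[] args) {
    long totalTracing = 8_945_766L; // Total [baseline], tracing section
    long totalIast = 9_577_256L;    // Total [baseline], iast section
    System.out.printf(
        Locale.ROOT,
        "%.3f ms (%.1f%%)%n",
        overheadMicros(totalIast, totalTracing) / 1000.0,
        overheadPercent(totalIast, totalTracing));
  }
}
```

For the baseline "Total iast" row this gives 631.490 ms and roughly 7.1%, matching the reported "631.49 ms (7.1%)". Some rows differ in the last digit (for example 168.896 ms computed versus 168.897 ms reported for "Agent iast"), presumably because the gantt values are rounded to whole microseconds.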
gantt
    title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.208 ms) : 0, 1208
crashtracking [candidate] (1.212 ms) : 0, 1212
BytebuddyAgent [baseline] (632.573 ms) : 0, 632573
BytebuddyAgent [candidate] (633.335 ms) : 0, 633335
AgentMeter [baseline] (29.625 ms) : 0, 29625
AgentMeter [candidate] (29.57 ms) : 0, 29570
GlobalTracer [baseline] (258.037 ms) : 0, 258037
GlobalTracer [candidate] (258.161 ms) : 0, 258161
AppSec [baseline] (31.807 ms) : 0, 31807
AppSec [candidate] (31.969 ms) : 0, 31969
Debugger [baseline] (59.747 ms) : 0, 59747
Debugger [candidate] (59.697 ms) : 0, 59697
Remote Config [baseline] (602.128 µs) : 0, 602
Remote Config [candidate] (583.05 µs) : 0, 583
Telemetry [baseline] (8.081 ms) : 0, 8081
Telemetry [candidate] (8.845 ms) : 0, 8845
Flare Poller [baseline] (4.257 ms) : 0, 4257
Flare Poller [candidate] (3.518 ms) : 0, 3518
section iast
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (798.555 ms) : 0, 798555
BytebuddyAgent [candidate] (797.664 ms) : 0, 797664
AgentMeter [baseline] (11.383 ms) : 0, 11383
AgentMeter [candidate] (11.405 ms) : 0, 11405
GlobalTracer [baseline] (248.238 ms) : 0, 248238
GlobalTracer [candidate] (247.671 ms) : 0, 247671
IAST [baseline] (25.407 ms) : 0, 25407
IAST [candidate] (25.413 ms) : 0, 25413
AppSec [baseline] (26.553 ms) : 0, 26553
AppSec [candidate] (26.439 ms) : 0, 26439
Debugger [baseline] (69.053 ms) : 0, 69053
Debugger [candidate] (67.291 ms) : 0, 67291
Remote Config [baseline] (531.369 µs) : 0, 531
Remote Config [candidate] (519.34 µs) : 0, 519
Telemetry [baseline] (10.261 ms) : 0, 10261
Telemetry [candidate] (10.542 ms) : 0, 10542
Flare Poller [baseline] (3.69 ms) : 0, 3690
Flare Poller [candidate] (3.754 ms) : 0, 3754
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.056 s) : 0, 1055849
Total [baseline] (11.13 s) : 0, 11129717
Agent [candidate] (1.073 s) : 0, 1072978
Total [candidate] (11.337 s) : 0, 11337301
section appsec
Agent [baseline] (1.248 s) : 0, 1247733
Total [baseline] (11.188 s) : 0, 11187973
Agent [candidate] (1.248 s) : 0, 1248251
Total [candidate] (11.131 s) : 0, 11131266
section iast
Agent [baseline] (1.229 s) : 0, 1228838
Total [baseline] (11.4 s) : 0, 11399785
Agent [candidate] (1.228 s) : 0, 1228338
Total [candidate] (11.36 s) : 0, 11359730
section profiling
Agent [baseline] (1.192 s) : 0, 1191915
Total [baseline] (11.009 s) : 0, 11008741
Agent [candidate] (1.184 s) : 0, 1184053
Total [candidate] (11.081 s) : 0, 11080559
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.056 s -
Agent appsec 1.248 s 191.884 ms (18.2%)
Agent iast 1.229 s 172.989 ms (16.4%)
Agent profiling 1.192 s 136.066 ms (12.9%)
Total tracing 11.13 s -
Total appsec 11.188 s 58.256 ms (0.5%)
Total iast 11.4 s 270.068 ms (2.4%)
Total profiling 11.009 s -120.976 ms (-1.1%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.073 s -
Agent appsec 1.248 s 175.273 ms (16.3%)
Agent iast 1.228 s 155.359 ms (14.5%)
Agent profiling 1.184 s 111.075 ms (10.4%)
Total tracing 11.337 s -
Total appsec 11.131 s -206.036 ms (-1.8%)
Total iast 11.36 s 22.428 ms (0.2%)
Total profiling 11.081 s -256.743 ms (-2.3%)
gantt
    title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.206 ms) : 0, 1206
crashtracking [candidate] (1.23 ms) : 0, 1230
BytebuddyAgent [baseline] (628.457 ms) : 0, 628457
BytebuddyAgent [candidate] (638.698 ms) : 0, 638698
AgentMeter [baseline] (29.438 ms) : 0, 29438
AgentMeter [candidate] (29.967 ms) : 0, 29967
GlobalTracer [baseline] (256.757 ms) : 0, 256757
GlobalTracer [candidate] (260.161 ms) : 0, 260161
AppSec [baseline] (31.65 ms) : 0, 31650
AppSec [candidate] (32.308 ms) : 0, 32308
Debugger [baseline] (60.267 ms) : 0, 60267
Debugger [candidate] (61.198 ms) : 0, 61198
Remote Config [baseline] (582.959 µs) : 0, 583
Remote Config [candidate] (595.351 µs) : 0, 595
Telemetry [baseline] (8.0 ms) : 0, 8000
Telemetry [candidate] (8.181 ms) : 0, 8181
Flare Poller [baseline] (3.471 ms) : 0, 3471
Flare Poller [candidate] (4.301 ms) : 0, 4301
section appsec
crashtracking [baseline] (1.217 ms) : 0, 1217
crashtracking [candidate] (1.195 ms) : 0, 1195
BytebuddyAgent [baseline] (658.46 ms) : 0, 658460
BytebuddyAgent [candidate] (658.919 ms) : 0, 658919
AgentMeter [baseline] (12.166 ms) : 0, 12166
AgentMeter [candidate] (12.095 ms) : 0, 12095
GlobalTracer [baseline] (258.464 ms) : 0, 258464
GlobalTracer [candidate] (258.347 ms) : 0, 258347
AppSec [baseline] (177.991 ms) : 0, 177991
AppSec [candidate] (178.556 ms) : 0, 178556
Debugger [baseline] (66.216 ms) : 0, 66216
Debugger [candidate] (66.058 ms) : 0, 66058
Remote Config [baseline] (638.592 µs) : 0, 639
Remote Config [candidate] (627.869 µs) : 0, 628
Telemetry [baseline] (8.389 ms) : 0, 8389
Telemetry [candidate] (8.292 ms) : 0, 8292
Flare Poller [baseline] (3.582 ms) : 0, 3582
Flare Poller [candidate] (3.581 ms) : 0, 3581
IAST [baseline] (24.211 ms) : 0, 24211
IAST [candidate] (24.187 ms) : 0, 24187
section iast
crashtracking [baseline] (1.182 ms) : 0, 1182
crashtracking [candidate] (1.186 ms) : 0, 1186
BytebuddyAgent [baseline] (796.816 ms) : 0, 796816
BytebuddyAgent [candidate] (796.903 ms) : 0, 796903
AgentMeter [baseline] (11.375 ms) : 0, 11375
AgentMeter [candidate] (11.408 ms) : 0, 11408
GlobalTracer [baseline] (247.492 ms) : 0, 247492
GlobalTracer [candidate] (247.512 ms) : 0, 247512
AppSec [baseline] (27.257 ms) : 0, 27257
AppSec [candidate] (26.423 ms) : 0, 26423
Debugger [baseline] (69.491 ms) : 0, 69491
Debugger [candidate] (70.449 ms) : 0, 70449
Remote Config [baseline] (533.328 µs) : 0, 533
Remote Config [candidate] (523.227 µs) : 0, 523
Telemetry [baseline] (9.727 ms) : 0, 9727
Telemetry [candidate] (9.119 ms) : 0, 9119
Flare Poller [baseline] (3.541 ms) : 0, 3541
Flare Poller [candidate] (3.311 ms) : 0, 3311
IAST [baseline] (25.364 ms) : 0, 25364
IAST [candidate] (25.36 ms) : 0, 25360
section profiling
ProfilingAgent [baseline] (94.247 ms) : 0, 94247
ProfilingAgent [candidate] (94.238 ms) : 0, 94238
crashtracking [baseline] (1.177 ms) : 0, 1177
crashtracking [candidate] (1.157 ms) : 0, 1157
BytebuddyAgent [baseline] (688.585 ms) : 0, 688585
BytebuddyAgent [candidate] (683.463 ms) : 0, 683463
AgentMeter [baseline] (9.087 ms) : 0, 9087
AgentMeter [candidate] (9.078 ms) : 0, 9078
GlobalTracer [baseline] (217.102 ms) : 0, 217102
GlobalTracer [candidate] (215.69 ms) : 0, 215690
AppSec [baseline] (32.268 ms) : 0, 32268
AppSec [candidate] (32.142 ms) : 0, 32142
Debugger [baseline] (65.681 ms) : 0, 65681
Debugger [candidate] (64.979 ms) : 0, 64979
Remote Config [baseline] (572.119 µs) : 0, 572
Remote Config [candidate] (562.643 µs) : 0, 563
Telemetry [baseline] (7.755 ms) : 0, 7755
Telemetry [candidate] (7.664 ms) : 0, 7664
Flare Poller [baseline] (4.204 ms) : 0, 4204
Flare Poller [candidate] (4.225 ms) : 0, 4225
Profiling [baseline] (94.805 ms) : 0, 94805
Profiling [candidate] (94.817 ms) : 0, 94817

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master ygree/llmobs-systest-fixes
git_commit_date 1773939812 1773967007
git_commit_sha 5580c61 661ea70
release_version 1.61.0-SNAPSHOT~5580c61ac4 1.60.0-SNAPSHOT~661ea70f3f
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1773969501 1773969501
ci_job_id 1524138150 1524138150
ci_pipeline_id 103644030 103644030
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-ivq6z8wy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-ivq6z8wy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvements and 2 performance regressions! Performance is the same for 18 metrics, 15 unstable metrics.

scenario:load:insecure-bank:profiling:high_load
  • Δ mean agg_http_req_duration_p50: better, [-258.871µs; -99.257µs] or [-14.096%; -5.405%]
  • Δ mean agg_http_req_duration_p95: unstable, [-1347.503µs; -351.261µs] or [-23.946%; -6.242%]
  • Δ mean throughput: unstable, [+58.219op/s; +565.594op/s] or [+3.066%; +29.782%]
  • candidate mean: agg_http_req_duration_p50 1.657ms, agg_http_req_duration_p95 4.778ms, throughput 2211.000op/s
  • baseline mean: agg_http_req_duration_p50 1.836ms, agg_http_req_duration_p95 5.627ms, throughput 1899.094op/s
scenario:load:insecure-bank:iast:high_load
  • Δ mean agg_http_req_duration_p50: worse, [+100.869µs; +173.831µs] or [+4.297%; +7.404%]
  • Δ mean agg_http_req_duration_p95: worse, [+177.296µs; +601.339µs] or [+2.551%; +8.653%]
  • Δ mean throughput: unstable, [-255.490op/s; +71.928op/s] or [-16.911%; +4.761%]
  • candidate mean: agg_http_req_duration_p50 2.485ms, agg_http_req_duration_p95 7.339ms, throughput 1419.000op/s
  • baseline mean: agg_http_req_duration_p50 2.348ms, agg_http_req_duration_p95 6.950ms, throughput 1510.781op/s
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.178 ms) : 1167, 1189
.   : milestone, 1178,
iast (3.024 ms) : 2984, 3064
.   : milestone, 3024,
iast_FULL (5.861 ms) : 5803, 5920
.   : milestone, 5861,
iast_GLOBAL (3.635 ms) : 3572, 3698
.   : milestone, 3635,
profiling (2.39 ms) : 2367, 2412
.   : milestone, 2390,
tracing (1.778 ms) : 1762, 1793
.   : milestone, 1778,
section candidate
no_agent (1.212 ms) : 1200, 1224
.   : milestone, 1212,
iast (3.224 ms) : 3179, 3269
.   : milestone, 3224,
iast_FULL (5.934 ms) : 5874, 5994
.   : milestone, 5934,
iast_GLOBAL (3.566 ms) : 3501, 3630
.   : milestone, 3566,
profiling (2.042 ms) : 2024, 2061
.   : milestone, 2042,
tracing (1.792 ms) : 1777, 1807
.   : milestone, 1792,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.178 ms [1.167 ms, 1.189 ms] -
iast 3.024 ms [2.984 ms, 3.064 ms] 1.846 ms (156.7%)
iast_FULL 5.861 ms [5.803 ms, 5.92 ms] 4.683 ms (397.6%)
iast_GLOBAL 3.635 ms [3.572 ms, 3.698 ms] 2.457 ms (208.6%)
profiling 2.39 ms [2.367 ms, 2.412 ms] 1.212 ms (102.9%)
tracing 1.778 ms [1.762 ms, 1.793 ms] 599.953 µs (50.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.212 ms [1.2 ms, 1.224 ms] -
iast 3.224 ms [3.179 ms, 3.269 ms] 2.011 ms (165.9%)
iast_FULL 5.934 ms [5.874 ms, 5.994 ms] 4.722 ms (389.5%)
iast_GLOBAL 3.566 ms [3.501 ms, 3.63 ms] 2.354 ms (194.1%)
profiling 2.042 ms [2.024 ms, 2.061 ms] 830.007 µs (68.5%)
tracing 1.792 ms [1.777 ms, 1.807 ms] 579.913 µs (47.8%)
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (16.795 ms) : 16635, 16955
.   : milestone, 16795,
appsec (18.917 ms) : 18726, 19108
.   : milestone, 18917,
code_origins (18.247 ms) : 18064, 18430
.   : milestone, 18247,
iast (17.787 ms) : 17609, 17965
.   : milestone, 17787,
profiling (18.689 ms) : 18503, 18875
.   : milestone, 18689,
tracing (18.129 ms) : 17947, 18310
.   : milestone, 18129,
section candidate
no_agent (17.098 ms) : 16927, 17269
.   : milestone, 17098,
appsec (18.665 ms) : 18474, 18856
.   : milestone, 18665,
code_origins (17.847 ms) : 17670, 18025
.   : milestone, 17847,
iast (17.525 ms) : 17353, 17698
.   : milestone, 17525,
profiling (18.391 ms) : 18204, 18578
.   : milestone, 18391,
tracing (18.746 ms) : 18557, 18935
.   : milestone, 18746,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 16.795 ms [16.635 ms, 16.955 ms] -
appsec 18.917 ms [18.726 ms, 19.108 ms] 2.122 ms (12.6%)
code_origins 18.247 ms [18.064 ms, 18.43 ms] 1.451 ms (8.6%)
iast 17.787 ms [17.609 ms, 17.965 ms] 992.025 µs (5.9%)
profiling 18.689 ms [18.503 ms, 18.875 ms] 1.894 ms (11.3%)
tracing 18.129 ms [17.947 ms, 18.31 ms] 1.333 ms (7.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 17.098 ms [16.927 ms, 17.269 ms] -
appsec 18.665 ms [18.474 ms, 18.856 ms] 1.567 ms (9.2%)
code_origins 17.847 ms [17.67 ms, 18.025 ms] 749.055 µs (4.4%)
iast 17.525 ms [17.353 ms, 17.698 ms] 427.059 µs (2.5%)
profiling 18.391 ms [18.204 ms, 18.578 ms] 1.292 ms (7.6%)
tracing 18.746 ms [18.557 ms, 18.935 ms] 1.648 ms (9.6%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master ygree/llmobs-systest-fixes
git_commit_date 1773939812 1773967007
git_commit_sha 5580c61 661ea70
release_version 1.61.0-SNAPSHOT~5580c61ac4 1.60.0-SNAPSHOT~661ea70f3f
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1773969210 1773969210
ci_job_id 1524138151 1524138151
ci_pipeline_id 103644030 103644030
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-2-mtrznoae 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-2-mtrznoae 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.808 s) : 14808000, 14808000
.   : milestone, 14808000,
appsec (14.711 s) : 14711000, 14711000
.   : milestone, 14711000,
iast (18.493 s) : 18493000, 18493000
.   : milestone, 18493000,
iast_GLOBAL (18.016 s) : 18016000, 18016000
.   : milestone, 18016000,
profiling (15.504 s) : 15504000, 15504000
.   : milestone, 15504000,
tracing (14.872 s) : 14872000, 14872000
.   : milestone, 14872000,
section candidate
no_agent (15.496 s) : 15496000, 15496000
.   : milestone, 15496000,
appsec (14.996 s) : 14996000, 14996000
.   : milestone, 14996000,
iast (18.438 s) : 18438000, 18438000
.   : milestone, 18438000,
iast_GLOBAL (17.796 s) : 17796000, 17796000
.   : milestone, 17796000,
profiling (14.991 s) : 14991000, 14991000
.   : milestone, 14991000,
tracing (15.121 s) : 15121000, 15121000
.   : milestone, 15121000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.808 s [14.808 s, 14.808 s] -
appsec 14.711 s [14.711 s, 14.711 s] -97.0 ms (-0.7%)
iast 18.493 s [18.493 s, 18.493 s] 3.685 s (24.9%)
iast_GLOBAL 18.016 s [18.016 s, 18.016 s] 3.208 s (21.7%)
profiling 15.504 s [15.504 s, 15.504 s] 696.0 ms (4.7%)
tracing 14.872 s [14.872 s, 14.872 s] 64.0 ms (0.4%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.496 s [15.496 s, 15.496 s] -
appsec 14.996 s [14.996 s, 14.996 s] -500.0 ms (-3.2%)
iast 18.438 s [18.438 s, 18.438 s] 2.942 s (19.0%)
iast_GLOBAL 17.796 s [17.796 s, 17.796 s] 2.3 s (14.8%)
profiling 14.991 s [14.991 s, 14.991 s] -505.0 ms (-3.3%)
tracing 15.121 s [15.121 s, 15.121 s] -375.0 ms (-2.4%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~661ea70f3f, baseline=1.61.0-SNAPSHOT~5580c61ac4
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.487 ms) : 1475, 1499
.   : milestone, 1487,
appsec (2.526 ms) : 2472, 2580
.   : milestone, 2526,
iast (2.261 ms) : 2193, 2330
.   : milestone, 2261,
iast_GLOBAL (2.318 ms) : 2249, 2388
.   : milestone, 2318,
profiling (2.091 ms) : 2037, 2146
.   : milestone, 2091,
tracing (2.067 ms) : 2014, 2120
.   : milestone, 2067,
section candidate
no_agent (1.481 ms) : 1470, 1493
.   : milestone, 1481,
appsec (3.828 ms) : 3602, 4055
.   : milestone, 3828,
iast (2.277 ms) : 2208, 2346
.   : milestone, 2277,
iast_GLOBAL (2.314 ms) : 2245, 2383
.   : milestone, 2314,
profiling (2.547 ms) : 2330, 2763
.   : milestone, 2547,
tracing (2.068 ms) : 2015, 2121
.   : milestone, 2068,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.487 ms [1.475 ms, 1.499 ms] -
appsec 2.526 ms [2.472 ms, 2.58 ms] 1.039 ms (69.9%)
iast 2.261 ms [2.193 ms, 2.33 ms] 774.534 µs (52.1%)
iast_GLOBAL 2.318 ms [2.249 ms, 2.388 ms] 831.54 µs (55.9%)
profiling 2.091 ms [2.037 ms, 2.146 ms] 604.365 µs (40.6%)
tracing 2.067 ms [2.014 ms, 2.12 ms] 580.44 µs (39.0%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.481 ms [1.47 ms, 1.493 ms] -
appsec 3.828 ms [3.602 ms, 4.055 ms] 2.347 ms (158.4%)
iast 2.277 ms [2.208 ms, 2.346 ms] 796.012 µs (53.7%)
iast_GLOBAL 2.314 ms [2.245 ms, 2.383 ms] 832.58 µs (56.2%)
profiling 2.547 ms [2.33 ms, 2.763 ms] 1.065 ms (71.9%)
tracing 2.068 ms [2.015 ms, 2.121 ms] 586.444 µs (39.6%)

@ygree ygree force-pushed the ygree/llmobs-systest-fixes branch from 5cd257e to cbd6226 February 24, 2026 09:31
@ygree ygree changed the title from "llmobs: set model tag even when llmobs disabled" to "fix(llmobs): set model tag even when llmobs disabled" Mar 2, 2026
ygree added 23 commits March 2, 2026 13:30
…wthTestOpenAiLlmInteractions::test_completion
…d with python openai instrumentation and system-tests
… with variables + chat_template, longest-first overlap handling) and support map-based LLM input serialization (messages + prompt) in LLMObs mapper. Also filter empty instruction messages to match system-test expectations.
…st and return [image] (not empty) when stripped input_image URLs are missing, aligning mixed-input chat_template output with expected behavior.
…output.messages from request params so existing error-span tests pass.
…JSON argument parsing and remove duplicate manual parsing logic from ResponseDecorator.
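
The commit message above that mentions "variables + chat_template, longest-first overlap handling" is truncated, so the exact algorithm is not visible here. One plausible reading is that a template is reconstructed from rendered prompt text by replacing each variable value with a placeholder, processing longer values first so that a value contained in another does not break it. A minimal sketch under that assumption (the class `ChatTemplateSketch`, the method `toTemplate`, and the `{{name}}` placeholder syntax are all hypothetical):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: not the ResponseDecorator logic from this PR. */
final class ChatTemplateSketch {

  /** Replaces variable values in rendered text with {{name}} placeholders, longest value first. */
  static String toTemplate(String renderedText, Map<String, String> variables) {
    List<Map.Entry<String, String>> entries = new ArrayList<>();
    for (Map.Entry<String, String> e : variables.entrySet()) {
      // Skip null/empty values: replacing "" would corrupt the text.
      if (e.getValue() != null && !e.getValue().isEmpty()) {
        entries.add(e);
      }
    }
    // Longest values first, so "New York City" is handled before "New York".
    entries.sort(
        Comparator.comparingInt((Map.Entry<String, String> e) -> e.getValue().length())
            .reversed());

    String template = renderedText;
    for (Map.Entry<String, String> e : entries) {
      template = template.replace(e.getValue(), "{{" + e.getKey() + "}}");
    }
    return template;
  }

  public static void main(String[] args) {
    Map<String, String> variables = new LinkedHashMap<>();
    variables.put("region", "New York");
    variables.put("city", "New York City");
    System.out.println(
        toTemplate("Weather in New York City, not New York state", variables));
  }
}
```

With shortest-first ordering, "New York" would be replaced inside "New York City" and the `city` variable would never match. The sketch does not guard against a short value that happens to match text inside an already inserted placeholder; a real implementation would need to handle that case.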
@ygree ygree changed the title from "fix(llmobs): set model tag even when llmobs disabled" to "fix(llmobs): openai-java payload mapping for responses, tool metadata, and prompt tracking" Mar 6, 2026
@ygree ygree added the "tag: ai generated" (Largely based on code generated by an AI or LLM) and "tag: no release notes" (Changes to exclude from release notes) labels Mar 6, 2026
@ygree ygree marked this pull request as ready for review March 6, 2026 13:46
@ygree ygree requested review from a team as code owners March 6, 2026 13:46
ygree commented Mar 17, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c879ba692

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ygree ygree requested a review from a team as a code owner March 20, 2026 00:37
@ygree ygree requested review from amarziali and removed request for a team March 20, 2026 00:37