feat(debugger): add special handling for very large collections/objects #6912

watson · 2025-11-13T18:17:36Z

What does this PR do?

When a snapshot is being collected as part of a Dynamic Instrumentation/Live Debugger line probe, the following new logic has been added:

- Skip enumerating collections with more than 500 elements/entries. Mark them with notCapturedReason: 'Large collection with too many elements (skip threshold: 500)' and record their size.
- Collect objects with more than 500 properties the first time a probe encounters them, but flag the probe to prevent future snapshot attempts for that probe.
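
As a rough illustration of the first rule, here is a minimal sketch. It is not the code from this PR: `captureCollection`, its parameters, and the constant name `LARGE_COLLECTION_SKIP_THRESHOLD` are hypothetical; only the threshold value and the notCapturedReason text come from the description above.

```javascript
'use strict'

// Hypothetical constant name; the description only states the value 500.
const LARGE_COLLECTION_SKIP_THRESHOLD = 500

/**
 * Decide how a collection is represented in a snapshot.
 * `size` is the number of elements/entries, and `enumerate` is a hypothetical
 * callback standing in for the (potentially slow) element lookup.
 */
function captureCollection (type, size, enumerate) {
  if (size > LARGE_COLLECTION_SKIP_THRESHOLD) {
    // Skip enumeration entirely, but still record why, and how large it was.
    return {
      type,
      notCapturedReason:
        `Large collection with too many elements (skip threshold: ${LARGE_COLLECTION_SKIP_THRESHOLD})`,
      size
    }
  }
  return { type, elements: enumerate() }
}

const small = captureCollection('Array', 3, () => [1, 2, 3])
const large = captureCollection('Array', 501, () => {
  throw new Error('enumeration should have been skipped')
})
```

In this sketch a collection with exactly 500 elements is still enumerated, since the rule applies to collections with more than 500.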

Error reporting

For the case where an object with more than 500 properties was detected, an evaluation error will be added to the probe result, letting the user know that no future snapshots will be collected for this version of the probe within the current process. This evaluation error will also be added to all future probe results of this version of the probe within the current process.
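
The behaviour described above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `buildProbeResult`, the `snapshotsDisabled` map, the `largestPropertyCount` field, and the error message wording are all hypothetical. `LARGE_OBJECT_SKIP_THRESHOLD` is the constant name visible in the review diff, and its value of 500 is taken from the description.

```javascript
'use strict'

// Name taken from the review diff; value taken from the PR description.
const LARGE_OBJECT_SKIP_THRESHOLD = 500

// Hypothetical per-process state: "<probe id>@<probe version>" -> error message.
const snapshotsDisabled = new Map()

function buildProbeResult (probe, collectSnapshot) {
  const key = `${probe.id}@${probe.version}`
  const result = { probeId: probe.id, evaluationErrors: [] }

  // A previous hit on this probe version flagged it: skip collection,
  // but keep reporting the same evaluation error.
  const earlierError = snapshotsDisabled.get(key)
  if (earlierError !== undefined) {
    result.evaluationErrors.push(earlierError)
    return result
  }

  // `collectSnapshot` is a hypothetical callback returning the captured locals
  // and the largest number of properties seen on any single object.
  const { locals, largestPropertyCount } = collectSnapshot()
  result.captures = { locals }

  if (largestPropertyCount > LARGE_OBJECT_SKIP_THRESHOLD) {
    const message =
      `Object with more than ${LARGE_OBJECT_SKIP_THRESHOLD} properties detected; ` +
      'no further snapshots will be collected for this probe version in this process'
    snapshotsDisabled.set(key, message)
    result.evaluationErrors.push(message)
  }
  return result
}
```

Keying the flag on both id and version means an updated probe gets a fresh attempt, and keeping the map in memory means a restarted process does too.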

Motivation

Handle edge cases where an object has a lot of properties, or where collections (arrays/sets/maps) have a lot of elements.

In those cases, a single Chrome DevTools Protocol Runtime.getProperties request will take a long time to serialize their properties/elements, which can significantly overshoot the time budget (as ongoing CDP requests are allowed to complete before aborting).

Additional Notes

The implementation in this PR has the following downsides:

- The threshold of 500 was chosen by looking at average CDP request times on my own machine, but might not be the right number on other hardware.
- A user might see a significant overhead the first time a probe tries to snapshot a very large object with a lot of properties on the same level. However, it's a one-time overhead (only reappearing if the Node.js process is restarted or the probe is updated/re-initialized).
- A snapshot will never contain any entries/elements from collections that have a size larger than 500 entries/elements.

watson · 2025-11-13T18:17:54Z

feat(debugger): add special handling for very large collections/objects #6912 👈
feat(debugger): add snapshot time budget #6897 : 1 other dependent PR (#6921)
master

This stack of pull requests is managed by Graphite.

github-actions · 2025-11-13T18:17:58Z

Overall package size

Self size: 13.43 MB
Deduped: 113.62 MB
No deduping: 128.64 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.7.0 | 35.02 MB | 35.02 MB |
| @datadog/native-appsec | 10.3.0 | 20.73 MB | 20.74 MB |
| @datadog/pprof | 5.12.0 | 11.19 MB | 11.57 MB |
| @datadog/native-iast-taint-tracking | 4.1.0 | 9.01 MB | 9.02 MB |
| @opentelemetry/resources | 1.30.1 | 557.67 kB | 7.71 MB |
| @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB |
| protobufjs | 7.5.4 | 2.95 MB | 5.83 MB |
| @datadog/wasm-js-rewriter | 5.0.1 | 2.82 MB | 3.53 MB |
| @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB |
| @opentelemetry/api-logs | 0.208.0 | 199.48 kB | 1.42 MB |
| @opentelemetry/api | 1.9.0 | 1.22 MB | 1.22 MB |
| jsonpath-plus | 10.3.0 | 617.18 kB | 1.08 MB |
| import-in-the-middle | 1.15.0 | 127.66 kB | 856.24 kB |
| lru-cache | 10.4.3 | 804.3 kB | 804.3 kB |
| @datadog/openfeature-node-server | 0.2.0 | 118.51 kB | 437.19 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| source-map | 0.7.6 | 185.63 kB | 185.63 kB |
| pprof-format | 2.2.1 | 163.06 kB | 163.06 kB |
| @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB |
| @isaacs/ttlcache | 2.1.2 | 90.79 kB | 90.79 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 7.0.5 | 63.38 kB | 63.38 kB |
| istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB |
| rfdc | 1.4.1 | 27.15 kB | 27.15 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| shell-quote | 1.8.3 | 23.74 kB | 23.74 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| semifies | 1.0.0 | 15.84 kB | 15.84 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| ttl-set | 1.0.0 | 4.61 kB | 9.69 kB |
| mutexify | 1.4.0 | 5.71 kB | 8.74 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB |
| escape-string-regexp | 5.0.0 | 3.66 kB | 3.66 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

datadog-official · 2025-11-13T18:20:43Z

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: d42205f

codecov · 2025-11-13T18:28:04Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.94%. Comparing base (0bb1f17) to head (d42205f).

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #6912   +/-   ##
=======================================
  Coverage   84.94%   84.94%           
=======================================
  Files         514      514           
  Lines       21754    21754           
=======================================
  Hits        18478    18478           
  Misses       3276     3276


pr-commenter · 2025-11-14T11:45:07Z

Benchmarks

Benchmark execution time: 2025-12-02 09:53:21

Comparing candidate commit d42205f in PR branch watson/DEBUG-4727/handle-large-objects with baseline commit 0bb1f17 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 294 metrics, 26 unstable metrics.

packages/dd-trace/test/debugger/devtools_client/snapshot/target-code/max-collection-size.js

IlyasShabi · 2025-12-01T13:41:10Z

packages/dd-trace/src/debugger/devtools_client/index.js

-        snapshot.captures = {
-          lines: { [probe.location.lines[0]]: { locals: state } }
-        }
+      if (captureErrors.length > 0) {


I think it would be safer to initialize captureErrors with [] or to use captureErrors?.length, to avoid undefined behavior when numberOfProbesWithSnapshots === 0 (if that is possible).

Good call! I fixed it in the latest commit 👍
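
For illustration, the two options suggested in the comment above might look like this. Both helper functions are hypothetical and only demonstrate the language features involved; they are not taken from the PR.

```javascript
'use strict'

// Option 1: optional chaining. `undefined?.length` evaluates to undefined,
// and `undefined > 0` is false, so a missing array is treated as "no errors".
function hasCaptureErrors (captureErrors) {
  return captureErrors?.length > 0
}

// Option 2: default to an empty array so later code can rely on `.length`.
function countCaptureErrors (captureErrors = []) {
  return captureErrors.length
}
```

Either form avoids a TypeError when the array was never assigned. Defaulting to [] has the small advantage that every later use of the variable is safe, not just the one guarded check.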

IlyasShabi · 2025-12-01T14:02:34Z

packages/dd-trace/src/debugger/devtools_client/snapshot/index.js

-    // Consider if we could set errors just for the part of the scope chain that throws during collection.
-    log.error('[debugger:devtools_client] Error getting local state for call frame', err)
-    return returnError
+    if (opts.ctx.deadlineReached === true) break // TODO: Bad UX; Variables in remaining scopes are silently dropped


This should be in a finally block no?

Q: why not adding an error to captureErrors when we reach the deadline?

> This should be in a finally block no?

Since we're not re-throwing inside of the catch block, having no finally block vs having a finally block is functionally equivalent: The first line of code after the end of the catch block is going to be executed after either the try or the catch block successfully ends.
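
A small, self-contained sketch of that equivalence (the function names are made up for the illustration and do not come from the PR):

```javascript
'use strict'

// Code placed after a try/catch that never re-throws...
function withoutFinally (work) {
  const steps = []
  try {
    work()
    steps.push('try completed')
  } catch (err) {
    steps.push('caught')
  }
  steps.push('after')
  return steps
}

// ...runs at the same point as the same code placed in a finally block.
function withFinally (work) {
  const steps = []
  try {
    work()
    steps.push('try completed')
  } catch (err) {
    steps.push('caught')
  } finally {
    steps.push('after')
  }
  return steps
}

const succeed = () => {}
const fail = () => { throw new Error('boom') }
```

The two forms only diverge if the try or catch block leaves early via return, break, continue, or a new throw; in those cases only the finally variant still runs the trailing code.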

> Q: why not adding an error to captureErrors when we reach the deadline?

If the calling code sees any errors in the captureErrors array, they are treated as permanent errors and future capturing of snapshots for this probe is disabled. We don't want that, as reaching the deadline is not a permanent error, just a normal stopgap to keep the overhead low. So we don't want to add an error to the captureErrors array when we reach the deadline.
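
The distinction can be sketched as below. `ctx.deadlineReached` and `captureErrors` are names from the diff and the discussion; `collectLocals`, the `scope.read` callback, and the shape of `ctx` are assumptions made only for this example.

```javascript
'use strict'

// Walk the scope chain, treating thrown errors and the deadline differently.
function collectLocals (scopes, ctx) {
  const locals = {}
  for (const scope of scopes) {
    try {
      // `scope.read` is a hypothetical stand-in for the per-scope collection.
      Object.assign(locals, scope.read(ctx))
    } catch (err) {
      // In this sketch, anything recorded here is seen by the caller as permanent.
      ctx.captureErrors.push(err)
    }
    // Reaching the deadline is not an error: stop quietly and keep what we have.
    if (ctx.deadlineReached === true) break
  }
  return locals
}

const ctx = { deadlineReached: false, captureErrors: [] }
const locals = collectLocals([
  { read: () => ({ a: 1 }) },
  { read: (c) => { c.deadlineReached = true; return { b: 2 } } },
  { read: () => ({ c: 3 }) } // never read: the deadline was reached first
], ctx)
```

Here `locals` ends up holding only `a` and `b`, and `ctx.captureErrors` stays empty, so a caller following the rule described above would not disable future snapshots for the probe.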

Handle edge cases where an object has a lot of properties, or where collections (arrays/sets/maps) have a lot of elements. In those cases, a single Chrome DevTools Protocol `Runtime.getProperties` request will take a long time to serialize their properties/elements, which can significantly overshoot the time budget (as ongoing CDP requests are allowed to complete before aborting).

This is dealt with in the following way:

- Skip enumerating large collections (arrays/maps/sets) by parsing size from descriptions; mark with notCapturedReason: "collectionSize" and record size.
- Large objects are still collected the first time, but objects with >500 properties set a fatalSnapshotError to prevent future snapshot attempts for that probe and emit an error to the diagnostics endpoint.
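
"Parsing size from descriptions" could look roughly like the sketch below. It assumes that the description string of a collection ends with its size in parentheses, for example `Array(1000)` or `Map(3)`, which is how V8 typically renders them. `parseCollectionSize` is a hypothetical helper, not a function from this PR.

```javascript
'use strict'

/**
 * Extract the element count from a collection description such as
 * "Array(1000)", "Map(3)" or "Set(0)". Returns null when the description
 * carries no trailing "(<digits>)" part.
 */
function parseCollectionSize (description) {
  const match = /\((\d+)\)$/.exec(description)
  return match === null ? null : Number(match[1])
}
```

Reading the size from the description avoids issuing a property lookup just to learn how large the collection is, which is the expensive step this change tries to skip.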

tylfin · 2025-12-02T15:32:15Z

packages/dd-trace/src/debugger/devtools_client/snapshot/collector.js

    // Trim the number of properties on the object if there's too many.
    const size = result.length
+    if (size > LARGE_OBJECT_SKIP_THRESHOLD) {
+      opts.ctx.captureErrors.push(new Error(


Out of curiosity, do these errors get bubbled to customers? It feels like this should fall into a "warning" category since there isn't really an action for them to take unless we want to disable probes that hit too many of these conditions

They are added as evaluation errors to the probe result: both to the current probe result and to all subsequent probe results of the current probe version, as future invocations of this probe will have snapshot collection disabled. In the future we need a better way to communicate errors like these to the backend, but right now it's either as evaluation errors or as an error on the diagnostics endpoint. The latter would not work, as it would be cleared a few milliseconds later when the probe result is sent.

tylfin

LGTM

watson self-assigned this Nov 13, 2025

watson mentioned this pull request Nov 13, 2025

feat(debugger): add snapshot time budget #6897

Merged

watson added the semver-minor label Nov 13, 2025

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from b11baaa to bae04ee Compare November 13, 2025 18:21

watson force-pushed the watson/DEBUG-4727/debugger-snapshot-time-budget branch from b9b7fa5 to ddd5ec0 Compare November 14, 2025 10:22

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch 2 times, most recently from 70a9f50 to eb9dfa6 Compare November 14, 2025 10:27

watson force-pushed the watson/DEBUG-4727/debugger-snapshot-time-budget branch from ddd5ec0 to bfd99a1 Compare November 14, 2025 10:27

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from eb9dfa6 to c66996e Compare November 14, 2025 10:31

watson force-pushed the watson/DEBUG-4727/debugger-snapshot-time-budget branch 2 times, most recently from 65c4e4b to 4a970c8 Compare November 14, 2025 11:31

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from c66996e to 22ae693 Compare November 14, 2025 11:31

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch 4 times, most recently from 76733ce to 96bba09 Compare November 15, 2025 05:18

watson force-pushed the watson/DEBUG-4727/debugger-snapshot-time-budget branch from 4a970c8 to 6f67548 Compare November 15, 2025 05:18

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from 96bba09 to c680828 Compare November 15, 2025 05:41

watson force-pushed the watson/DEBUG-4727/debugger-snapshot-time-budget branch from 6f67548 to 495678c Compare November 15, 2025 05:41

watson mentioned this pull request Nov 15, 2025

fix: refactor debugger snapshot collector code #6921

Merged

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from c680828 to f42bb33 Compare November 17, 2025 10:43

watson force-pushed the watson/DEBUG-4727/debugger-snapshot-time-budget branch from 495678c to cd88600 Compare November 17, 2025 10:43

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from f42bb33 to f9c32b5 Compare November 17, 2025 10:51

watson force-pushed the watson/DEBUG-4727/debugger-snapshot-time-budget branch from cd88600 to 8b643eb Compare November 17, 2025 10:51

watson changed the base branch from watson/DEBUG-4727/debugger-snapshot-time-budget to graphite-base/6912 November 17, 2025 14:36

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch 5 times, most recently from 6599c43 to 7b21825 Compare November 21, 2025 12:26

watson marked this pull request as ready for review November 21, 2025 12:44

watson requested review from a team as code owners November 21, 2025 12:44

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch 2 times, most recently from 57f64ef to a1e166b Compare November 21, 2025 15:13

watson added the debugger Dynamic Instrumentation & Live Debugger label Nov 21, 2025

tylfin reviewed Nov 24, 2025

packages/dd-trace/test/debugger/devtools_client/snapshot/target-code/max-collection-size.js (outdated, resolved)

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch 5 times, most recently from 3644a2d to 1e27a7b Compare November 29, 2025 07:40

IlyasShabi reviewed Dec 1, 2025


watson added 5 commits December 2, 2025 09:36

Address review comments

e62b964

Send errors as evaluationErrors instead of diagnostics

288c28f

Fix some types

c52f728

Address review comments

957638a

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from 1e27a7b to 957638a Compare December 2, 2025 08:46

Relax test depending on time passed

d42205f

watson force-pushed the watson/DEBUG-4727/handle-large-objects branch from 782a2f4 to d42205f Compare December 2, 2025 09:42

tylfin reviewed Dec 2, 2025


tylfin approved these changes Dec 2, 2025

