Add sourcemap upload failure metrics #392
Add metrics for Error Tracking sourcemap upload retries and final upload failures. These make intake instability observable without parsing CI logs. Emit the metrics only when the existing metrics plugin is enabled, preserving the default behavior for customers who use Error Tracking without build metrics. Batch metric emission after each upload phase and keep metric publishing non-fatal so sourcemap uploads are not affected by telemetry failures. Include tags for bundler, plugin version, service, site, status code, error type, and CI job context when available.
All green: all tests passed. Commit SHA: d1d3c15
Replace the sourcemap-specific metrics sender with a shared internal metrics collector exposed on the global context. This lets error-tracking add upload retry and failure metrics while the metrics plugin still owns filtering, prefixing, hooks, and sending. Flush any collected metrics at true end so late sourcemap uploads keep the old awaited send behavior.
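A minimal sketch of what such a shared collector could look like, assuming a simple queue that other plugins write to and the metrics plugin drains. The names `MetricsCollector`, `add`, `drain`, and `globalContext.metrics` are hypothetical and are not taken from the PR.

```typescript
// Hypothetical shapes: the PR's real types and names may differ.
type Metric = {
    metric: string;
    value: number;
    tags: string[];
};

class MetricsCollector {
    private pending: Metric[] = [];

    // Called by any plugin (for example error-tracking) to queue a metric.
    add(metric: Metric): void {
        this.pending.push(metric);
    }

    // Called by the metrics plugin: returns everything queued so far and
    // empties the queue, so a later flush never sends the same metric twice.
    drain(): Metric[] {
        const drained = this.pending;
        this.pending = [];
        return drained;
    }
}

// Stand-in for the global context shared across plugins.
const globalContext = { metrics: new MetricsCollector() };

globalContext.metrics.add({
    metric: 'sourcemaps.upload.retry',
    value: 2,
    tags: ['bundler:webpack'],
});

// The main send drains the queue; the late flush finds nothing left to send.
const firstFlush = globalContext.metrics.drain();
const secondFlush = globalContext.metrics.drain();
```

Draining on read is one way to let both the regular send and the late flush call the same function without double-counting.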
`bundler:${context.bundlerName}`,
`plugin_version:${context.version}`,
`service:${options.service}`,
`site:${context.site}`,
...(process.env.CI_JOB_NAME ? [`jobname:${normalizeTagValue(process.env.CI_JOB_NAME)}`] : []),
...(process.env.BRANCH_TYPE
    ? [`branchtype:${normalizeTagValue(process.env.BRANCH_TYPE)}`]
    : []),
I feel like these should be the default for all metrics we send through the metrics plugin.
Or at least, it feels weird to me to have different metric definitions for the base metrics and these new metrics.
They should go through a unified pipeline.
Also, CI_JOB_NAME and BRANCH_TYPE are pretty custom to be included here.
Aren't all of these already handled on the user's side (web-ui)?
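For reference, the tag construction under discussion can be sketched as one helper that any metric could share. This is only an illustration: `buildTags` and `TagInput` are hypothetical names, the environment is passed in as a plain object rather than read from `process.env`, and the body of `normalizeTagValue` is an assumption, since the diff excerpt shows only its call sites.

```typescript
// Assumed normalization: lowercase, then replace anything outside a
// conservative allowed set with an underscore. The real helper may differ.
const normalizeTagValue = (value: string): string =>
    value.toLowerCase().replace(/[^a-z0-9_:.\/-]/g, '_');

// Hypothetical input shape mirroring the fields used in the diff excerpt.
type TagInput = {
    bundlerName: string;
    version: string;
    service: string;
    site: string;
};

const buildTags = (
    input: TagInput,
    env: Record<string, string | undefined>,
): string[] => [
    `bundler:${input.bundlerName}`,
    `plugin_version:${input.version}`,
    `service:${input.service}`,
    `site:${input.site}`,
    // CI tags are appended only when the variable is set and non-empty.
    ...(env.CI_JOB_NAME ? [`jobname:${normalizeTagValue(env.CI_JOB_NAME)}`] : []),
    ...(env.BRANCH_TYPE ? [`branchtype:${normalizeTagValue(env.BRANCH_TYPE)}`] : []),
];

const baseInput: TagInput = {
    bundlerName: 'webpack',
    version: '1.0.0',
    service: 'my-service',
    site: 'datadoghq.com',
};

const withCi = buildTags(baseInput, { CI_JOB_NAME: 'Build & Test' });
const withoutCi = buildTags(baseInput, {});
```

Taking the environment as a parameter keeps the helper testable and would make it straightforward to drop or replace the CI-specific tags if they turn out to be too custom.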
A bunch of these helpers are only used for sourcemap metrics but are written generically, as in they are not really sourcemap-specific.
They should instead be part of the main metrics pipeline we already have, so that all the metrics we send can benefit from them.
- const allMetrics = new Set([...universalMetrics, ...pluginMetrics, ...loaderMetrics]);
+ const allMetrics = new Set([
+     ...universalMetrics,
+     ...pluginMetrics,
+     ...loaderMetrics,
+     ...stores.metrics,
+ ]);
+ stores.metrics.clear();
I don't think we should change this part. We should rely only on the "flush" mechanism for these metrics added at build time.
async asyncTrueEnd() {
    await sendCollectedMetrics();
},
It's great to have this flush mechanism here.
But it reveals a design flaw in how we run the asyncTrueEnd hook.
Right now, the sourcemaps are also uploaded as part of the asyncTrueEnd hook, so we may end up with a race condition, as there is no enforced order among the trueEnd hooks.
Maybe we could introduce a new flush hook that would execute after the asyncTrueEnd/syncTrueEnd hooks to ensure there is no race condition for this specific use case.
It conflicts with the trueEnd semantics though 😔 but I don't really have a better idea for this.
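A rough sketch of the two-phase idea, assuming a hypothetical `flush` hook that does not exist today. `Hooks`, `runEndOfBuild`, and the two stand-in plugins are invented for illustration and do not reflect the project's actual hook runner.

```typescript
type Hooks = {
    asyncTrueEnd?: () => Promise<void>;
    flush?: () => Promise<void>; // hypothetical hook, not an existing API
};

const runEndOfBuild = async (plugins: Hooks[]): Promise<void> => {
    // Phase 1: every asyncTrueEnd hook runs, in no guaranteed order.
    await Promise.all(plugins.map((p) => p.asyncTrueEnd?.()));
    // Phase 2: flush hooks only start once phase 1 has fully settled.
    await Promise.all(plugins.map((p) => p.flush?.()));
};

const collected: string[] = [];
const sent: string[] = [];

// Stand-in for error-tracking: records a metric after some async work.
const errorTracking: Hooks = {
    asyncTrueEnd: async () => {
        await Promise.resolve(); // simulated upload latency
        await Promise.resolve();
        collected.push('sourcemaps.upload.failure');
    },
};

// Stand-in for the metrics plugin: sends whatever has been collected.
const metrics: Hooks = {
    flush: async () => {
        sent.push(...collected.splice(0));
    },
};

// The metrics plugin is registered first, yet it still sees the late metric.
const done = runEndOfBuild([metrics, errorTracking]);
```

With a single phase, the metrics stand-in could run before the upload finishes and send nothing. Separating the phases removes the dependency on registration order, at the cost of adding a step after what is nominally the true end.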

What and why?
Add opt-in Datadog metrics for Error Tracking sourcemap upload retries and final upload failures. These metrics make transient sourcemap intake issues and exhausted retry failures observable without parsing CI logs.
The metrics are emitted only when the existing metrics plugin is enabled, so customers using Error Tracking without build metrics keep the same default experience.
How?
Sourcemap uploads record retry and failure counts in memory during each upload phase, then add those counts to a shared internal metrics collector on the global context.
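The counting step might look roughly like the following sketch. It assumes a simple retry loop in which `send` stands in for the real intake request; `uploadWithRetries`, `Counts`, and `toMetrics` are hypothetical names, and dropping zero-valued counts is an assumption rather than confirmed behavior.

```typescript
type UploadResult = { ok: boolean; status: number };
type Counts = { retries: number; failures: number };

// Hypothetical upload loop: one initial attempt plus up to `maxRetries` retries.
const uploadWithRetries = async (
    send: () => Promise<UploadResult>,
    maxRetries: number,
    counts: Counts,
): Promise<boolean> => {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        const result = await send();
        if (result.ok) {
            return true;
        }
        // A retry is counted only if another attempt will actually follow.
        if (attempt < maxRetries) {
            counts.retries += 1;
        }
    }
    // All attempts exhausted: this is a final upload failure.
    counts.failures += 1;
    return false;
};

// Convert the in-memory counts into metrics once the upload phase is over.
// Assumption: counts of zero are not worth emitting.
const toMetrics = (counts: Counts, tags: string[]) =>
    [
        { metric: 'sourcemaps.upload.retry', value: counts.retries, tags },
        { metric: 'sourcemaps.upload.failure', value: counts.failures, tags },
    ].filter((m) => m.value > 0);

// Usage: an upload that fails twice with a 503, then succeeds.
const counts: Counts = { retries: 0, failures: 0 };
let calls = 0;
const flaky = async (): Promise<UploadResult> => {
    calls += 1;
    return calls < 3 ? { ok: false, status: 503 } : { ok: true, status: 200 };
};
const phase = uploadWithRetries(flaky, 3, counts).then((ok) => ({ ok, ...counts }));
```

Accumulating counts per phase and converting them once keeps metric emission to a single batch per phase instead of one call per attempt.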
The metrics plugin owns the actual metrics pipeline: it includes collected upload metrics in the existing metrics: {} payload, applies the normal filtering, prefixing, tags, custom metrics hook, and Datadog send behavior, and awaits the send. It also flushes any metrics collected late at true end so fallback sourcemap uploads keep the same awaited behavior.
Metrics added before the metrics plugin prefix is applied:
sourcemaps.upload.retry
sourcemaps.upload.failure
Tags include bundler, plugin version, service, site, status code, error type, and CI job context when available.
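To illustrate how collected metrics could pass through filtering and then prefixing, here is a small sketch. `applyPipeline` is a hypothetical function; the prefix value, the `.` separator, and the filter signature are assumptions, since the plugin's actual configuration is not shown here.

```typescript
type Metric = { metric: string; value: number; tags: string[] };

// Hypothetical pipeline: filter first, then prefix the surviving metric names.
const applyPipeline = (
    metrics: Metric[],
    prefix: string,
    filter: (m: Metric) => Metric | null,
): Metric[] =>
    metrics
        .map((m) => filter(m))
        // Drop metrics the filter rejected.
        .filter((m): m is Metric => m !== null)
        // An empty prefix leaves names untouched.
        .map((m) => ({ ...m, metric: prefix ? `${prefix}.${m.metric}` : m.metric }));

const collectedMetrics: Metric[] = [
    { metric: 'sourcemaps.upload.retry', value: 1, tags: [] },
    { metric: 'sourcemaps.upload.failure', value: 0, tags: [] },
];

// Example filter (an assumption): keep only metrics with a positive value.
const prefixed = applyPipeline(collectedMetrics, 'build', (m) => (m.value > 0 ? m : null));
```

Because the error-tracking side records the unprefixed names listed above, the prefix stays a concern of the metrics plugin alone.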
Testing
yarn workspace @dd/metrics-plugin typecheck
yarn workspace @dd/error-tracking-plugin typecheck
yarn test:unit packages/plugins/error-tracking/src/sourcemaps/sender.test.ts packages/plugins/error-tracking/src/index.test.ts packages/plugins/metrics/src/index.test.ts