From 0ff8f04052d65d04335e8cc9d4b66fc18a4e9cfd Mon Sep 17 00:00:00 2001 From: May Lee Date: Mon, 15 Jun 2026 14:48:46 -0400 Subject: [PATCH 1/6] add recent worker upgrades --- .../guide/upgrade_worker.md | 136 +++++++++++++++++- 1 file changed, 135 insertions(+), 1 deletion(-) diff --git a/content/en/observability_pipelines/guide/upgrade_worker.md b/content/en/observability_pipelines/guide/upgrade_worker.md index 1237e90fd8c..412bb83deb7 100644 --- a/content/en/observability_pipelines/guide/upgrade_worker.md +++ b/content/en/observability_pipelines/guide/upgrade_worker.md @@ -1,6 +1,6 @@ --- title: Upgrade the Worker Guide -description: Learn about new features, enhancements, and fixes for Worker versions 2.7 to 2.15. +description: Learn about new features, enhancements, and fixes for Worker versions 2.7 to 2.17. disable_toc: false aliases: - /observability_pipelines/guide/upgrade_worker_2_7/ @@ -14,6 +14,135 @@ Datadog recommends updating the Observability Pipelines Worker (OPW) with every This guide goes over how to upgrade to a specific Worker version and the updates for that version. +## Worker version 2.17.0 + +To upgrade to Worker version 2.17.0: + +- Docker: Run the `docker pull` command for the [2.17.0 image][46]. +- Kubernetes: See the [Helm chart][2] and [Upgrade the Worker][37]. +- APT: Run the command `apt-get install observability-pipelines-worker=2.17.0`. +- RPM: Run the command `sudo yum install observability-pipelines-worker-2.17.0`. + +
Breaking change: The Generate Metrics processor now emits the raw field name as the tag key for group_by entries that contain characters outside of [a-zA-Z0-9_@], such as http.status_code and kube-cronjob. Previously, the processor added a trailing underscore (_) to tag keys that contained those characters. For example, the tag key http.status_code became http.status_code_ because the key contained a period (.).

After upgrading to Worker 2.17, update any dashboards, monitors, alerts, or notebooks that use tag keys with the trailing underscore to use those tag keys without the underscore. Historical datapoints remain queryable under the old tag key. Pipelines whose group_by entries only use names matching [a-zA-Z0-9_@]+ are not affected.
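
To find which tag keys need updating, the rename described above can be sketched in a few lines of Python. This is a minimal illustration based only on the rule stated in this note, not the Worker's implementation. The function names (`tag_key_before_2_17`, `tag_key_from_2_17`, `renamed_tag_keys`) are hypothetical.

```python
import re

# Characters this note lists as leaving a group_by tag key unchanged.
_SAFE_KEY = re.compile(r"[a-zA-Z0-9_@]+")


def tag_key_before_2_17(field: str) -> str:
    """Tag key as described for Workers earlier than 2.17 (assumed rule:
    a trailing underscore is added when the name has other characters)."""
    return field if _SAFE_KEY.fullmatch(field) else field + "_"


def tag_key_from_2_17(field: str) -> str:
    """Tag key as described for Worker 2.17 and later: the raw field name."""
    return field


def renamed_tag_keys(group_by: list[str]) -> dict[str, str]:
    """Map old tag key -> new tag key for the entries whose key changes."""
    return {
        tag_key_before_2_17(field): tag_key_from_2_17(field)
        for field in group_by
        if tag_key_before_2_17(field) != tag_key_from_2_17(field)
    }


# Hypothetical group_by list; only the last two entries are affected.
print(renamed_tag_keys(["service", "http.status_code", "kube-cronjob"]))
```

Running the sketch against a pipeline's `group_by` entries gives the old-to-new key pairs to search for in dashboards, monitors, alerts, and notebooks.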
+ +Worker version 2.17.0 gives you access to the following: + +#### New features + +- For the Custom Processor: + - VRL string literals now support `\u{HEX}` Unicode escape sequences. Any valid Unicode scalar value can be expressed, such as `"hello\u{1F30E}world"`. Invalid sequences, such as empty braces, non-hex digits, surrogate codepoints, or values above U+10FFFF, are reported as compile-time errors. + - The `parse_regex` function now accepts dynamic regex patterns (variables and runtime expressions), consistent with `parse_regex_all`. When the pattern is a literal, return type information remains precise based on named capture groups. + +#### Enhancements + +- The Generate Metrics processor's performance has been improved to reduce CPU overhead when using `group_by` tag labels. +- For the Custom Processor, the performance of `parse_regex_all` has been improved by reusing the compiled regex across invocations. +- The Splunk HEC source now supports the `enabled` field on `valid_tokens` entries. Disabled tokens are excluded from authentication and enrichment. + +#### Fixes + +- HTTP metrics emitted by the Quota processor's background sync now have correct component tags (`component_kind:transform`, `component_type:quota`, `component_id:quota_global_state`). +- The Enrichment Table processor using Reference Tables now skips sending empty event batches, preventing fatal errors with disk buffers. +- The Generate Metrics processor now uses a static component ID so that associated metrics share the same `component_id` across workers and restarts. +- An issue with parsing filter queries with whitespaces inside parentheses, such as `service:( web OR api )`, has been fixed. +- An issue with Live Capture for sources with multiple named output ports, such as the OpenTelemetry source, has been fixed. +- An issue where a Worker crash can occur if a source or a processor sends an empty event batch to the next component has been fixed. 
+- For the Custom Processor: + - Error messages and unused variable diagnostics have been fixed. The processor now reports every unhandled error in a single compilation. + - You can now use `SCREAMING_SNAKE` case in functions such as `pascalcase` and `camelcase`. + - The `encode_proto` and `parse_proto` functions now support proto maps whose keys are integers or Booleans, not only strings. + +--- + +## Worker version 2.16.1 + +To upgrade to Worker version 2.16.1: + +- Docker: Run the `docker pull` command for the [2.16.1 image][45]. +- Kubernetes: See the [Helm chart][2] and [Upgrade the Worker][37]. +- APT: Run the command `apt-get install observability-pipelines-worker=2.16.1`. +- RPM: Run the command `sudo yum install observability-pipelines-worker-2.16.1`. + +Worker version 2.16.1 gives you access to the following: + +#### Fixes + +- The Worker no longer fails when fetching configurations in non-US1 datacenters. + +--- + +## Worker version 2.16.0 + +To upgrade to Worker version 2.16.0: + +- Docker: Run the `docker pull` command for the [2.16.0 image][44]. +- Kubernetes: See the [Helm chart][2] and [Upgrade the Worker][37]. +- APT: Run the command `apt-get install observability-pipelines-worker=2.16.0`. +- RPM: Run the command `sudo yum install observability-pipelines-worker-2.16.0`. + +Worker version 2.16.0 gives you access to the following: + +#### New features + +- [Databricks Zerobus][47] destination: Send log data to Databricks Unity Catalog tables using the Zerobus ingestion service. The Databricks destination supports OAuth 2.0 authentication, automatic schema fetching from Unity Catalog, and protobuf batch encoding. +- The Splunk HEC source now accepts an optional `valid_tokens` list for token-based authentication managed using environment variables or a configured secrets backend. +- The Splunk HEC source now supports enriching incoming log events using VRL decoder. 
+- The Amazon S3 destination now supports Apache Parquet batch encoding with flexible schema definitions and configurable compression (Snappy, ZSTD, GZIP, LZ4, or none). +- For the Custom Processor: + - The `encode_proto` function now accepts an `allow_lossy_string_coercion` argument. + - Protobuf encoding now coerces compatible scalar types into the target field type: integers and strings are accepted for `bool` fields, and integers are accepted for `float`/`double` fields. + +#### Enhancements + +- mTLS support has been added to the following sources: Fluent, Logstash, OpenTelemetry (logs and metrics), Splunk HEC, Splunk TCP, HTTP/S Server, Socket, and Syslog. +- Live Capture events now include the UUID of the Worker that sent the event. +- The Reference Tables processor's buffer now emits `buffer_size_events` and `buffer_size_bytes` gauge metrics to replace the deprecated `buffer_events` and `buffer_byte_size` metrics. The deprecated metrics are still being sent for backward compatibility. +- The Datadog Logs destination now supports overrides for per-site logs endpoint using the bootstrap `logs-sites` field or `DD_OP_LOGS_` environment variable, such as `DD_OP_LOGS_US1`. Environment variables take precedence over the bootstrap file. +- Sources now record these distribution metrics: + - `source_send_latency_seconds`: Time spent blocking on a single events chunk send operation + - `source_send_batch_latency_seconds`: Encompassing all chunks within a received events batch + +#### Fixes + +- A race condition in the Reference Tables processor has been fixed to prevent dropping buffered events during a Worker shutdown. +- Issues that have been fixed for Live Capture: + - Captured events in Live Capture were dropped when too many events were sent at the same time. + - Metrics events sent by the Generate Metrics processors were dropped when there was a log processor after the Generate Metrics processor. 
+- The Worker no longer logs `Root metadata expired` or `potential freeze attack` on startup after refreshing embedded Remote Config trusted-root metadata. +- The Splunk HEC source now emits `authentication_failed` as the `error_type` in error logs and metrics when authentication fails due to a missing or invalid authorization header. +- Fixed the Datadog Logs destination healthcheck endpoint computation to preserve site prefixes, such as `us3.`, `us5.`, `ap1.`, when deriving the API URL from intake endpoints. +- An issue where a destination with a configured disk buffer would stall for `batch.timeout_sec` before gracefully reloading has been fixed. This fix also resolves cases where the Worker ignored SIGINT during a pipeline stall. +- Fixed the Custom Processor so an `else` or `else if` keyword can be on a new line after the closing curly brace (`}`) of an `if`-block. + +--- + +## Worker version 2.15.1 + +To upgrade to Worker version 2.15.1: + +- Docker: Run the `docker pull` command for the [2.15.1 image][43]. +- Kubernetes: See the [Helm chart][2] and [Upgrade the Worker][37]. +- APT: Run the command `apt-get install observability-pipelines-worker=2.15.1`. +- RPM: Run the command `sudo yum install observability-pipelines-worker-2.15.1`. + +Worker version 2.15.1 gives you access to the following: + +#### New features + +- The following VRL functions are now available for the Custom Processor: `to_entries`, `from_entries`, and `with_entries` for converting between objects and arrays of key-value pairs (jq-style). + +#### Enhancements + +- The Datadog Metrics destination now uses zstd compression for Datadog API v2 and sketches endpoints, reducing CPU, memory, and bandwidth usage. +- The `flatten` function in the Custom Processor now accepts an optional `except` argument to exclude specific keys from flattening. + +#### Fixes + +- A performance regression in buffer metrics tracking has been fixed. 
+- The Datadog Agent source has been fixed to preserve the `device` tag from Datadog API v2 resources. + +--- + ## Worker version 2.15.0 To upgrade to Worker version 2.15.0: @@ -459,3 +588,8 @@ Worker version 2.7.0 gives you access to the following: [40]: https://hub.docker.com/r/datadog/observability-pipelines-worker/tags?name=2.15.0 [41]: https://yaml.org/spec/1.1/ [42]: https://hub.docker.com/r/datadog/observability-pipelines-worker/tags?name=2.14.1 +[43]: https://hub.docker.com/r/datadog/observability-pipelines-worker/tags?name=2.15.1 +[44]: https://hub.docker.com/r/datadog/observability-pipelines-worker/tags?name=2.16.0 +[45]: https://hub.docker.com/r/datadog/observability-pipelines-worker/tags?name=2.16.1 +[46]: https://hub.docker.com/r/datadog/observability-pipelines-worker/tags?name=2.17.0 +[47]: /observability_pipelines/destinations/databricks/ From e5a0b157b147ea408c71a9af052331b3569849eb Mon Sep 17 00:00:00 2001 From: May Lee Date: Mon, 15 Jun 2026 14:55:57 -0400 Subject: [PATCH 2/6] Update content/en/observability_pipelines/guide/upgrade_worker.md --- content/en/observability_pipelines/guide/upgrade_worker.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/observability_pipelines/guide/upgrade_worker.md b/content/en/observability_pipelines/guide/upgrade_worker.md index 412bb83deb7..3e8fd2ac2bb 100644 --- a/content/en/observability_pipelines/guide/upgrade_worker.md +++ b/content/en/observability_pipelines/guide/upgrade_worker.md @@ -86,7 +86,7 @@ Worker version 2.16.0 gives you access to the following: - [Databricks Zerobus][47] destination: Send log data to Databricks Unity Catalog tables using the Zerobus ingestion service. The Databricks destination supports OAuth 2.0 authentication, automatic schema fetching from Unity Catalog, and protobuf batch encoding. - The Splunk HEC source now accepts an optional `valid_tokens` list for token-based authentication managed using environment variables or a configured secrets backend. 
-- The Splunk HEC source now supports enriching incoming log events using VRL decoder. +- The Splunk HEC source now supports enriching incoming log events using the VRL decoder. - The Amazon S3 destination now supports Apache Parquet batch encoding with flexible schema definitions and configurable compression (Snappy, ZSTD, GZIP, LZ4, or none). - For the Custom Processor: - The `encode_proto` function now accepts an `allow_lossy_string_coercion` argument. From 37f8b284801a71b0592d53ad13638ada2cf7c2db Mon Sep 17 00:00:00 2001 From: May Lee Date: Mon, 15 Jun 2026 15:15:08 -0400 Subject: [PATCH 3/6] apply suggestions --- .../guide/upgrade_worker.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/content/en/observability_pipelines/guide/upgrade_worker.md b/content/en/observability_pipelines/guide/upgrade_worker.md index 3e8fd2ac2bb..73b36fd28a7 100644 --- a/content/en/observability_pipelines/guide/upgrade_worker.md +++ b/content/en/observability_pipelines/guide/upgrade_worker.md @@ -44,9 +44,9 @@ Worker version 2.17.0 gives you access to the following: - HTTP metrics emitted by the Quota processor's background sync now have correct component tags (`component_kind:transform`, `component_type:quota`, `component_id:quota_global_state`). - The Enrichment Table processor using Reference Tables now skips sending empty event batches, preventing fatal errors with disk buffers. - The Generate Metrics processor now uses a static component ID so that associated metrics share the same `component_id` across workers and restarts. -- An issue with parsing filter queries with whitespaces inside parentheses, such as `service:( web OR api )`, has been fixed. -- An issue with Live Capture for sources with multiple named output ports, such as the OpenTelemetry source, has been fixed. -- An issue where a Worker crash can occur if a source or a processor sends an empty event batch to the next component has been fixed. 
+- A parsing filter queries issue with whitespace inside parentheses, such as `service:( web OR api )`, has been fixed. +- Live Capture now works correctly for sources with multiple named output ports, such as the OpenTelemetry source, has been fixed. +- An issue where a Worker crash could occur if a source or a processor sends an empty event batch to the next component has been fixed. - For the Custom Processor: - Error messages and unused variable diagnostics have been fixed. The processor now reports every unhandled error in a single compilation. - You can now use `SCREAMING_SNAKE` case in functions such as `pascalcase` and `camelcase`. @@ -90,7 +90,7 @@ Worker version 2.16.0 gives you access to the following: - The Amazon S3 destination now supports Apache Parquet batch encoding with flexible schema definitions and configurable compression (Snappy, ZSTD, GZIP, LZ4, or none). - For the Custom Processor: - The `encode_proto` function now accepts an `allow_lossy_string_coercion` argument. - - Protobuf encoding now coerces compatible scalar types into the target field type: integers and strings are accepted for `bool` fields, and integers are accepted for `float`/`double` fields. + - Protobuf encoding now coerces compatible scalar types into the target field type: integers and strings are accepted for `bool` fields, and integers are accepted for `float` or `double` fields. #### Enhancements @@ -105,12 +105,12 @@ Worker version 2.16.0 gives you access to the following: #### Fixes - A race condition in the Reference Tables processor has been fixed to prevent dropping buffered events during a Worker shutdown. -- Issues that have been fixed for Live Capture: - - Captured events in Live Capture were dropped when too many events were sent at the same time. - - Metrics events sent by the Generate Metrics processors were dropped when there was a log processor after the Generate Metrics processor. 
+- The follow Live Capture issues have been fixed: + - Live Capture events were dropped when too many events were sent at the same time. + - Metrics events sent by the Generate Metrics processors were dropped in Live Capture when there was a log processor after the Generate Metrics processor. - The Worker no longer logs `Root metadata expired` or `potential freeze attack` on startup after refreshing embedded Remote Config trusted-root metadata. - The Splunk HEC source now emits `authentication_failed` as the `error_type` in error logs and metrics when authentication fails due to a missing or invalid authorization header. -- Fixed the Datadog Logs destination healthcheck endpoint computation to preserve site prefixes, such as `us3.`, `us5.`, `ap1.`, when deriving the API URL from intake endpoints. +- Fixed the Datadog Logs destination health check endpoint computation to preserve site prefixes, such as `us3.`, `us5.`, `ap1.`, when deriving the API URL from intake endpoints. - An issue where a destination with a configured disk buffer would stall for `batch.timeout_sec` before gracefully reloading has been fixed. This fix also resolves cases where the Worker ignored SIGINT during a pipeline stall. - Fixed the Custom Processor so an `else` or `else if` keyword can be on a new line after the closing curly brace (`}`) of an `if`-block. 
From d6ac4e9919394e6c8b5fe505a8c7ded2ae2d9e9a Mon Sep 17 00:00:00 2001 From: May Lee Date: Mon, 15 Jun 2026 15:20:36 -0400 Subject: [PATCH 4/6] small edits --- content/en/observability_pipelines/guide/upgrade_worker.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/en/observability_pipelines/guide/upgrade_worker.md b/content/en/observability_pipelines/guide/upgrade_worker.md index 73b36fd28a7..fa2bece8c1e 100644 --- a/content/en/observability_pipelines/guide/upgrade_worker.md +++ b/content/en/observability_pipelines/guide/upgrade_worker.md @@ -44,8 +44,8 @@ Worker version 2.17.0 gives you access to the following: - HTTP metrics emitted by the Quota processor's background sync now have correct component tags (`component_kind:transform`, `component_type:quota`, `component_id:quota_global_state`). - The Enrichment Table processor using Reference Tables now skips sending empty event batches, preventing fatal errors with disk buffers. - The Generate Metrics processor now uses a static component ID so that associated metrics share the same `component_id` across workers and restarts. -- A parsing filter queries issue with whitespace inside parentheses, such as `service:( web OR api )`, has been fixed. -- Live Capture now works correctly for sources with multiple named output ports, such as the OpenTelemetry source, has been fixed. +- An issue with parsing filter queries that contain whitespace inside parentheses, such as `service:( web OR api )`, has been fixed. +- Live Capture now works correctly for sources with multiple named output ports, such as the OpenTelemetry source. - An issue where a Worker crash could occur if a source or a processor sends an empty event batch to the next component has been fixed. - For the Custom Processor: - Error messages and unused variable diagnostics have been fixed. The processor now reports every unhandled error in a single compilation. 
@@ -105,7 +105,7 @@ Worker version 2.16.0 gives you access to the following: #### Fixes - A race condition in the Reference Tables processor has been fixed to prevent dropping buffered events during a Worker shutdown. -- The follow Live Capture issues have been fixed: +- The following Live Capture issues have been fixed: - Live Capture events were dropped when too many events were sent at the same time. - Metrics events sent by the Generate Metrics processors were dropped in Live Capture when there was a log processor after the Generate Metrics processor. - The Worker no longer logs `Root metadata expired` or `potential freeze attack` on startup after refreshing embedded Remote Config trusted-root metadata. From e74dc7367dc09ea3ad42cfd7f856bb0306ef51f5 Mon Sep 17 00:00:00 2001 From: May Lee Date: Mon, 15 Jun 2026 15:21:39 -0400 Subject: [PATCH 5/6] one more small edit --- content/en/observability_pipelines/guide/upgrade_worker.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/observability_pipelines/guide/upgrade_worker.md b/content/en/observability_pipelines/guide/upgrade_worker.md index fa2bece8c1e..d95fea4e7eb 100644 --- a/content/en/observability_pipelines/guide/upgrade_worker.md +++ b/content/en/observability_pipelines/guide/upgrade_worker.md @@ -107,7 +107,7 @@ Worker version 2.16.0 gives you access to the following: - A race condition in the Reference Tables processor has been fixed to prevent dropping buffered events during a Worker shutdown. - The following Live Capture issues have been fixed: - Live Capture events were dropped when too many events were sent at the same time. - - Metrics events sent by the Generate Metrics processors were dropped in Live Capture when there was a log processor after the Generate Metrics processor. + - Metrics events sent by the Generate Metrics processor were dropped in Live Capture when there was a log processor after the Generate Metrics processor. 
- The Worker no longer logs `Root metadata expired` or `potential freeze attack` on startup after refreshing embedded Remote Config trusted-root metadata. - The Splunk HEC source now emits `authentication_failed` as the `error_type` in error logs and metrics when authentication fails due to a missing or invalid authorization header. - Fixed the Datadog Logs destination health check endpoint computation to preserve site prefixes, such as `us3.`, `us5.`, `ap1.`, when deriving the API URL from intake endpoints. From c542cd5cd54b156648fdedf2ca7a35d59e1dcc81 Mon Sep 17 00:00:00 2001 From: May Lee Date: Mon, 15 Jun 2026 17:18:51 -0400 Subject: [PATCH 6/6] Update content/en/observability_pipelines/guide/upgrade_worker.md --- content/en/observability_pipelines/guide/upgrade_worker.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/en/observability_pipelines/guide/upgrade_worker.md b/content/en/observability_pipelines/guide/upgrade_worker.md index d95fea4e7eb..617100c7f66 100644 --- a/content/en/observability_pipelines/guide/upgrade_worker.md +++ b/content/en/observability_pipelines/guide/upgrade_worker.md @@ -99,8 +99,8 @@ Worker version 2.16.0 gives you access to the following: - The Reference Tables processor's buffer now emits `buffer_size_events` and `buffer_size_bytes` gauge metrics to replace the deprecated `buffer_events` and `buffer_byte_size` metrics. The deprecated metrics are still being sent for backward compatibility. - The Datadog Logs destination now supports overrides for per-site logs endpoint using the bootstrap `logs-sites` field or `DD_OP_LOGS_` environment variable, such as `DD_OP_LOGS_US1`. Environment variables take precedence over the bootstrap file. 
- Sources now record these distribution metrics: - - `source_send_latency_seconds`: Time spent blocking on a single events chunk send operation - - `source_send_batch_latency_seconds`: Encompassing all chunks within a received events batch + - `source_send_latency_seconds`: The time it takes for the source to send a chunk of events to the next component. + - `source_send_batch_latency_seconds`: The time it takes for the source to send a batch, which can contain multiple event chunks, to the next component. #### Fixes