@JeanMertz
Contributor

Hey all 👋

For a project I am working on, I needed a transform similar to the exec source: one that converts events in a low-throughput environment using an external command-line tool. I figured you might be interested in this new transform. If not, I'm happy to keep using it in my own fork.

Summary

Add a new transform that pipes events through external processes via stdin/stdout, enabling integration with any command-line tool.

Supports three operating modes:

  • Streaming: long-running process receiving continuous event flow
  • Scheduled: periodic batch execution on a configurable interval
  • Per Event: spawn a new process for each event with concurrency control
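
To make the Per Event mode concrete, here is a minimal sketch using only the Rust standard library. It is not the code in this PR: the names `run_per_event` and `run_all`, and the chunk-based way of limiting concurrency, are assumptions made for illustration. Each event gets its own child process, the event is written to the child's stdin, and whatever the child prints on stdout becomes the result.

```rust
use std::io::Write;
use std::process::{Command, Stdio};
use std::thread;

/// Hypothetical helper: run `command` once for a single event.
/// Writes the event to the child's stdin, closes stdin, and returns stdout.
/// Panics if `command` is empty. Suitable for small events only: a large
/// event could fill the pipe buffers before the child's output is read.
fn run_per_event(command: &[&str], event: &str) -> std::io::Result<String> {
    let mut child = Command::new(command[0])
        .args(&command[1..])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    {
        // Dropping the handle at the end of this block closes the pipe,
        // which signals end-of-input to the child.
        let mut stdin = child.stdin.take().expect("stdin was piped");
        stdin.write_all(event.as_bytes())?;
    }

    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Hypothetical helper: process events with at most `limit` child
/// processes alive at once, by working through the input in chunks.
/// Results are returned in the same order as the input events.
fn run_all(command: &[&str], events: &[&str], limit: usize) -> Vec<std::io::Result<String>> {
    let mut results = Vec::with_capacity(events.len());
    for chunk in events.chunks(limit.max(1)) {
        thread::scope(|scope| {
            let handles: Vec<_> = chunk
                .iter()
                .map(|event| scope.spawn(move || run_per_event(command, event)))
                .collect();
            for handle in handles {
                results.push(handle.join().expect("worker thread panicked"));
            }
        });
    }
    results
}
```

Calling `run_all(&["cat"], &["a\n", "b\n"], 2)` would echo each event back through its own `cat` process. Chunking is the simplest way to bound concurrency in a sketch, and a real implementation would more likely use a semaphore so that a slow process does not hold up the rest of its chunk.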

This transform is modeled after the exec source, but with some differences:

  • It adds the Per Event run mode, which spawns a new process for each event with concurrency control.
  • It supports optional capture of stderr output.
  • It allows configuring the stdout and stderr decoding, and supports framing and log namespaces.
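
The stderr capture and framing points can be illustrated with another standard-library sketch. Again, this is an assumption-laden example rather than the transform's implementation: `Captured`, `run_and_capture`, and the choice of newline-delimited framing are all hypothetical. The idea is that stdout is split into frames that would each be decoded into an event, while stderr is either collected separately or discarded.

```rust
use std::process::{Command, Stdio};

/// Hypothetical result type: stdout split into frames, plus stderr
/// when capturing was requested.
#[derive(Debug)]
struct Captured {
    frames: Vec<String>,
    stderr: Option<String>,
}

/// Hypothetical helper: run `command` to completion and collect its output.
/// Panics if `command` is empty.
fn run_and_capture(command: &[&str], capture_stderr: bool) -> std::io::Result<Captured> {
    // Only attach a pipe to stderr when the caller wants it captured.
    let stderr_cfg = if capture_stderr { Stdio::piped() } else { Stdio::null() };

    let output = Command::new(command[0])
        .args(&command[1..])
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(stderr_cfg)
        .output()?;

    // Newline-delimited framing: every non-empty line is one frame.
    let frames = String::from_utf8_lossy(&output.stdout)
        .lines()
        .filter(|line| !line.is_empty())
        .map(str::to_owned)
        .collect();

    let stderr = capture_stderr
        .then(|| String::from_utf8_lossy(&output.stderr).into_owned());

    Ok(Captured { frames, stderr })
}
```

For example, `run_and_capture(&["sh", "-c", "echo one; echo oops >&2"], true)` would yield one frame from stdout and the stderr text as a separate field, which keeps diagnostic output from being mistaken for event data.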

The implementation tries to be efficient, correct and resilient, and comes with both integration and mocked unit tests to cover most edge cases.

Vector configuration

schema:
  log_namespace: true
sources:
  in:
    type: "stdin"
    decoding:
      codec: "json"
transforms:
  stdio:
    inputs:
      - "in"
    type: "stdio"
    command: ["jq", "-c", ".foo"]
    mode: "per_event"
    stdout:
      decoding:
        codec: "json"
sinks:
  out:
    inputs:
      - "stdio"
    type: "console"
    encoding:
      codec: "json"

How did you test this PR?

$ echo '{"foo":true}' | vector -qqqq -c vector.yaml
true
$ cargo test --lib --no-default-features --features transforms-stdio transforms::stdio
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.37s
     Running unittests src/lib.rs (target/debug/deps/vector-425e5266330c4b61)

running 13 tests
test transforms::stdio::tests::test_streaming_fatal_error_no_retry ... ok
test transforms::stdio::transform::tests::test_config_validation ... ok
test transforms::stdio::tests::test_scheduled_flush_on_shutdown ... ok
test transforms::stdio::tests::test_scheduled_buffer_overflow ... ok
test transforms::stdio::tests::test_stdout_success ... ok
test transforms::stdio::tests::test_stderr_ignored ... ok
test transforms::stdio::tests::test_streaming_respawn ... ok
test transforms::stdio::tests::test_decode_error ... ok
test transforms::stdio::tests::test_per_event_concurrency ... ok
test transforms::stdio::tests::test_stderr_handling ... ok
test transforms::stdio::tests::test_per_event_concurrency_limit ... ok
test transforms::stdio::tests::test_integration ... ok
test transforms::stdio::tests::test_scheduled_process_timeout ... ok

test result: ok. 13 passed; 0 failed; 0 ignored; 0 measured; 276 filtered out; finished in 0.40s

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Signed-off-by: Jean Mertz <git@jeanmertz.com>
@JeanMertz JeanMertz requested a review from a team as a code owner January 26, 2026 23:09
@github-actions github-actions bot added the domain: transforms Anything related to Vector's transform components label Jan 26, 2026