feat(enrichment tables): add bloom filter to memory table #25154

Draft
esensar wants to merge 6 commits into vectordotdev:master from esensar:feature/memory-enrichment-table-bloom-filter

Conversation

esensar (Contributor) commented Apr 9, 2026

Summary

This adds support for bloom filters in memory enrichment tables, similar to the cuckoo filter, but
simpler and with fewer features (no TTL, no deletion).
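
To illustrate why a bloom filter cannot offer deletion or per-entry TTL, here is a minimal Python sketch of the general technique. It is not the implementation in this PR: the hash scheme, the sizing formulas, and the `false_positive_rate` parameter are assumptions made for illustration, and only `max_entries` corresponds to an option in the configuration below.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative bloom filter; not the implementation used in this PR."""

    def __init__(self, max_entries: int, false_positive_rate: float = 0.01):
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        # false_positive_rate is a hypothetical parameter, not a Vector option.
        self.num_bits = max(
            1, math.ceil(-max_entries * math.log(false_positive_rate) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / max_entries * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: str):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def insert(self, key: str) -> None:
        # Setting bits is the only mutation. Bits are shared between keys,
        # so clearing them to "delete" one key could hide other keys.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, key: str) -> bool:
        # True means "possibly present" (false positives can occur);
        # False means "definitely not inserted".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )
```

Under these assumptions, `BloomFilter(max_entries=100000)` allocates roughly 120 KB of bits regardless of key length. Because the structure stores no keys and no timestamps, there is nothing to expire individually, which is consistent with the "no TTL, no deletion" limitation stated above.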

Vector configuration

enrichment_tables:
  bloom_table:
    type: memory
    ttl: 60
    flush_interval: 5
    scan_interval: 10
    inputs: ["bloom_generator"]
    filter:
      type: bloom
      max_entries: 100000

sources:
  data_for_table:
    type: file
    include: ["/etc/vector/vector-bloom-memory-example-input.jsonl"]

  stdin_data:
    type: stdin

transforms:
  bloom_reader:
    type: "remap"
    inputs: ["stdin_data"]
    source: |
      key = .message

      existing, err = get_enrichment_table_record("bloom_table", { "key": key })

      if err == null {
        . = existing
      } else {
        .message = "Key not found"
      }

  bloom_generator:
    type: "remap"
    inputs: ["data_for_table"]
    source: |
      data = parse_json!(.message)
      . = set!(value: {}, path: [get!(data, path: ["key"])], data: {})

sinks:
  console:
    inputs: ["bloom_reader"]
    target: "stdout"
    type: "console"
    encoding:
      codec: "json"

How did you test this PR?

Ran the above configuration and checked for keys through the stdin source. Also added some tests.
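
The contents of the input file are not shown in this PR. Since `bloom_generator` parses each line as JSON and reads its `key` field, a file of the following shape should work. The helper name `write_example_input` and the sample keys are hypothetical.

```python
import json
from pathlib import Path


def write_example_input(path, keys):
    """Write one JSON object per line, each with the "key" field that bloom_generator reads."""
    with Path(path).open("w", encoding="utf-8") as f:
        for key in keys:
            # Each line looks like {"key": "alpha"}.
            f.write(json.dumps({"key": key}) + "\n")
```

For example, `write_example_input("/etc/vector/vector-bloom-memory-example-input.jsonl", ["alpha", "beta"])` would populate the file that the `data_for_table` source reads. Typing `alpha` or an unknown key on stdin then exercises both branches of `bloom_reader`.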

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.

Sponsored by Quad9

esensar added 5 commits April 8, 2026 17:32
This adds support for cuckoo filters in memory enrichment tables, to support use cases
where only presence of a key needs to be checked and false positives are acceptable, greatly
improving memory usage compared to regular memory tables.
This adds support for bloom filters in memory enrichment tables, similar to the cuckoo filter, but
simpler and with fewer features (no TTL, no deletion).
esensar (Contributor, Author) commented Apr 9, 2026

I will rebase this to master after #25143 gets merged, so this is a draft for now.

github-actions bot added the "domain: external docs" label (Anything related to Vector's external, public documentation) on Apr 9, 2026

Labels

domain: external docs Anything related to Vector's external, public documentation
