enhancement(kafka source): add multithreading option for kafka source #24550
+238
−75
Summary
This adds a new configuration parameter for the kafka source: multithreading.
When enabled, message parsing is run in separate tasks (limited by the max_message_handling_tasks configuration) to boost throughput. All results are processed in order, to ensure that acknowledgements are handled correctly.
To reduce the overhead of acknowledgements, which still caused issues even with multiple message processing threads, messages are now processed in chunks of up to CHUNK_SIZE (1000). This holds true even if multithreading is disabled. This might slightly change behavior: commits now happen per batch rather than on each individual message.
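To illustrate the idea described above, here is a minimal, standard-library-only sketch of chunked parsing with a bounded number of workers and order-preserving collection. It is not the code from this PR: it uses OS threads instead of async tasks, and `parse`, `parse_chunk_ordered`, `process`, and the `commit` callback are hypothetical names chosen for the example. Only `CHUNK_SIZE` (1000) and the notion of a task limit come from the description.

```rust
use std::thread;

/// Chunk size taken from the PR description; everything else is illustrative.
const CHUNK_SIZE: usize = 1000;

/// Hypothetical parser: stands in for whatever decoding the source performs.
fn parse(raw: &str) -> usize {
    raw.len()
}

/// Parse one chunk with at most `max_tasks` worker threads and return the
/// results in the same order as the input messages.
fn parse_chunk_ordered(chunk: &[String], max_tasks: usize) -> Vec<usize> {
    let max_tasks = max_tasks.max(1);
    // Split the chunk into contiguous slices, at most one per worker.
    let per_worker = ((chunk.len() + max_tasks - 1) / max_tasks).max(1);
    thread::scope(|s| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = chunk
            .chunks(per_worker)
            .map(|part| s.spawn(move || part.iter().map(|m| parse(m)).collect::<Vec<_>>()))
            .collect();
        // Joining in spawn order keeps results in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}

/// Process messages chunk by chunk. `commit` is invoked once per chunk with
/// the index of the last message in that chunk, not once per message.
fn process(messages: &[String], max_tasks: usize, mut commit: impl FnMut(usize)) -> Vec<usize> {
    let mut out = Vec::with_capacity(messages.len());
    for (i, chunk) in messages.chunks(CHUNK_SIZE).enumerate() {
        out.extend(parse_chunk_ordered(chunk, max_tasks));
        commit(i * CHUNK_SIZE + chunk.len() - 1);
    }
    out
}
```

In this sketch, 2,500 input messages would trigger three commits (after messages 999, 1999, and 2499), which mirrors the behavior change noted above: acknowledgements are issued per batch rather than per message.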
Vector configuration
How did you test this PR?
Ran the above configuration (based on #22958), using the producer from #22958, set to produce 300k events per second. When multithreading is disabled, the source processes ~190k messages per second (observed via vector top). When it is enabled (with the default of 4 tasks, though even a single task is an improvement), it runs at ~350k messages per second.
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changelog label to this PR.
References
Notes
@vectordotdev/vector to reach out to us regarding this PR.
pre-push hook, please see this template.
make fmt
make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
make test
git merge origin master and git push.
Cargo.lock), please run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.