@leonzchang
Which issue does this PR close?

What changes are included in this PR?

This change lets users reuse a builder instance, without cloning it, when creating multiple writers with the same configuration.

The build function no longer consumes self in the following builders:

  • IcebergWriterBuilder
  • RollingFileWriterBuilder
  • FileWriterBuilder
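To make the effect of the change concrete, here is a minimal sketch of a builder whose build method borrows self. The `Writer` and `WriterBuilder` types and the `build_two_writers` helper are hypothetical, synchronous stand-ins and not the actual iceberg-rust definitions; the real traits listed above are more involved.

```rust
// Hypothetical, simplified stand-ins; not the real iceberg-rust types.
#[derive(Debug, PartialEq)]
pub struct Writer {
    pub prefix: String,
    pub target_size: usize,
}

pub struct WriterBuilder {
    prefix: String,
    target_size: usize,
}

impl WriterBuilder {
    pub fn new(prefix: &str, target_size: usize) -> Self {
        Self {
            prefix: prefix.to_string(),
            target_size,
        }
    }

    // Borrowing `&self` leaves the builder usable after the call.
    // A consuming `build(self)` would force callers to clone the builder first.
    pub fn build(&self) -> Writer {
        Writer {
            prefix: self.prefix.clone(),
            target_size: self.target_size,
        }
    }
}

// Builds two writers from one builder without cloning the builder itself.
pub fn build_two_writers() -> (Writer, Writer) {
    let builder = WriterBuilder::new("data", 1024);
    let first = builder.build();
    let second = builder.build();
    (first, second)
}
```

With a consuming signature, the second `builder.build()` call would not compile unless the caller cloned the builder beforehand.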

Are these changes tested?

@leonzchang leonzchang moved this to In Review ⏳ in Open Source Nov 29, 2025
Collaborator

@CTTY CTTY left a comment

Hi @leonzchang, thanks for the contribution! LGTM in general.

There are also some clone() usages in the writers under the partitioning module, like this.

Could you help remove them as well in this PR?

- F: FileNameGenerator,
+ B: FileWriterBuilder + Sync,
+ L: LocationGenerator + Sync,
+ F: FileNameGenerator + Sync,
Collaborator

I think we should just add Sync as a supertrait of these traits, so that it is enforced across all implementations.
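A rough sketch of what the supertrait suggestion buys, using a hypothetical `NameGenerator` trait and `PrefixGenerator` type rather than the real iceberg-rust traits: once Sync is part of the trait definition, generic code that shares a reference across threads no longer needs a `+ Sync` bound at each use site.

```rust
use std::thread;

// Hypothetical trait for illustration; `Sync` is a supertrait, so every
// implementor must be safe to share by reference across threads.
pub trait NameGenerator: Send + Sync + 'static {
    fn generate(&self, index: usize) -> String;
}

pub struct PrefixGenerator {
    pub prefix: String,
}

impl NameGenerator for PrefixGenerator {
    fn generate(&self, index: usize) -> String {
        format!("{}-{:05}.parquet", self.prefix, index)
    }
}

// No `G: NameGenerator + Sync` is needed here: the supertrait already implies it.
pub fn generate_in_parallel<G: NameGenerator>(generator: &G, count: usize) -> Vec<String> {
    thread::scope(|scope| {
        // Spawn one scoped thread per index; each borrows the same generator.
        let handles: Vec<_> = (0..count)
            .map(|i| scope.spawn(move || generator.generate(i)))
            .collect();
        // Join in spawn order so the output order matches the indices.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

If `Sync` were dropped from the trait definition, `generate_in_parallel` would have to spell out `G: NameGenerator + Sync` itself, which is the per-site repetition visible in the diff above.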

Contributor

Also, we should remove the Clone from the trait implementations in the builders.

Author

Thanks for the feedback, updated in 967c155.

- F: FileNameGenerator,
+ B: FileWriterBuilder + Sync,
+ L: LocationGenerator + Sync,
+ F: FileNameGenerator + Sync,
Collaborator

Same here

  //!
  //! #[async_trait::async_trait]
- //! impl<B: IcebergWriterBuilder> IcebergWriterBuilder for LatencyRecordWriterBuilder<B> {
+ //! impl<B: IcebergWriterBuilder + Sync> IcebergWriterBuilder for LatencyRecordWriterBuilder<B> {
Collaborator

Is there any reason why we don't add Sync as a supertrait here?

Author

@CTTY This was the missing part. Thank you!
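For the wrapper-builder case discussed in this thread, the following sketch shows how a Sync supertrait removes the need for `+ Sync` on the wrapper's impl. All names here (`WriterBuilder`, `VecWriterBuilder`, `CountingWriter`, `CountingWriterBuilder`) are hypothetical, and the trait is synchronous for brevity, whereas the real IcebergWriterBuilder is async.

```rust
// Hypothetical, synchronous stand-ins for illustration only.
pub trait WriterBuilder: Send + Sync + 'static {
    type Writer;
    fn build(&self) -> Self::Writer;
}

pub struct VecWriterBuilder {
    pub capacity: usize,
}

impl WriterBuilder for VecWriterBuilder {
    type Writer = Vec<u8>;

    fn build(&self) -> Vec<u8> {
        Vec::with_capacity(self.capacity)
    }
}

// Wrapper writer that carries a write counter next to the inner writer.
pub struct CountingWriter<W> {
    pub inner: W,
    pub writes: usize,
}

pub struct CountingWriterBuilder<B> {
    inner: B,
}

impl<B> CountingWriterBuilder<B> {
    pub fn new(inner: B) -> Self {
        Self { inner }
    }
}

// Only `B: WriterBuilder` is required; `Sync` comes from the supertrait,
// so the wrapper automatically satisfies the trait's own `Sync` bound.
impl<B: WriterBuilder> WriterBuilder for CountingWriterBuilder<B> {
    type Writer = CountingWriter<B::Writer>;

    fn build(&self) -> Self::Writer {
        CountingWriter {
            inner: self.inner.build(),
            writes: 0,
        }
    }
}
```

Because `build` borrows both the wrapper and the inner builder, one `CountingWriterBuilder` can produce any number of writers.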

@leonzchang leonzchang requested a review from CTTY December 3, 2025 08:38

  /// File writer builder trait.
- pub trait FileWriterBuilder<O = DefaultOutput>: Send + Clone + 'static {
+ pub trait FileWriterBuilder<O = DefaultOutput>: Clone + Send + Sync + 'static {
Contributor

Why do we need to keep this Clone?


  /// `FileNameGeneratorTrait` used to generate file name for data file. The file name can be passed to `LocationGenerator` to generate the location of the file.
- pub trait FileNameGenerator: Clone + Send + 'static {
+ pub trait FileNameGenerator: Clone + Send + Sync + 'static {
Contributor

Ditto.


  /// `LocationGenerator` used to generate the location of data file.
- pub trait LocationGenerator: Clone + Send + 'static {
+ pub trait LocationGenerator: Clone + Send + Sync + 'static {
Contributor

Ditto.

Development

Successfully merging this pull request may close these issues.

Do not consume self in Writer Builders

3 participants