Flink: SQL: Make Dynamic sink options to be configurable in SQL #15780
swapna267 wants to merge 4 commits into apache:main from
Conversation
```java
 * specific language governing permissions and limitations
 * under the License.
 */
package org.apache.iceberg.flink;
```
Should probably be in the dynamic package. Or should we create a config package?
```java
FlinkDynamicSinkConf flinkDynamicSinkConf =
    new FlinkDynamicSinkConf(writeProperties, flinkConfig);
```
Can we directly pass FlinkDynamicSinkConf to the constructor?
```java
writeOptions.put(
    FlinkDynamicSinkOptions.IMMEDIATE_TABLE_UPDATE.key(),
    Boolean.toString(newImmediateUpdate));
```
I'm not sure this should go into WriteOptions. I think it is better to have a separate config for DynamicSink options.
With all of them written into WriteOptions, it is easier to pass these configs from SQL by using setAll(Map<String, String> properties) for DynamicIcebergSink initialization.
If we separate them, we either need to handle it in setAll, or upstream users need to provide them separately.
Since the dynamic sink configs are scoped with the prefix `dynamic-sink`, should it be OK for them to go in the same map?
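The prefix-scoping argument above can be sketched in plain Java. This is a hypothetical illustration (the class, method, and key names are mine, not the PR's API) of how dynamic-sink options can share one map with regular write options and still be separated later by their prefix:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: dynamic-sink options share one flat map with write
// options, disambiguated by a "dynamic-sink." prefix, so a single
// setAll(Map<String, String>) call can carry both from SQL.
public class PrefixScopedOptions {
  static final String DYNAMIC_SINK_PREFIX = "dynamic-sink.";

  // Extract the entries under the given prefix, stripping the prefix from keys.
  static Map<String, String> extractPrefixed(Map<String, String> all, String prefix) {
    Map<String, String> out = new HashMap<>();
    for (Map.Entry<String, String> e : all.entrySet()) {
      if (e.getKey().startsWith(prefix)) {
        out.put(e.getKey().substring(prefix.length()), e.getValue());
      }
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, String> options = new HashMap<>();
    options.put("write-parallelism", "4");                      // regular write option
    options.put("dynamic-sink.immediate-table-update", "true"); // dynamic-sink scoped
    Map<String, String> dynamic = extractPrefixed(options, DYNAMIC_SINK_PREFIX);
    System.out.println(dynamic.get("immediate-table-update")); // prints "true"
  }
}
```

The prefix keeps the two option families from colliding even though they travel through the same map.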
Thinking about it again, it's probably ok.
```java
FlinkDynamicSinkConf flinkDynamicSinkConf =
    new FlinkDynamicSinkConf(writeProperties, flinkConfig);
```
Could we create the config only once and pass it to the constructor?
I did that and then removed it for consistency.
DynamicRecordProcessor needs FlinkDynamicSinkConf and also writeProperties/flinkConfig.
WriteProperties and FlinkConfig are required to create FlinkWriteConf in open(), as FlinkWriteConf is not serializable.
Maybe I can simply pass FlinkDynamicSinkConf along with writeProperties/flinkConfig.
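The serializability constraint described above follows a common Flink pattern: ship only serializable raw inputs with the operator and rebuild the non-serializable conf object in open(). A minimal stand-alone sketch, with placeholder names (LazyConfOperator and buildConf are mine, not the PR's code):

```java
import java.io.Serializable;
import java.util.Map;

// Hypothetical sketch: keep serializable raw inputs (write properties),
// derive the non-serializable conf lazily after deserialization, the way a
// Flink operator would in open().
public class LazyConfOperator implements Serializable {
  private final Map<String, String> writeProperties; // serializable input
  private transient Object flinkWriteConf;           // derived, not serializable

  public LazyConfOperator(Map<String, String> writeProperties) {
    this.writeProperties = writeProperties;
  }

  // Called on the task manager after deserialization, like Flink's open().
  public void open() {
    this.flinkWriteConf = buildConf(writeProperties);
  }

  private static Object buildConf(Map<String, String> props) {
    // Stand-in for `new FlinkWriteConf(writeProperties, flinkConfig)`.
    return "conf(" + props + ")";
  }

  public Object conf() {
    return flinkWriteConf;
  }

  public static void main(String[] args) {
    LazyConfOperator op = new LazyConfOperator(Map.of("write-parallelism", "4"));
    op.open(); // conf exists only after open()
    System.out.println(op.conf() != null);
  }
}
```

The `transient` field never crosses the serialization boundary; only the property maps do.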
```java
if (super.writeParallelism() != Integer.MAX_VALUE) {
  return super.writeParallelism();
}
```
Not sure about this logic. The default for writeParallelism is 0.
The DynamicRecord constructor has writeParallelism as a primitive int, and we basically use Integer.MAX_VALUE to fall back to another value, like the job parallelism.
```java
 * @param writeParallelism The number of parallel writers. Can be set to any value {@literal > 0},
 *     but will always be automatically capped by the maximum write parallelism, which is the
 *     parallelism of the sink. Set to Integer.MAX_VALUE for always using the maximum available
 *     write parallelism.
```
We have a similar issue with upsertMode, as it uses a primitive boolean.
True, the user supplies this value via the constructor. I initially thought that we were calling a non-existent empty constructor.
Regardless, I'm not sure about this change. The docs state that Integer.MAX_VALUE means using the max write parallelism, but we cap it at the writeParallelism supplied by the FlinkWriteConf; that does not seem right.
I think the only way to address this is to use a value like 0 to trigger loading from FlinkWriteConf.
If not overridden in FlinkWriteConf, we still use Integer.MAX_VALUE.
But I agree on falling back to the config when it is explicitly set to <= 0.
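The fallback rules discussed in this thread could be sketched as follows. This is a hypothetical resolution helper, not the PR's code: Integer.MAX_VALUE means "use the maximum available write parallelism", a non-positive value falls back to the configured parallelism, and any explicit value is capped by the max, as the Javadoc quoted above describes:

```java
// Hypothetical sketch of the sentinel-based fallback discussed above.
public class ParallelismResolver {
  static int resolve(int recordParallelism, int confParallelism, int maxParallelism) {
    if (recordParallelism == Integer.MAX_VALUE) {
      return maxParallelism;            // documented meaning: always use the max
    }
    if (recordParallelism <= 0) {
      return Math.min(confParallelism, maxParallelism); // fall back to FlinkWriteConf
    }
    return Math.min(recordParallelism, maxParallelism); // explicit value, capped
  }

  public static void main(String[] args) {
    System.out.println(resolve(Integer.MAX_VALUE, 4, 8)); // 8
    System.out.println(resolve(0, 4, 8));                 // 4
    System.out.println(resolve(16, 4, 8));                // 8
  }
}
```

Keeping the two sentinels (MAX_VALUE for "max", <= 0 for "load from config") distinct avoids the capping contradiction the reviewer points out.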
```java
 * have fields set.
 */
@Internal
class DynamicRecordWithDefaults extends DynamicRecord {
```
```diff
-class DynamicRecordWithDefaults extends DynamicRecord {
+class DynamicRecordWithConfig extends DynamicRecord {
```
Should we add a test to verify config handling?
```java
super(
    source.tableIdentifier(),
    source.branch(),
    source.schema(),
    source.rowData(),
    source.spec(),
    source.distributionMode(),
    source.writeParallelism());
```
Could we just resolve the adjusted arguments here? We wouldn't have to overwrite all the methods then. Seems cleaner to me.
```diff
 @Override
-public void collect(DynamicRecord data) {
+public void collect(DynamicRecord inputData) {
+  DynamicRecordWithDefaults data = new DynamicRecordWithDefaults(inputData, flinkWriteConf);
```
I don't like that we are creating another copy of the DynamicRecord here (DynamicRecordWithDefaults). Only a couple of lines later, we also create a DynamicRecordInternal. Could we simply resolve the parts of DynamicRecord which can be overridden? Only a subset of the values is actually overridable.
I started that way, but felt that approach was a little error-prone for the future, as configs of DynamicRecord need to be overridden in different places for different usages.
Write parallelism / distribution mode overrides need to be passed to the HashKeyGenerator.
And overrides like branch/upsertMode should be passed to DynamicRecordInternal.
So the approach would look like:
- Build HashKeyGenerator with defaults for write parallelism / distribution mode.
- Update all DynamicRecordInternal construction paths with defaults for branch.
Another option could be to incorporate this into DynamicRecord itself by adding setConfig(FlinkWriteConf flinkWriteConf).
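The trade-off above (a wrapper record vs. resolving the overridable fields once at the source) can be illustrated with a small stand-alone sketch; all class and field names here are placeholders, not the PR's types:

```java
// Hypothetical sketch: resolve unset overridable fields once, so every
// downstream consumer (hash key generator, internal record) sees the same
// resolved values without a wrapper subclass.
public class DefaultResolution {
  static final class Record {
    String branch;        // null means "not set on the record"
    int writeParallelism; // <= 0 means "not set on the record"
  }

  static final class Defaults {
    final String branch;
    final int writeParallelism;

    Defaults(String branch, int writeParallelism) {
      this.branch = branch;
      this.writeParallelism = writeParallelism;
    }
  }

  // Fill in only the fields the record left unset.
  static void applyDefaults(Record r, Defaults d) {
    if (r.branch == null) {
      r.branch = d.branch;
    }
    if (r.writeParallelism <= 0) {
      r.writeParallelism = d.writeParallelism;
    }
  }

  public static void main(String[] args) {
    Record r = new Record(); // branch unset, parallelism unset
    applyDefaults(r, new Defaults("main", 4));
    System.out.println(r.branch + " / " + r.writeParallelism); // main / 4
  }
}
```

The reviewer's concern is that a one-time resolution like this avoids allocating a second record per element; the author's counterpoint is that the defaults then have to be threaded through every construction path.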
@swapna267 I would like to get rid of
Support the following configs to be configurable from SQL for the dynamic sink.

Fall back to write properties or the Flink configuration if the following are not set on DynamicRecord:

- writeParallelism(int) → FlinkWriteOptions.WRITE_PARALLELISM
- distributionMode → FlinkWriteOptions.DISTRIBUTION_MODE
- toBranch(String) → FlinkWriteOptions.BRANCH

Provide options to configure the following behavior of the dynamic sink in SQL:

- cacheMaxSize(int)
- immediateTableUpdate(boolean)
- dropUnusedColumns(boolean)
- cacheRefreshMs(long)
- inputSchemasPerTableCacheMaxSize(int)
- caseSensitive(boolean)

More context here: #15471 (comment)
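As a rough illustration of how the listed options might be read from a flat properties map, here is a stand-alone sketch; the key strings are guesses derived from the option names above, not the PR's actual keys:

```java
import java.util.Map;

// Hypothetical sketch: typed lookups with defaults for dynamic-sink options
// carried in a flat String-to-String properties map. Key names are
// illustrative, derived from the option names in the PR description.
public class DynamicSinkOptionsSketch {
  static int intOption(Map<String, String> props, String key, int dflt) {
    String v = props.get(key);
    return v == null ? dflt : Integer.parseInt(v);
  }

  static boolean boolOption(Map<String, String> props, String key, boolean dflt) {
    String v = props.get(key);
    return v == null ? dflt : Boolean.parseBoolean(v);
  }

  public static void main(String[] args) {
    Map<String, String> props =
        Map.of(
            "dynamic-sink.cache-max-size", "200",
            "dynamic-sink.immediate-table-update", "false");
    System.out.println(intOption(props, "dynamic-sink.cache-max-size", 100));           // 200
    System.out.println(boolOption(props, "dynamic-sink.immediate-table-update", true)); // false
    System.out.println(boolOption(props, "dynamic-sink.case-sensitive", false));        // false (default)
  }
}
```

In the real connector these would be ConfigOption definitions rather than ad-hoc parsing, but the lookup-with-default shape is the same.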