[FLINK-39438][MaxCompute] Add support for sink operation types in MaxCompute options #4377
Conversation
Pull request overview
Adds an explicit MaxCompute sink option to control whether the sink operates in append or upsert mode, decoupling some behavior from MaxCompute table type.
Changes:
- Introduce a `sink.operation` configuration (append/upsert) wired through `MaxComputeDataSinkFactory` into `MaxComputeOptions`.
- Adjust writer selection and table-creation behavior to honor the configured sink operation.
- Add an emulator-based E2E test to verify table creation in APPEND mode does not create transactional / PK metadata.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| .../utils/SchemaEvolutionUtilsTest.java | Adds E2E test covering append-mode table creation behavior. |
| .../EmulatorTestBase.java | Adds appendOptions test config using the new sink operation option. |
| .../writer/MaxComputeWriter.java | Uses sinkOperation to decide between upsert vs append writer. |
| .../writer/BatchAppendWriter.java | Avoids constructing PartitionSpec when partition is null/empty. |
| .../utils/SchemaEvolutionUtils.java | Creates transactional table with PKs only when in UPSERT mode. |
| .../options/MaxComputeOptions.java | Adds SinkOperation option to MaxComputeOptions (+ builder + enum). |
| .../MaxComputeDataSinkOptions.java | Defines the new sink.operation ConfigOption. |
| .../MaxComputeDataSinkFactory.java | Reads sink.operation from config and passes it into options. |
Comments suppressed due to low confidence (2)
flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-maxcompute/src/main/java/org/apache/flink/cdc/connectors/maxcompute/utils/SchemaEvolutionUtils.java:94
- When `sinkOperation` is `UPSERT` but the provided schema has no primary keys, the table will be created as non-transactional and the sink will effectively behave as append. Consider validating this combination and throwing a clear exception (or otherwise enforcing that UPSERT requires primary keys) to avoid silent misconfiguration.
if (!CollectionUtil.isNullOrEmpty(schema.primaryKeys())
&& options.getSinkOperation() == MaxComputeOptions.SinkOperation.UPSERT) {
tableCreator
.transactionTable()
.withBucketNum(options.getBucketsNum())
.withPrimaryKeys(schema.primaryKeys());
}
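A minimal sketch of the validation this comment suggests, reusing the helpers already visible above; the exception type, the message wording, and the assumption that `tableId` is in scope are mine, not part of the PR:

// Hypothetical guard: refuse UPSERT mode when the schema defines no primary keys.
if (options.getSinkOperation() == MaxComputeOptions.SinkOperation.UPSERT
        && CollectionUtil.isNullOrEmpty(schema.primaryKeys())) {
    throw new IllegalArgumentException(
            "sink.operation is UPSERT but table " + tableId + " has no primary keys; "
                    + "declare primary keys in the schema or set sink.operation to APPEND.");
}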
flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-maxcompute/src/main/java/org/apache/flink/cdc/connectors/maxcompute/writer/MaxComputeWriter.java:43
- In UPSERT mode, this falls back to `BatchAppendWriter` when the target table is non-transactional, which silently ignores update/delete semantics and doesn't truly "respect the configured option". Consider failing fast when `sinkOperation` is `UPSERT` but `isTransactionalTable(...)` is false (with an actionable error telling users to create a transactional table or switch to APPEND).
if (MaxComputeUtils.isTransactionalTable(options, sessionIdentifier)
&& options.getSinkOperation() == MaxComputeOptions.SinkOperation.UPSERT) {
return new BatchUpsertWriter(options, writeOptions, sessionIdentifier);
} else {
return new BatchAppendWriter(options, writeOptions, sessionIdentifier);
}
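A hedged sketch of the fail-fast behavior described here; the exception type and message text are assumptions, and only the calls already shown above are reused:

if (options.getSinkOperation() == MaxComputeOptions.SinkOperation.UPSERT) {
    if (!MaxComputeUtils.isTransactionalTable(options, sessionIdentifier)) {
        // Hypothetical fast failure instead of silently downgrading to append semantics.
        throw new IllegalStateException(
                "sink.operation is UPSERT but the target table is not transactional; "
                        + "recreate it as a transactional table or switch sink.operation to APPEND.");
    }
    return new BatchUpsertWriter(options, writeOptions, sessionIdentifier);
}
return new BatchAppendWriter(options, writeOptions, sessionIdentifier);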
void testCreateTableInAppendMode() {
    try {
        String appendTable = "SCHEMA_EVOLUTION_APPEND_TABLE";
        SchemaEvolutionUtils.createTable(
                appendOptions,
                TableId.tableId(appendTable),
                Schema.newBuilder()
                        .physicalColumn("PK", DataTypes.BIGINT())
                        .physicalColumn("ID1", DataTypes.BIGINT())
                        .primaryKey("PK")
                        .build());
        // In APPEND mode the table should NOT be created as a transactional table,
        // so primary key metadata should be absent even though the schema defines one.
        assertThat(odpsInstance.tables().get(appendTable).getPrimaryKey()).isEmpty();
        odpsInstance.tables().delete(appendTable, true);
        .defaultValue(4)
        .withDescription("The number of concurrent with flush bucket data.");

public static final ConfigOption<MaxComputeOptions.SinkOperation> SINK_OPERATION =
Please add documentation for the newly introduced sink.operation option. This should explain:
- upsert (default): Requires primary keys in schema. Creates a transactional table and supports update/delete semantics.
- append: Creates a regular table regardless of primary keys. Only supports insert operations, suitable for append-only scenarios.
This helps users understand the configuration impact and choose the appropriate mode for their use case.
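One possible shape for that documentation, sketched with the ConfigOptions builder pattern; the exact key, default value, and description wording are assumptions based on this thread, not the merged code:

public static final ConfigOption<MaxComputeOptions.SinkOperation> SINK_OPERATION =
        ConfigOptions.key("sink.operation")
                .enumType(MaxComputeOptions.SinkOperation.class)
                .defaultValue(MaxComputeOptions.SinkOperation.UPSERT)
                .withDescription(
                        "How the sink writes data. UPSERT (default) requires primary keys, creates a "
                                + "transactional table and supports update/delete semantics. APPEND creates "
                                + "a regular table regardless of primary keys and only supports inserts.");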
    return value;
}

public static SinkOperation fromValue(String value) {
This method is unused and could be removed.
    this.value = value;
}

public String getValue() {
This method is unused and could be removed.
Problem
Expected Behavior
Proposal
Example YAML Configuration:
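A hedged example of where the new option would appear in a pipeline definition; all source and connection values are placeholders, and only `sink.operation: append` comes from this PR:

source:
  type: mysql
  hostname: localhost          # placeholder
  port: 3306
  username: app_user           # placeholder
  password: app_password       # placeholder
  tables: app_db.\.*

sink:
  type: maxcompute
  accessId: ${ACCESS_ID}       # placeholder credential
  accessKey: ${ACCESS_KEY}     # placeholder credential
  endpoint: ${ENDPOINT}        # placeholder endpoint
  project: example_project     # placeholder project
  sink.operation: append       # new option: append or upsert (default)

pipeline:
  name: MySQL to MaxCompute (append mode)
  parallelism: 1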