[AURON #2325] Fix Auron Kafka source emitting no records without a watermark #2326

Open
weiqingy wants to merge 1 commit into apache:master from weiqingy:AURON-2325-impl

Conversation

@weiqingy

Contributor

Which issue does this PR close?

Closes #2325

Rationale for this change

AuronKafkaSourceFunction set its isRunning flag to true only inside the if (watermarkStrategy != null) branch of open(), but run() collects records only while isRunning is true on both the watermark and no-watermark paths. A source configured without an event-time watermark therefore never started emitting and returned an empty result set. The gap was never caught because the existing auron-kafka test tables all declare a watermark.

What changes are included in this PR?

isRunning is now set to true unconditionally once open() completes, so a no-watermark source emits records (and snapshots offsets and discovers partitions) like a watermarked one, while a partial-initialization failure still leaves the source not-running.

A no-watermark mock table and an end-to-end test asserting the source emits its records are added.
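
The lifecycle issue described above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the actual AuronKafkaSourceFunction code: the class `SourceLifecycleSketch`, its `fixed` and `failInit` switches, and the finite input list are all hypothetical, and the real run() polls Kafka rather than iterating over a list. It shows why setting the flag only in the watermark branch yields an empty result for a no-watermark source, and why setting it as the last step of open() keeps a failed initialization in the not-running state.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the source lifecycle; not the real class. */
public class SourceLifecycleSketch {
    private final Object watermarkStrategy; // null models a table without WATERMARK FOR
    private final boolean fixed;            // false = pre-fix behavior, true = post-fix
    private final boolean failInit;         // simulate an exception during initialization
    private volatile boolean isRunning = false;

    SourceLifecycleSketch(Object watermarkStrategy, boolean fixed, boolean failInit) {
        this.watermarkStrategy = watermarkStrategy;
        this.fixed = fixed;
        this.failInit = failInit;
    }

    void open() {
        if (failInit) {
            throw new IllegalStateException("simulated initialization failure");
        }
        if (watermarkStrategy != null) {
            // Watermark-specific setup would happen here.
            if (!fixed) {
                isRunning = true; // pre-fix: the flag was set only on this branch
            }
        }
        if (fixed) {
            isRunning = true; // post-fix: set unconditionally, as the last step
        }
    }

    /** Emits records only while isRunning is true, on both paths. */
    List<String> run(List<String> records) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            if (!isRunning) {
                break;
            }
            out.add(record);
        }
        return out;
    }

    static List<String> emit(Object strategy, boolean fixed, List<String> input) {
        SourceLifecycleSketch source = new SourceLifecycleSketch(strategy, fixed, false);
        source.open();
        return source.run(input);
    }

    static boolean runningAfterFailedOpen() {
        SourceLifecycleSketch source = new SourceLifecycleSketch(null, true, true);
        try {
            source.open();
        } catch (IllegalStateException expected) {
            // open() aborted before reaching the flag assignment
        }
        return source.isRunning;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "b", "c");
        System.out.println(emit(null, false, input));         // []
        System.out.println(emit(null, true, input));          // [a, b, c]
        System.out.println(emit(new Object(), false, input)); // [a, b, c]
        System.out.println(runningAfterFailedOpen());         // false
    }
}
```

In this model the watermarked path behaves the same before and after the change, which is consistent with the existing watermarked tests never exposing the gap.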

Are there any user-facing changes?

Yes. An auron-kafka table without a WATERMARK FOR clause now returns its records instead of an empty result.

How was this patch tested?

A new AuronKafkaNoWatermarkITCase runs a plain SELECT over a no-watermark auron-kafka table and asserts the emitted records; it fails (empty result) without the fix and passes with it. The full auron-flink-runtime and auron-flink-planner module test suites pass, including the existing watermarked-source integration tests.

…t a watermark

AuronKafkaSourceFunction.open() set isRunning=true only inside the
watermarkStrategy != null branch, but run() collects records only while
isRunning is true on both the watermark and no-watermark paths. A source
configured without an event-time watermark therefore emitted nothing. Set
isRunning=true unconditionally once open() completes so no-watermark sources
emit records (and snapshot offsets and discover partitions) like watermarked
ones, while a partial-init failure still leaves the source not-running.

Add a no-watermark mock table and an end-to-end test asserting the source
emits its records.

@github-actions bot added the flink label on Jun 11, 2026
@weiqingy

Contributor Author

Hi @Tartarus0zm, could you please help review this PR when you get a chance? Thanks!

@weiqingy

Contributor Author

The 2 failing checks are unrelated to this PR — it only touches Flink extension files (AuronKafkaSourceFunction + two Flink Kafka tests), no Spark code:

  • Test spark-4.0 JDK21 Scala-2.13 / Build Auron JAR and Test spark-4.1 JDK21 Scala-2.13 / Build Auron JAR: scala-test-compile fails in the auron-spark-tests/spark4{0,1} modules — the test scaffolding can't resolve Spark's internal Catalyst cast suites (CastSuite, AnsiCastSuiteWithAnsiModeOff, AnsiCastSuiteWithAnsiModeOn, CastSuiteWithAnsiModeOn): not found: type CastSuite … four errors found → BUILD FAILURE. Evidence: spark-4.0 job, spark-4.1 job.
  • This is a pre-existing Spark-4.0/4.1 test-module breakage on master, not introduced here: the latest master TPC-DS run fails the same two jobs with the identical errors. Evidence: master run 27246405850.
  • The lane that actually exercises this PR's code, Test Flink 1.18, passed.

This looks like a known master breakage on the Spark 4.0/4.1 lanes rather than anything in this PR. Could you please help re-trigger once master is green there? Thanks!

@weiqingy

Contributor Author

The 2 failing checks are unrelated to this PR. This PR only touches auron-flink-extension (Flink Kafka source + Calc fusion), which isn't part of the Spark reactor, and both failures are pre-existing on master.

  • Test spark-4.0 JDK21 Scala-2.13 / Build Auron JAR: Scala test-compile failure in auron-spark-tests-spark40: not found: type CastSuite / AnsiCastSuiteWithAnsiModeOff / AnsiCastSuiteWithAnsiModeOn / CastSuiteWithAnsiModeOn. log
  • Test spark-4.1 JDK21 Scala-2.13 / Build Auron JAR: Same test-compile failure in auron-spark-tests-spark41. log

These cast test wrappers were added by [AURON #2174] and reference upstream Spark cast base suites that don't resolve under Spark 4.0/4.1. The same two jobs fail on master at the parent commit 44c77c2e (this run), so a re-trigger won't clear them — the fix belongs on master (update/exclude the Spark 4.0/4.1 cast suites), not in this PR.

Everything else (Flink 1.18, all Spark 3.x lanes, Rust, Style, License) is green.

Copilot AI left a comment

Contributor

Pull request overview

Fixes a regression in the Auron Flink Kafka source where tables configured without an event-time watermark strategy would never emit records because isRunning was only set in the watermark-initialization branch of open().

Changes:

  • Set AuronKafkaSourceFunction.isRunning = true unconditionally after successful open() initialization so both watermark and no-watermark run paths can emit.
  • Add a no-watermark Kafka mock table (T5) to the shared Kafka test base.
  • Add an end-to-end regression IT case that validates a no-watermark source emits all expected rows.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| auron-flink-extension/auron-flink-runtime/src/main/java/org/apache/auron/flink/connector/kafka/AuronKafkaSourceFunction.java | Ensures the source transitions to “running” after initialization regardless of watermark strategy presence. |
| auron-flink-extension/auron-flink-planner/src/test/java/org/apache/auron/flink/table/kafka/AuronKafkaSourceTestBase.java | Adds a no-watermark mock table definition (T5) to exercise the non-watermark execution path. |
| auron-flink-extension/auron-flink-planner/src/test/java/org/apache/auron/flink/table/kafka/AuronKafkaNoWatermarkITCase.java | Introduces a regression IT validating record emission for no-watermark Kafka tables. |

@Tartarus0zm

Contributor

Hi @weiqingy, thanks for your contribution. CI is failing on the Spark tests, though, so we need to wait for #2324 to be merged.

Development

Successfully merging this pull request may close these issues.

[Flink] Auron Kafka source emits no records when no watermark strategy is configured

3 participants