[AURON #2229] bugfix Silent fallback to empty partitions may cause data loss #2230
base: master
Changes from all commits
12d3b91
d90aae5
b22d938
b21d100
673d101
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -78,7 +78,14 @@ object IcebergScanSupport extends Logging { | |
| return None | ||
| } | ||
| val partitions = inputPartitions(exec) | ||
| val partitions = | ||
| try { | ||
| inputPartitions(exec) | ||
| } catch { | ||
| case e: Throwable => | ||
| logWarning(s"Get Partition error: ${e.getMessage}") | ||
| return None | ||
| } | ||
| // Empty scan (e.g. empty table) should still build a plan to return no rows. | ||
| if (partitions.isEmpty) { | ||
| logWarning(s"Native Iceberg scan planned with empty partitions for $scanClassName.") | ||
@@ -190,7 +197,9 @@ object IcebergScanSupport extends Logging { | |
| logWarning( | ||
| s"Failed to obtain input partitions via reflection for ${exec.getClass.getName}.", | ||
| t) | ||
Comment on lines 197 to 199 (Contributor):
This catch now throws on a reflective failure, which is what routes
| Seq.empty | ||
| throw new IllegalStateException( | ||
Contributor:
Since we are throwing an exception anyway, do we need to catch it in L189?
| s"Cannot resolve input partitions for ${exec.getClass.getName}", | ||
| t) | ||
| } | ||
| } | ||
This `catch` converts the `IllegalStateException` thrown at L200 into `return None`, so on a planning failure the effective behavior is a fallback to Spark's scan path (plus two WARN logs) rather than the "fail fast with a clear exception" the description mentions: no exception reaches the caller. Falling back to Spark is arguably the safer outcome, since the query still returns correct results, so this may well be what you want. It is worth confirming that it's intentional rather than a hard failure.

If Spark-fallback is the intent, the throw-at-L200-then-catch-here round-trip is using an exception purely for control flow. Would signaling failure through the return type read more directly? For example, `inputPartitions` could return `Option[Seq[InputPartition]]`, with `None` = planning failed and `Some(empty)` = genuinely empty table, and `plan()` matching on it. That also answers the question on the other thread about whether this catch is removable: it isn't. Dropping it would route a failure back into the empty-partition branch and reintroduce the empty-plan case.

Minor: this catches `Throwable`, while the sibling catch at L196 uses `NonFatal`. Matching `NonFatal` here would avoid folding fatal errors like `OutOfMemoryError` into a silent Spark fallback.
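
To make the suggestion concrete, here is a rough sketch of the `Option`-based shape. It is not the actual `IcebergScanSupport` code: `InputPartition`, `ScanPlan`, `SparkFallback`, `NativePlan`, the `resolve` parameter, and the `PlanSketch` object are hypothetical stand-ins so the snippet compiles on its own, and the real `plan()` signature may well differ.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-ins for illustration only; not the real Spark/Auron types.
final case class InputPartition(id: Int)

sealed trait ScanPlan
case object SparkFallback extends ScanPlan
final case class NativePlan(partitions: Seq[InputPartition]) extends ScanPlan

object PlanSketch {

  // None            = partitions could not be resolved (planning failed)
  // Some(Seq.empty) = partitions resolved, and the table is genuinely empty
  def inputPartitions(
      resolve: () => Seq[InputPartition]): Option[Seq[InputPartition]] =
    try Some(resolve())
    catch {
      // NonFatal does not match fatal errors such as OutOfMemoryError,
      // so those keep propagating instead of becoming a silent fallback.
      case NonFatal(e) =>
        Console.err.println(s"Failed to obtain input partitions: ${e.getMessage}")
        None
    }

  // The caller matches on the result, so no exception is used for control flow.
  def plan(resolve: () => Seq[InputPartition]): ScanPlan =
    inputPartitions(resolve) match {
      case None        => SparkFallback     // let Spark's own scan path run
      case Some(parts) => NativePlan(parts) // may be empty: plan returns no rows
    }
}
```

With this shape, a resolver that throws yields `SparkFallback`, while a resolver that returns `Seq.empty` still yields a `NativePlan` with no partitions, so the two cases can no longer be confused.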