[AURON #2155] Date-part extraction functions missing timezone handling for Timestamp inputs #2156
Open
ShreyeshArangath wants to merge 6 commits into apache:master
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
This PR fixes incorrect date-part extraction for Timestamp inputs in non-UTC session timezones by ensuring the session timezone is passed from Spark to the native Rust implementations and applied before extracting date components.
Changes:
- Switch Spark expression conversion for year/month/dayofmonth/dayofweek/quarter to use buildTimePartExt, which passes SQLConf.sessionLocalTimeZone for TimestampType.
- Update native Rust implementations (Spark_Year, Spark_Month, Spark_Day, Spark_DayOfWeek, Spark_Quarter) to interpret timestamp inputs in the provided timezone by converting to a local Date32 prior to extraction.
- Add Spark-level and Rust-level unit tests to cover timezone-sensitive boundary cases and ensure date inputs remain timezone-invariant.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| spark-extension/src/main/scala/org/apache/spark/sql/auron/NativeConverters.scala | Routes more date-part expressions through the timezone-aware ext-function builder so session timezone is provided for timestamps. |
| spark-extension-shims-spark/src/test/scala/org/apache/auron/AuronFunctionSuite.scala | Adds Spark integration tests intended to validate correct behavior under non-UTC timezones. |
| native-engine/datafusion-ext-functions/src/spark_dates.rs | Implements timezone-aware timestamp→local-date conversion for date-part extraction and adds native unit tests for boundary-crossing scenarios. |
- Remove unused chrono::prelude::* import in spark_dates.rs
- Fix timezone test to insert under UTC and query under America/New_York so the test actually exercises the boundary-crossing bug
The previous commit removed chrono::prelude::* but the Offset trait is needed for calling .fix() on TzOffset.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
Which issue does this PR close?
Closes #2155
Rationale for this change
Five date-part extraction functions in NativeConverters.scala use buildExtScalarFunction, which does not pass the session timezone to the native Rust implementation.
By contrast, Hour, Minute, Second, and WeekOfYear correctly use buildTimePartExt, which passes sessionLocalTimeZone for TimestampType inputs.
This inconsistency can cause incorrect results for timestamp inputs near date boundaries in non-UTC timezones.
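The boundary effect can be illustrated with a small, standard-library-only Rust sketch. It is not the code from this PR: the helper names are hypothetical, and the session timezone is modeled as a fixed UTC offset in seconds, which is an assumption (a named zone such as America/New_York needs a timezone database to resolve the offset at each instant). The timestamp is shifted to local time, reduced to days since the epoch, and only then split into date parts.

```rust
// Illustrative sketch only; not the implementation in spark_dates.rs.
// Assumption: the session timezone is a fixed UTC offset in seconds.

const SECS_PER_DAY: i64 = 86_400;

/// Days since 1970-01-01 on the local calendar for a UTC timestamp in
/// microseconds. Euclidean division keeps pre-epoch values correct.
fn local_epoch_days(micros_utc: i64, offset_secs: i64) -> i64 {
    (micros_utc.div_euclid(1_000_000) + offset_secs).div_euclid(SECS_PER_DAY)
}

/// Convert days since 1970-01-01 to (year, month, day) in the proleptic
/// Gregorian calendar (the well-known "civil from days" algorithm).
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Hypothetical helper returning (year, month, day, quarter) as seen in the
/// given fixed-offset timezone.
fn date_parts(micros_utc: i64, offset_secs: i64) -> (i64, i64, i64, i64) {
    let (y, m, d) = civil_from_days(local_epoch_days(micros_utc, offset_secs));
    (y, m, d, (m - 1) / 3 + 1)
}
```

For 2024-01-01T02:00:00Z (1_704_074_400_000_000 microseconds), `date_parts` with an offset of 0 yields (2024, 1, 1, 1), while an offset of -5 hours yields (2023, 12, 31, 4), so year, month, day, and quarter all differ when the timezone is ignored.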
Affected functions: Year, Month, DayOfMonth, DayOfWeek, Quarter.
What changes are included in this PR?
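This PR routes Year, Month, DayOfMonth, DayOfWeek, and Quarter through buildTimePartExt so that the session timezone reaches the native Rust implementations, which now convert timestamp inputs to a local date in that timezone before extracting the date part.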
Are there any user-facing changes?
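Yes. This is a correctness fix: date-part extraction on timestamp inputs now respects the session timezone. Date inputs remain timezone-invariant.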
How was this patch tested?
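Unit tests at the Spark level (AuronFunctionSuite.scala) and the Rust level (spark_dates.rs), covering timezone-sensitive boundary cases.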