Implement SQL CAST in Java filter predicates via SqlRuntimeCast #741

Open
dileepapeiris wants to merge 28 commits into apache:main from dileepapeiris:wayang-api-sql/fix-type-casting-issue

Conversation

dileepapeiris commented Apr 1, 2026

Summary

Filter pushdown in wayang-api-sql previously evaluated CAST with a heuristic (Number → double, else ensureComparable). This broke valid SQL such as CAST(integer_column AS VARCHAR), which was widened to double instead of producing a string. The private sqlCast method was never invoked and always threw UnsupportedOperationException.
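
To make the failure mode concrete, here is a minimal, self-contained sketch. `heuristicCast` is a hypothetical stand-in for the old behavior described above (numbers widened to double); it is not the actual Wayang code, and `castToVarchar` is likewise an illustrative helper rather than part of this PR.

```java
final class CastHeuristicDemo {
    private CastHeuristicDemo() { }

    // Hypothetical stand-in for the old heuristic: any Number is widened to
    // double, everything else is passed through unchanged.
    static Object heuristicCast(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        return value;
    }

    // What CAST(x AS VARCHAR) is expected to yield for a non-null input:
    // the string form of the value.
    static Object castToVarchar(Object value) {
        return value == null ? null : String.valueOf(value);
    }

    public static void main(String[] args) {
        Object column = 1; // value read from an INTEGER column

        // The predicate CAST(col AS VARCHAR) = '1' compares against "1".
        System.out.println("1".equals(heuristicCast(column))); // false: "1" vs. Double 1.0
        System.out.println("1".equals(castToVarchar(column))); // true
    }
}
```

With the heuristic, the comparison can never succeed because a `String` is never equal to a `Double`, which is why the filter silently dropped matching rows.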

Fixes (#676)

What changed

  • Added SqlRuntimeCast.castValue(Object, SqlTypeName), delegating to Apache Calcite SqlFunctions for boolean, integral, decimal, approximate numeric, and character string targets, with unwraps for NlsString / Character consistent with the rest of the filter code.
  • FilterPredicateImpl now uses the Rex call’s result type (SqlTypeName from CallTreeFactory) for CAST, matching the actual SQL cast target.
  • Removed the dead sqlCast stub from FilterPredicateImpl to avoid duplicate, misleading APIs: behavior lives in one place (SqlRuntimeCast), is testable, and is documented on the class Javadoc.
  • Tests: unit tests in wayang-api-sql, regression CAST(NAMEB AS VARCHAR) = '1' on integer CSV data, plus this demo module for a runnable sample and enum-wide coverage.
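
The dispatch described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `Target` is a hypothetical stand-in for Calcite's `SqlTypeName`, the conversions use plain JDK parsing instead of Calcite's `SqlFunctions`, and the `NlsString` unwrap is omitted so the block compiles without Calcite on the classpath.

```java
import java.math.BigDecimal;

final class RuntimeCastSketch {
    // Hypothetical stand-in for SqlTypeName; DATE is included only to show
    // the unsupported-target path.
    enum Target { BOOLEAN, INTEGER, BIGINT, DECIMAL, DOUBLE, VARCHAR, DATE }

    private RuntimeCastSketch() { } // static utility, not instantiable

    static Object castValue(Object value, Target target) {
        if (value == null) {
            return null; // SQL NULL stays NULL for every target type
        }
        // Normalize Character to String before dispatching.
        Object v = (value instanceof Character) ? value.toString() : value;
        switch (target) {
            case BOOLEAN:
                return toBoolean(v);
            case INTEGER:
                return (v instanceof Number)
                        ? ((Number) v).intValue()
                        : Integer.parseInt(v.toString().trim());
            case BIGINT:
                return (v instanceof Number)
                        ? ((Number) v).longValue()
                        : Long.parseLong(v.toString().trim());
            case DECIMAL:
                return (v instanceof BigDecimal) ? v : new BigDecimal(v.toString().trim());
            case DOUBLE:
                return (v instanceof Number)
                        ? ((Number) v).doubleValue()
                        : Double.parseDouble(v.toString().trim());
            case VARCHAR:
                return v.toString();
            default:
                throw new UnsupportedOperationException("Unsupported CAST target: " + target);
        }
    }

    private static Boolean toBoolean(Object v) {
        if (v instanceof Boolean) {
            return (Boolean) v;
        }
        String s = v.toString().trim();
        if (s.equalsIgnoreCase("true")) {
            return Boolean.TRUE;
        }
        if (s.equalsIgnoreCase("false")) {
            return Boolean.FALSE;
        }
        throw new IllegalArgumentException("Invalid boolean literal: " + s);
    }

    public static void main(String[] args) {
        System.out.println(castValue(1, Target.VARCHAR));    // 1 (as a String)
        System.out.println(castValue("42", Target.INTEGER)); // 42 (as an Integer)
    }
}
```

Keying the switch on the cast's target type, rather than on the runtime class of the input, is what lets CAST(integer_column AS VARCHAR) return a string.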

Why remove sqlCast from FilterPredicateImpl and add SqlRuntimeCast as a separate file?

  • I migrated sqlCast to a standalone SqlRuntimeCast class to establish a unified, public static utility. This centralized approach allows for better reuse within the CAST switch and ensures a single source of truth for supported types.

How to verify

Option A:

./mvnw -pl wayang-api/wayang-api-sql test

Expected output: [image]

Option B:

For extended testing, I have provided a dedicated repository: https://github.com/dileepapeiris/fix-for-apache-wayang-api-sql-cast-issue. Refer to the README in that repository for detailed execution steps.


Replace the previous ad-hoc CAST handling in FilterPredicateImpl with a call to SqlRuntimeCast.castValue(input.get(0), returnType). This removes the TODO and the previous fallback logic that relied on widenToDouble/ensureComparable, delegating runtime casting to the centralized SqlRuntimeCast implementation for more accurate casting behavior.
Remove the private sqlCast(Object, SqlTypeName) stub from FilterPredicateImpl which threw UnsupportedOperationException and was not used. Cleans up dead code in the SQL conversion functions.
Add a new placeholder source file SqlRuntimeCast.java under wayang-api-sql's calcite converter functions package. The file currently contains only the Apache ASF v2.0 license header; implementation will be added in a follow-up change.
Add package declaration and necessary imports to SqlRuntimeCast.java (BigDecimal, Calendar, Date, Calcite SqlFunctions/SqlTypeName, DateString, NlsString). Prepares the file for implementing runtime cast logic and resolves missing references for date/number/string handling.
Introduce SqlRuntimeCast utility used for runtime SQL CAST evaluation in Wayang Java filters. Adds a private constructor and a castValue(Object, SqlTypeName) method that unwraps inputs, returns null for SQL NULL, and delegates conversions to SqlFunctions for BOOLEAN, numeric types (TINYINT, SMALLINT, INTEGER, BIGINT, DECIMAL), FLOAT/REAL, DOUBLE, and CHAR/VARCHAR. Unsupported target types throw UnsupportedOperationException.
Introduce a private unwrapForCast(Object) method to normalize inputs before runtime casting. It extracts the underlying String from NlsString and converts Character to String, returning the original object otherwise to ensure Cast operations handle Calcite-specific wrappers and char values correctly.
Introduce a private castToFloat(Object) in SqlRuntimeCast that converts DateString, Date, and Calendar to their millisecond instant as a float, falling back to SqlFunctions.toFloat(v). This enables correct runtime casting of date/time values to float in SQL conversions.
Introduce castToDouble in SqlRuntimeCast to convert DateString, java.util.Date, and Calendar values to milliseconds-as-double, falling back to SqlFunctions.toDouble for other types. This ensures runtime casts of date/time inputs to DOUBLE produce consistent epoch-millisecond values.
Introduce a private helper method castToString(Object) in SqlRuntimeCast to support runtime casting to STRING. The new method handles native String and NlsString instances and delegates Boolean, Float and Double conversions to SqlFunctions.toString. This centralizes string casting logic for the SQL calcite converter.
Add runtime-to-string handling for additional value types in SqlRuntimeCast: BigDecimal (via SqlFunctions.toString), general Number, DateString, Character, and java.util.Date. This ensures values of these types are converted to their string representations during SQL runtime casting to avoid unexpected behavior or missing cases.
Add handling for Calendar instances by converting cal.getTime().toString(), and add a generic fallback return String.valueOf(v). This ensures Calendar values and any other runtime objects produce a String representation, improving robustness of SqlRuntimeCast.
Create new test file wayang-api/wayang-api-sql/src/test/java/org/apache/wayang/api/sql/calcite/converter/functions/SqlRuntimeCastTest.java containing the Apache license header. This is a placeholder/skeleton for future unit tests related to SQL runtime cast conversion in the Calcite converter; no test code implemented yet.
Add package declaration and necessary imports to SqlRuntimeCastTest.java, including JUnit assertions and Test, BigDecimal, and Calcite types (SqlTypeName, NlsString). Prepares the file for implementing unit tests for SQL runtime cast conversions.
Introduce SqlRuntimeCastTest#castNullYieldsNull which asserts that SqlRuntimeCast.castValue(null, SqlTypeName.INTEGER) returns null. This verifies that runtime casting preserves null inputs and prevents NPEs during null handling.
Add two unit tests to SqlRuntimeCastTest verifying runtime casting: one ensures an Integer value is cast to VARCHAR (expecting "1"), and the other ensures a numeric String is cast to INTEGER (expecting 42). These tests cover basic int<->varchar conversion behavior.
Add unit tests verifying SqlRuntimeCast.castValue handles string-to-double and NlsString-to-integer conversions. Adds castStringToDouble (asserts "1.5" -> DOUBLE with delta 1e-9) and castNlsStringToInteger (asserts NlsString("7", "UTF-8", null) -> INTEGER).
Add unit test in SqlRuntimeCastTest to verify that casting the string "TRUE" to SqlTypeName.BOOLEAN returns a Boolean instance and evaluates to true. Ensures runtime cast handles string-to-boolean conversion correctly.
Extend SqlRuntimeCastTest with additional unit tests covering error and formatting behavior: verify invalid boolean input throws RuntimeException, ensure BigDecimal -> VARCHAR conversion uses SQL formatting, and assert that casting to DATE is unsupported (throws UnsupportedOperationException).
Add javaFilterWithCastIntColumnToVarchar unit test in SqlToWayangRelTest. The test builds and executes a plan for "SELECT * FROM fs.exampleInt WHERE CAST(NAMEB AS VARCHAR) = '1'", forces the Java platform for execution, and asserts non-empty results and that the filtered NAMEB field equals 1. This verifies casting an INT column to VARCHAR in a WHERE clause works as expected on the Java platform.
Remove the hard-coded switch statement that mapped SQL target types to Java conversion helpers in SqlRuntimeCast. This deletes per-type cases (BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, DECIMAL, FLOAT/REAL, DOUBLE, CHAR/VARCHAR) and the UnsupportedOperationException for unsupported targets — likely part of a refactor to centralize or replace cast handling.
Add a switch on the target SqlTypeName in SqlRuntimeCast to perform runtime conversions using existing SqlFunctions and helper methods. Implements casting for BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, DECIMAL, FLOAT/REAL, DOUBLE and CHAR/VARCHAR, and throws an UnsupportedOperationException for other types. This enables Java-side evaluation of these CAST operations.
Use a traditional instanceof check plus cast for NlsString in unwrapForCast instead of pattern-matching. This avoids relying on newer Java pattern-matching features (or preview syntax) while preserving the original unwrapping behavior.
Modify SqlRuntimeCast to avoid Java pattern-matching instanceof syntax by using traditional instanceof checks and explicit casts. Updated Character, DateString, and Date branches to use classic casts (and o.toString()) to improve compatibility with Java versions that don't support the instanceof variable form; no functional behavior changes.
Replace pattern-matching instanceof usages with explicit casts in SqlRuntimeCast.java (for DateString, Date and Calendar). This is a syntactic change to avoid newer pattern-matching syntax and improve compatibility/consistency; runtime behavior is unchanged.
Replace Java pattern-matching instanceof usages in SqlRuntimeCast with traditional instanceof checks and explicit casts (e.g. String, NlsString, Boolean cases). This removes preview/pattern-matching syntax and improves compatibility with Java versions that do not support pattern-matching; behavior remains unchanged.
Replace pattern-matching instanceof usages that declared pattern variables (e.g. 'if (v instanceof final Float f)') with traditional instanceof checks and explicit casts for Float, Double and BigDecimal in SqlRuntimeCast.java. No behavioral changes — this improves compatibility/readability by avoiding the newer pattern-variable syntax.
Replace Java pattern-matching instanceof (e.g., 'if (v instanceof final Number n)') with classic instanceof checks and explicit casts or direct v.toString() calls for DateString and Character in SqlRuntimeCast.java. This simplifies the code and improves compatibility with Java versions that do not support pattern-matching instanceof.
Replace pattern-matching instanceof uses for Date and Calendar with traditional instanceof checks and explicit casts. The Date binding (previously 'instanceof final Date d') was removed in favor of v.toString(), and the Calendar case now casts to Calendar before calling getTime().toString(). This simplifies the code and improves compatibility with Java versions/compilers that don't support pattern-variable bindings.
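
The syntactic change made in the last several commits can be illustrated with a small sketch of a date-aware double conversion. It is not the PR's code: the class and method here are illustrative, the `DateString` branch is left out because that type comes from Calcite, and the fallback uses JDK parsing where the commits above describe delegating to `SqlFunctions.toDouble`.

```java
import java.util.Calendar;
import java.util.Date;

final class ClassicInstanceofSketch {
    private ClassicInstanceofSketch() { }

    // Converts date/time values to epoch milliseconds as a double, using
    // classic instanceof checks followed by explicit casts.
    //
    // The pattern-matching form (standard since Java 16) would instead read:
    //     if (v instanceof Date d) { return (double) d.getTime(); }
    static double castToDouble(Object v) {
        if (v instanceof Date) {
            return (double) ((Date) v).getTime();
        }
        if (v instanceof Calendar) {
            return (double) ((Calendar) v).getTimeInMillis();
        }
        if (v instanceof Number) {
            return ((Number) v).doubleValue();
        }
        // JDK stand-in for the fallback conversion of other inputs.
        return Double.parseDouble(v.toString().trim());
    }

    public static void main(String[] args) {
        System.out.println(castToDouble(new Date(1000L))); // 1000.0
        System.out.println(castToDouble("1.5"));           // 1.5
    }
}
```

Both forms behave identically at runtime; the classic form simply compiles on Java versions that predate pattern matching for `instanceof`.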
@dileepapeiris dileepapeiris marked this pull request as ready for review April 1, 2026 19:09
@dileepapeiris dileepapeiris changed the title from "Implement SQL CAST in Java filter predicates via SqlRuntimeCast (fixes #676)" to "Implement SQL CAST in Java filter predicates via SqlRuntimeCast" on Apr 1, 2026

dileepapeiris commented Apr 1, 2026

Hi, I am a GSoC applicant for WAYANG-54 (Make Wayang more datalake-friendly).

Here, I have resolved this issue by refactoring the casting logic into a more organized and readable standalone utility.

For extended testing, I have provided a dedicated repository: https://github.com/dileepapeiris/fix-for-apache-wayang-api-sql-cast-issue. Refer to the README in that repository for detailed execution steps.
