[Bug] CAST(complex AS STRING) does not honour spark.sql.legacy.castComplexTypesToString.enabled #4492

@andygrove

Describe the bug

Spark's spark.sql.legacy.castComplexTypesToString.enabled configuration switches the format used by CAST(<struct|array|map> AS STRING):

  • Default (false): structs render as {a, b, c}, arrays as [1, 2, 3], nulls as null.
  • Legacy (true): structs render as [a, b, c], arrays are unchanged as [1, 2, 3], nulls render as "" (empty string).

Comet's native struct/array-to-string formatter hard-codes the default-mode brackets and uses cast_options.null_string = "null". The legacy_cast_complex_to_string flag is not currently plumbed through the cast proto, so users who set spark.sql.legacy.castComplexTypesToString.enabled=true will see Comet output differ from Spark for CAST(<complex> AS STRING).

Surfaced by the cast audit (collection PR queue).

Steps to reproduce

SET spark.sql.legacy.castComplexTypesToString.enabled = true;
SELECT CAST(struct(1, 2, null) AS STRING);
-- Spark legacy: [1, 2,]
-- Comet:        {1, 2, null}

Expected behavior

Either:

  1. Plumb the conf through the Cast proto and switch the native formatter on legacy_cast_complex_to_string, OR
  2. Downgrade (StructType|ArrayType|MapType, StringType) casts to Incompatible(Some(...)) when the conf is enabled.
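For option 1, the native side could look roughly like the sketch below. It is a minimal, standalone illustration and not Comet's actual code: `ComplexToStringOptions`, `struct_brackets`, `null_string`, and `format_struct` are hypothetical names, the fields are assumed to be already rendered as strings, and the real formatter in cast.rs works on Arrow arrays. The whitespace handling around an omitted legacy null is also an assumption and should be checked against Spark.

```rust
/// Hypothetical options for rendering complex values as strings.
/// `legacy` stands in for spark.sql.legacy.castComplexTypesToString.enabled.
#[derive(Clone, Copy)]
struct ComplexToStringOptions {
    legacy: bool,
}

impl ComplexToStringOptions {
    /// Structs use {} by default and [] in legacy mode.
    fn struct_brackets(&self) -> (char, char) {
        if self.legacy { ('[', ']') } else { ('{', '}') }
    }

    /// Nulls render as "null" by default and as an empty string in legacy mode.
    fn null_string(&self) -> &'static str {
        if self.legacy { "" } else { "null" }
    }
}

/// Formats already-stringified struct fields; `None` marks a null field.
fn format_struct(fields: &[Option<&str>], opts: ComplexToStringOptions) -> String {
    let (open, close) = opts.struct_brackets();
    let null_string = opts.null_string();
    let mut out = String::new();
    out.push(open);
    for (i, field) in fields.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        match field {
            // Non-null field: space-separated from the previous comma.
            Some(text) => {
                if i > 0 {
                    out.push(' ');
                }
                out.push_str(text);
            }
            // Null field with a visible placeholder (default mode).
            None if !null_string.is_empty() => {
                if i > 0 {
                    out.push(' ');
                }
                out.push_str(null_string);
            }
            // Assumption: a legacy null emits nothing, not even the space.
            None => {}
        }
    }
    out.push(close);
    out
}
```

Calling `format_struct` with the same fields and `legacy` set to `false` or `true` switches both the brackets and the null rendering in one place, which is the behavior the conf would need to drive once it is carried through the Cast proto.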

Additional context

  • Native impl: native/spark-expr/src/conversion_funcs/cast.rs (struct/array string formatters)
  • Comet matrix: CometCast.canCastToString
  • Spark conf: SQLConf.LEGACY_COMPLEX_TYPES_TO_STRING
