[GLUTEN-11570][VL] Wire spark.sql.binaryOutputStyle to Velox to_pretty_string#12324

Open
n0r0shi wants to merge 1 commit into apache:main from n0r0shi:gluten-11570-pretty-string-binary

@n0r0shi n0r0shi commented Jun 20, 2026

What changes were proposed in this pull request?

Spark 4.0+ introduced spark.sql.binaryOutputStyle to control how binary values are rendered. Velox's to_pretty_string previously only supported the default HEX_DISCRETE style. This change wires the Spark config through Gluten to Velox so all 5 styles (HEX_DISCRETE, HEX, BASE64, UTF-8, BASIC) are honored.

  • Add kSparkBinaryOutputStyle = "spark.sql.binaryOutputStyle" constant.
  • Forward it via setIfExists in WholeStageResultIterator, mapping to
    Velox's SparkQueryConfig::kBinaryOutputStyle (binary_output_style).
  • Expand GlutenDataFrameSuite.getRows: binary in spark40 and spark41 to
    cover all 5 styles, mirroring the upstream Spark DataFrameSuite test.
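
The forwarding step in the bullets above can be sketched roughly as follows. This is a simplified, standalone illustration and not the actual WholeStageResultIterator code: the `ConfMap` alias and the `setIfExists` signature are assumptions made for the sketch, and only the two key strings are taken from this PR.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the config containers on each side.
using ConfMap = std::unordered_map<std::string, std::string>;

// Spark-side key added by this PR.
const std::string kSparkBinaryOutputStyle = "spark.sql.binaryOutputStyle";
// Velox-side key named in the PR description (SparkQueryConfig::kBinaryOutputStyle).
const std::string kVeloxBinaryOutputStyle = "binary_output_style";

// Illustrative helper (signature assumed): copy the value to the Velox config
// only when the Spark key is present. Returns true if a value was forwarded.
bool setIfExists(
    const ConfMap& sparkConf,
    const std::string& sparkKey,
    ConfMap& veloxConf,
    const std::string& veloxKey) {
  auto it = sparkConf.find(sparkKey);
  if (it == sparkConf.end()) {
    return false; // Key not set: leave the Velox config untouched.
  }
  veloxConf[veloxKey] = it->second;
  return true;
}
```

With the Spark key set to, say, `BASE64`, the Velox map ends up holding `binary_output_style=BASE64`. With the key absent, nothing is written, so the Velox side keeps its own default.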

Depends on

Velox PR: facebookincubator/velox#17884
(adds SparkQueryConfig::kBinaryOutputStyle + 5 styles in ToPrettyStringVarbinaryFunction).

Until the Velox PR is merged, this Gluten change only forwards the config: the existing HEX_DISCRETE behavior is preserved, and HEX, BASE64, UTF-8, and BASIC do not take effect. Once the Velox PR lands, all 5 styles are honored.

How was this patch tested?

By the expanded GlutenDataFrameSuite.getRows: binary test in spark40 and spark41. It covers all 5 styles using withSQLConf(SQLConf.BINARY_OUTPUT_STYLE.key -> ...), mirroring the upstream Spark DataFrameSuite.getRows: binary test.
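
For reviewers who want a feel for what the five styles are expected to produce, here is a small standalone renderer. It is a sketch only and not Velox's ToPrettyStringVarbinaryFunction: the function names are hypothetical, and the exact formats (upper-case hex, unpadded Base64, signed decimals for BASIC) reflect my understanding of upstream Spark's behavior and should be treated as assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Upper-case, two-digit hex for one byte.
static std::string hexByte(uint8_t b) {
  static const char* kDigits = "0123456789ABCDEF";
  return std::string{kDigits[b >> 4], kDigits[b & 0x0F]};
}

// Standard Base64 alphabet, without '=' padding (assumed to match Spark).
static std::string base64NoPadding(const Bytes& in) {
  static const char* kAlphabet =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  for (size_t i = 0; i < in.size(); i += 3) {
    size_t n = in.size() - i < 3 ? in.size() - i : 3; // bytes in this group
    uint32_t v = static_cast<uint32_t>(in[i]) << 16;
    if (n > 1) v |= static_cast<uint32_t>(in[i + 1]) << 8;
    if (n > 2) v |= in[i + 2];
    out += kAlphabet[(v >> 18) & 63];
    out += kAlphabet[(v >> 12) & 63];
    if (n > 1) out += kAlphabet[(v >> 6) & 63];
    if (n > 2) out += kAlphabet[v & 63];
  }
  return out;
}

// Hypothetical renderer: unknown styles fall back to HEX_DISCRETE.
std::string renderBinary(const Bytes& in, const std::string& style) {
  if (style == "HEX") {
    std::string out;
    for (uint8_t b : in) out += hexByte(b);
    return out;
  }
  if (style == "BASE64") return base64NoPadding(in);
  if (style == "UTF-8") return std::string(in.begin(), in.end());
  const bool basic = (style == "BASIC");
  std::string out = "[";
  for (size_t i = 0; i < in.size(); ++i) {
    if (i > 0) out += basic ? ", " : " ";
    // BASIC prints each byte as a signed decimal; HEX_DISCRETE as hex.
    out += basic ? std::to_string(static_cast<int>(static_cast<int8_t>(in[i])))
                 : hexByte(in[i]);
  }
  return out + "]";
}
```

Under these assumptions, the bytes of "ABC." render as `[41 42 43 2E]` (HEX_DISCRETE), `4142432E` (HEX), `QUJDLg` (BASE64), `ABC.` (UTF-8), and `[65, 66, 67, 46]` (BASIC).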

Fixes #11570

@github-actions github-actions Bot added CORE works for Gluten Core VELOX labels Jun 20, 2026
…y_string

Spark 4.0+ introduced spark.sql.binaryOutputStyle to control how binary
values are rendered. The Velox-side support landed in
facebookincubator/velox#17884 (binary_output_style config + 5 styles).

This change wires the Spark config through to Velox so to_pretty_string
honors HEX_DISCRETE (default), HEX, BASE64, UTF-8, and BASIC.

- Add kSparkBinaryOutputStyle constant.
- Forward it via setIfExists in WholeStageResultIterator.
- Expand GlutenDataFrameSuite.getRows: binary in spark40 and spark41 to
  cover all 5 styles, mirroring the upstream Spark test.

Fixes apache#11570
@n0r0shi n0r0shi force-pushed the gluten-11570-pretty-string-binary branch from 51923dc to 0d20d77 Compare June 20, 2026 08:59

Development

Successfully merging this pull request may close these issues.

[VL] GlutenDataFrameSuite support ToPrettyString binary formats