feat(duckdb): Add UUID v5 unsupported error for Snowflake UUID_STRING transpilation#7518

Open
fivetran-ashashankar wants to merge 9 commits into main from RD-1147774-transpile_UUID_STRING

Conversation

@fivetran-ashashankar
Collaborator

Summary

Added support for Snowflake UUID_STRING function transpilation to DuckDB:

  • UUID_STRING() (v4) → UUID() ✓
  • UUID_STRING(namespace, name) (v5) → UnsupportedError (DuckDB doesn't support deterministic UUID v5)
  • GENERATE_UUID() (BigQuery) → CAST(UUID() AS TEXT) when is_string flag set
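The two `UUID_STRING` call forms behave differently because version 4 UUIDs are random, while version 5 UUIDs are a deterministic function of a namespace and a name. The following is a minimal illustration using Python's standard `uuid` module rather than sqlglot or DuckDB; the namespace and name values are arbitrary examples.

```python
import uuid

# Version 4: random, so two calls are expected to differ.
a = uuid.uuid4()
b = uuid.uuid4()

# Version 5: SHA-1 based, so the same (namespace, name) pair
# always yields the same UUID.
ns = uuid.NAMESPACE_DNS  # arbitrary example namespace
c = uuid.uuid5(ns, "example.com")
d = uuid.uuid5(ns, "example.com")

print(a.version, b.version, a != b)
print(c.version, c == d)
```

This is why a plain `UUID()` call can stand in for the zero-argument form but not for the two-argument form.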

Changes

  • sqlglot/generators/duckdb.py: Added uuid_sql() method
  • tests/dialects/test_snowflake.py: Updated existing test to expect UnsupportedError for UUID v5
  • Integration tests: Added roundtrip tests for UUID transpilation

Test Results

✓ All 1227 unit tests pass
✓ Integration tests pass

Related: RD-1147774

@github-actions
Contributor

github-actions Bot commented Apr 18, 2026

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:RD-1147774-transpile_UUID_STRING, sqlglot version: RD-1147774-transpile_UUID_STRING)
  • baseline (main, sqlglot version: 0.0.1.dev1)

By Dialect

| dialect | main | sqlglot:RD-1147774-transpile_UUID_STRING | transitions | links |
| --- | --- | --- | --- | --- |
| bigquery -> bigquery | 24645/24650 passed (100.0%) | 23491/23491 passed (100.0%) | No change | full result / delta |
| bigquery -> duckdb | 867/1154 passed (75.1%) | 0/0 passed (0.0%) | Results not found | full result / delta |
| duckdb -> duckdb | 5823/5823 passed (100.0%) | 5823/5823 passed (100.0%) | No change | full result / delta |
| snowflake -> duckdb | 1063/1961 passed (54.2%) | 1063/1961 passed (54.2%) | No change | full result / delta |
| snowflake -> snowflake | 65133/65133 passed (100.0%) | 65133/65133 passed (100.0%) | No change | full result / delta |
| databricks -> databricks | 1370/1370 passed (100.0%) | 1370/1370 passed (100.0%) | No change | full result / delta |
| postgres -> postgres | 6042/6042 passed (100.0%) | 6042/6042 passed (100.0%) | No change | full result / delta |
| redshift -> redshift | 7101/7101 passed (100.0%) | 7101/7101 passed (100.0%) | No change | full result / delta |

Overall

main: 113234 total, 112044 passed (pass rate: 98.9%), sqlglot version: 0.0.1.dev1

sqlglot:RD-1147774-transpile_UUID_STRING: 110921 total, 110023 passed (pass rate: 99.2%), sqlglot version: RD-1147774-transpile_UUID_STRING

Transitions:
No change

Dialect pair changes: 0 previous results not found, 1 current results not found

⚠️ 1 test failure(s) (view logs)

Comment thread sqlglot/generators/duckdb.py Outdated
@georgesittas
Collaborator

@fivetran-ashashankar have you looked into the two Snowflake -> DuckDB failures reported here?

@georgesittas
Collaborator

I think we can probably transpile UUID5 to DuckDB manually instead of falling back to unsupported.

@fivetran-ashashankar
Collaborator Author

> I think we can probably transpile UUID5 to DuckDB manually instead of falling back to unsupported.

Yes, implemented.
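For readers unfamiliar with what a manual UUID v5 construction involves, here is a sketch of the RFC 4122 steps in Python. It is not the SQL this PR generates. It only shows the operations a hand-written expression would have to mirror: SHA-1 over the namespace bytes plus the name, truncation to 16 bytes, setting the version and variant bits, and hyphenated lowercase hex formatting. The function name `uuid5_manual` is hypothetical.

```python
import hashlib
import uuid


def uuid5_manual(namespace: uuid.UUID, name: str) -> str:
    """Build a version 5 UUID string by hand (hypothetical helper)."""
    # Hash the 16 raw namespace bytes followed by the UTF-8 name,
    # then keep the first 16 bytes of the 20-byte SHA-1 digest.
    raw = bytearray(
        hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()[:16]
    )
    raw[6] = (raw[6] & 0x0F) | 0x50  # high nibble of byte 6 = version 5
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8 = RFC 4122 variant
    h = raw.hex()
    # Standard 8-4-4-4-12 grouping.
    return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"


ns = uuid.NAMESPACE_URL  # arbitrary example namespace
manual = uuid5_manual(ns, "https://example.com")
reference = str(uuid.uuid5(ns, "https://example.com"))
print(manual == reference)
```

Comparing against the standard library's `uuid.uuid5` is a convenient way to check any hand-rolled variant of these steps.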

Comment thread sqlglot/generators/duckdb.py Outdated
Comment thread sqlglot/generators/duckdb.py Outdated
Comment thread tests/dialects/test_snowflake.py Outdated
Collaborator

@georgesittas georgesittas left a comment

Much cleaner.

@fivetran-ashashankar let's rebase here and it should be good to go afterwards.

fivetran-ashashankar and others added 6 commits April 22, 2026 09:19
Review comments implement uuid_string

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Review comments implement uuid_string

The CAST to VARCHAR is unnecessary as LOWER() and string concatenation
already return VARCHAR type in DuckDB.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@fivetran-ashashankar fivetran-ashashankar force-pushed the RD-1147774-transpile_UUID_STRING branch from 5988b60 to 7ea8dbf Compare April 22, 2026 16:25
- Set UUID_IS_STRING_TYPE = True in Snowflake dialect
- Add UUID_STRING parser that sets is_string=True flag
- Explicitly set UUID_IS_STRING_TYPE = False in Hive, Presto, and DuckDB dialects
SUPPORTS_FIXED_SIZE_ARRAYS = True
STRICT_JSON_PATH_SYNTAX = False
NUMBERS_CAN_BE_UNDERSCORE_SEPARATED = True
UUID_IS_STRING_TYPE = False
Collaborator

This is redundant, right? The default is False, so DuckDB and other dialects will inherit it, unless set to True.

Collaborator Author

Yes, I was trying to debug to see what is causing the test failure.
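The redundancy point comes down to ordinary Python class-attribute inheritance. The classes below are simplified stand-ins, not sqlglot's actual dialect classes; only the attribute name `UUID_IS_STRING_TYPE` is taken from the diff.

```python
class Dialect:
    # Base default, inherited by every subclass that does not override it.
    UUID_IS_STRING_TYPE = False


class DuckDB(Dialect):
    # No override needed: the lookup falls through to Dialect.
    pass


class Snowflake(Dialect):
    # Only dialects that differ from the default need to set the flag.
    UUID_IS_STRING_TYPE = True


print(DuckDB.UUID_IS_STRING_TYPE, Snowflake.UUID_IS_STRING_TYPE)
print("UUID_IS_STRING_TYPE" in vars(DuckDB))
```

Setting the flag to `False` again in a subclass changes nothing observable, which is why the explicit assignment can be dropped.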

Comment thread sqlglot/dialects/hive.py
ARRAY_AGG_INCLUDES_NULLS = None
REGEXP_EXTRACT_DEFAULT_GROUP = 1
ALTER_TABLE_SUPPORTS_CASCADE = True
UUID_IS_STRING_TYPE = False
Collaborator

This isn't right. We need to set this to True. Did you test against Hive, Spark, Databricks?

Hive:

0: jdbc:hive2://localhost:10000> select uuid(), typeof(uuid());
INFO  : Compiling command(queryId=hive_20260423131737_e6c31a91-6039-440c-a7d4-dde96388d441): select uuid(), typeof(uuid())
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:string, comment:null), FieldSchema(name:_c1, type:string, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20260423131737_e6c31a91-6039-440c-a7d4-dde96388d441); Time taken: 1.256 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=hive_20260423131737_e6c31a91-6039-440c-a7d4-dde96388d441): select uuid(), typeof(uuid())
INFO  : Completed executing command(queryId=hive_20260423131737_e6c31a91-6039-440c-a7d4-dde96388d441); Time taken: 0.001 seconds
+---------------------------------------+---------+
|                  _c0                  |   _c1   |
+---------------------------------------+---------+
| cb49b6c9-f166-45b4-9913-f9efe285666f  | string  |
+---------------------------------------+---------+
1 row selected (1.372 seconds)

Spark:

(sqlglot) ➜  sqlglot git:(main) runspk "SELECT typeof(UUID())"
WARNING: Using incubator modules: jdk.incubator.vector
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
26/04/23 16:18:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
26/04/23 16:18:49 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
26/04/23 16:18:49 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore georgesittas@192.168.1.234
Spark Web UI available at http://localhost:4040
Spark master: local[*], Application Id: local-1776950328490
string
Time taken: 0.888 seconds, Fetched 1 row(s)

LOG_BASE_FIRST: bool | None = None
SUPPORTS_VALUES_DEFAULT = False
LEAST_GREATEST_IGNORES_NULLS = False
UUID_IS_STRING_TYPE = False
Collaborator

This is correct but redundant.

),
"SYSTIMESTAMP": exp.CurrentTimestamp.from_arg_list,
"UNICODE": lambda args: exp.Unicode(this=seq_get(args, 0), empty_is_zero=True),
"UUID_STRING": lambda args: exp.Uuid(this=seq_get(args, 0), name=seq_get(args, 1)),
Collaborator

Why is this necessary?
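For context on the parser entry under review, this sketch shows how a `seq_get`-style lookup maps a variable-length argument list onto named fields. The `seq_get` here is a local stand-in written to mirror the helper's out-of-range behavior, and `Uuid` is a hypothetical dataclass, not sqlglot's `exp.Uuid`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def seq_get(seq: Sequence[str], index: int) -> Optional[str]:
    """Return seq[index], or None when the index is out of range (stand-in)."""
    try:
        return seq[index]
    except IndexError:
        return None


@dataclass
class Uuid:
    # Hypothetical node: `this` holds the namespace, `name` the name argument.
    this: Optional[str] = None
    name: Optional[str] = None


def build_uuid_string(args: Sequence[str]) -> Uuid:
    # Same shape as the lambda in the diff: missing arguments become None.
    return Uuid(this=seq_get(args, 0), name=seq_get(args, 1))


no_args = build_uuid_string([])
two_args = build_uuid_string(["ns", "name"])
print(no_args, two_args)
```

With no arguments both fields are `None`, so a single builder can represent both the zero-argument and the two-argument call.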


3 participants