fix(duckdb): add ParseDatetime transpilation for BigQuery to DuckDB #7704

Open
william-goode wants to merge 5 commits into tobymao:main from william-goode:fix/duckdb-parse-datetime-transpilation

@william-goode

Contributor

Summary

BigQuery's PARSE_DATETIME is parsed into exp.ParseDatetime, but DuckDB's generator has no handler for it. The node falls through to function_fallback_sql and is emitted verbatim as PARSE_DATETIME(...), which DuckDB rejects.

Fix: add parsedatetime_sql to DuckDBGenerator, mapping the node to STRPTIME. No CAST is needed (unlike ParseTime and StrToDate) because STRPTIME already returns TIMESTAMP. No TRY_STRPTIME conditional is needed either, because ParseDatetime has no safe arg.

Reproduction

import sqlglot

# Before: falls through
sqlglot.transpile("SELECT PARSE_DATETIME('%Y-%m-%d %H:%M:%S', '2023-01-15 14:30:00')", read="bigquery", write="duckdb")
# => ["SELECT PARSE_DATETIME('2023-01-15 14:30:00', '%Y-%m-%d %H:%M:%S')"]

# After
# => ["SELECT STRPTIME('2023-01-15 14:30:00', '%Y-%m-%d %H:%M:%S')"]

Test plan

  • Three validate_all tests: standard datetime format, named day/month with 12-hour clock, microsecond precision (%E6S)
  • make unit: 1219 passed, 0 failures
  • BigQuery and DuckDB dialect tests green

BigQuery's PARSE_DATETIME is parsed into exp.ParseDatetime but DuckDB's
generator has no handler, so it falls through to function_fallback_sql
and emits PARSE_DATETIME(...) verbatim, which DuckDB rejects.

Add parsedatetime_sql method that maps to STRPTIME. Unlike ParseTime
(which needs CAST to TIME) or StrToDate (which needs CAST to DATE),
STRPTIME already returns TIMESTAMP which matches PARSE_DATETIME
semantics, so no cast is needed. ParseDatetime has no safe arg, so
no TRY_STRPTIME conditional is required.
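
The fallthrough described above can be modeled with a toy generator. This is a simplified, hypothetical sketch and not sqlglot's actual code: `ToyGenerator`, `ToyDuckDBGenerator`, and the dataclass node are invented for illustration, and only the names `function_fallback_sql`, `parsedatetime_sql`, and `STRPTIME` come from the PR description.

```python
from dataclasses import dataclass


@dataclass
class ParseDatetime:
    """Hypothetical stand-in for exp.ParseDatetime (value SQL plus format SQL)."""
    this: str
    format: str
    sql_name: str = "PARSE_DATETIME"


class ToyGenerator:
    """Dispatches to `<nodename>_sql` if defined, else emits the call verbatim."""

    def sql(self, node) -> str:
        handler = getattr(self, f"{type(node).__name__.lower()}_sql", None)
        return handler(node) if handler else self.function_fallback_sql(node)

    def function_fallback_sql(self, node) -> str:
        # No dialect-specific handler: the original function name leaks through.
        return f"{node.sql_name}({node.this}, {node.format})"


class ToyDuckDBGenerator(ToyGenerator):
    def parsedatetime_sql(self, node: ParseDatetime) -> str:
        # Per the PR, STRPTIME already yields a TIMESTAMP, so no CAST is added.
        return f"STRPTIME({node.this}, {node.format})"


node = ParseDatetime("'2023-01-15 14:30:00'", "'%Y-%m-%d %H:%M:%S'")
print(ToyGenerator().sql(node))        # falls back to PARSE_DATETIME(...)
print(ToyDuckDBGenerator().sql(node))  # routed to the STRPTIME handler
```

Adding one method to the subclass is enough to change the output, which is why the fix in this PR is so small.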

[CLAUDE]
Cover three format variations for BigQuery PARSE_DATETIME -> DuckDB
STRPTIME to avoid a second review round requesting additional coverage:
- Standard datetime format (%Y-%m-%d %H:%M:%S)
- Named day/month with 12-hour time (%a %b %e %I:%M:%S %Y)
- Microsecond precision (%H:%M:%E6S)

[CLAUDE]
Comment thread tests/dialects/test_duckdb.py
Reviewer flagged that time-only PARSE_DATETIME produces different
default dates between engines (BQ: 1970-01-01, DuckDB: 1900-01-01).
Specify the date explicitly to avoid the degenerate case.
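
Python's own `datetime.strptime` fills missing date fields with 1900-01-01, so it can serve as a rough stand-in for the DuckDB side of this mismatch. The engine defaults themselves are as reported in the review and are not verified here. The sketch shows why a time-only input is ambiguous and how an explicit date removes the dependence on defaults.

```python
from datetime import datetime

# Time-only input: Python fills the missing date with 1900-01-01.
time_only = datetime.strptime("14:30:00", "%H:%M:%S")

# Explicit date: the result no longer depends on any default.
explicit = datetime.strptime("2023-01-15 14:30:00", "%Y-%m-%d %H:%M:%S")

print(time_only)  # 1900-01-01 14:30:00
print(explicit)   # 2023-01-15 14:30:00
```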

[CLAUDE]
@geooo109

geooo109 commented Jun 5, 2026

Collaborator

@william-goode let's move on with the concatenation for the solution.

So let's do the following:

  1. Add a new flag on the AST node and populate it from the BigQuery side (parsing) with the default year 1970.
  2. On the DuckDB side (generation), apply the concatenation only if the default year is populated; otherwise follow the default roundtrip that we already have. The output should look like this, where the '1970 ' prefix is the value set by the BigQuery parser:
STRPTIME('1970 ' || <value>, '%Y ' || <format>)
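
The two steps above can be sketched as follows. Everything here is hypothetical and independent of sqlglot: `ParseDatetimeNode`, `build_from_bigquery`, and `duckdb_parsedatetime_sql` are illustrative names, and modeling the flag as an optional integer is an assumption rather than the shape the PR necessarily uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParseDatetimeNode:
    """Hypothetical AST node; `default_year` plays the role of the new flag."""
    this: str                           # SQL for the value being parsed
    format: str                         # SQL for the format string
    default_year: Optional[int] = None  # assumption: None means "not populated"


def build_from_bigquery(fmt_sql: str, value_sql: str) -> ParseDatetimeNode:
    # BigQuery's PARSE_DATETIME takes (format, value); the builder records 1970.
    return ParseDatetimeNode(this=value_sql, format=fmt_sql, default_year=1970)


def duckdb_parsedatetime_sql(node: ParseDatetimeNode) -> str:
    if node.default_year is None:
        # Flag not populated: keep the plain roundtrip.
        return f"STRPTIME({node.this}, {node.format})"
    # Flag populated: prefix both the value and the format with the year.
    return f"STRPTIME('{node.default_year} ' || {node.this}, '%Y ' || {node.format})"


bq_node = build_from_bigquery("'%H:%M:%S'", "'14:30:00'")
print(duckdb_parsedatetime_sql(bq_node))
# STRPTIME('1970 ' || '14:30:00', '%Y ' || '%H:%M:%S')
```

Gating on the flag keeps nodes built by other dialects on the unprefixed path.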

@georgesittas

Collaborator

Hey @william-goode, any plans to take this to the finish line?

@william-goode

Contributor Author

@georgesittas Yes! Sorry, got caught up with work.

@geooo109 Yes chef.

Thanks for the feedback fellas - changes coming shortly.

@georgesittas

Collaborator

Cool, no worries & thanks for following up!

BigQuery defaults to 1970 for missing year components in
PARSE_DATETIME, while DuckDB defaults to 1900. Add a default_year
flag to ParseDatetime AST node, set by the BigQuery parser, and
generate STRPTIME('1970 ' || <value>, '%Y ' || <format>) in DuckDB
to match BigQuery semantics.

[CLAUDE]
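
The effect of the prefix can be checked outside either engine with Python's `datetime.strptime`, used here only as an analogy for STRPTIME. `parse_with_default_year` is a hypothetical helper, and the sketch assumes the format carries no year directive of its own.

```python
from datetime import datetime


def parse_with_default_year(value: str, fmt: str, year: int = 1970) -> datetime:
    """Force the year by prefixing both the value and the format.

    Assumption: `fmt` contains no %Y or %y, so the prefix is the only year field.
    """
    return datetime.strptime(f"{year} {value}", f"%Y {fmt}")


print(parse_with_default_year("14:30:00", "%H:%M:%S"))  # 1970-01-01 14:30:00
```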
@william-goode

Contributor Author

Patched with concatenation approach and ready for review.

A knock-on effect:

The BigQuery parser sets default_year = True at build time, but ClickHouse also lowers its PARSEDATETIME into an exp.ParseDatetime node. Previously such a node was emitted as PARSE_DATETIME, which DuckDB rejects; it now routes to the new DuckDB handler, so this should be a strict improvement over main. The ClickHouse builder does not set default_year = True, and the 1970 prepend is gated on that flag, so no 1970 is applied to ClickHouse nodes. If ClickHouse also needs a default year of 1970, that could be added. I'm not a regular ClickHouse user, so I'll refrain from guessing at the answer.
