feat: auto-backup and migrate DuckLake metadata on version upgrade #346
Open
fuziontech wants to merge 3 commits into main
When the DuckLake spec version is older than expected (e.g. 0.3 → 0.4), automatically back up all ducklake_* metadata tables to a SQL file before attaching with AUTOMATIC_MIGRATION TRUE. This ensures safe rollback if migration fails.

- Add server/ducklake_migration.go with version detection, backup, and shared ATTACH statement builder
- Centralize ATTACH statement construction (was duplicated in 3 places)
- Backup is written to <dataDir>/ducklake-backup-<timestamp>-v<version>.sql
- Migration check runs once per process via sync.Once
- Backup failure blocks migration (fail-safe)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
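The once-per-process check and the fail-safe rule from the list above can be sketched as follows. This is a minimal illustration, not the code in server/ducklake_migration.go: `migrationGate`, `ensure`, and `errBackupFailed` are hypothetical names, and the version check and backup are passed in as callbacks so the sketch stays independent of any database.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBackupFailed is a hypothetical sentinel marking errors that must block migration.
var errBackupFailed = errors.New("ducklake metadata backup failed")

// migrationGate (hypothetical) runs the version check and backup at most once per process.
type migrationGate struct {
	once sync.Once
	err  error
}

// ensure calls needed() and, only if it reports true, backup().
// Both callbacks run on the first call only; later calls return the stored result.
// A non-nil return means the caller must not attach with AUTOMATIC_MIGRATION TRUE.
func (g *migrationGate) ensure(needed func() (bool, error), backup func() error) error {
	g.once.Do(func() {
		migrate, err := needed()
		if err != nil {
			g.err = fmt.Errorf("detect ducklake version: %w", err)
			return
		}
		if !migrate {
			return // metadata already at the expected version
		}
		if err := backup(); err != nil {
			// Fail-safe: a failed backup is recorded and blocks the migration.
			g.err = fmt.Errorf("%w: %v", errBackupFailed, err)
		}
	})
	return g.err
}
```

One design consequence of this sketch: because the result is stored, a failed backup keeps blocking migration for the lifetime of the process. Whether the PR retries on a later connection is not stated here.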
- Quote all SQL identifiers with double quotes to handle reserved words (e.g. "key", "value" in ducklake_metadata table)
- Replace string comparison with numeric version parsing to handle versions like "0.10" correctly
- Move migration check before the DuckLake semaphore so backup doesn't block other connections for up to 10 minutes
- Add fsync before closing backup file for crash safety
- Fix double-close on backup file using closed flag
- Add comment explaining sync.Once is correct for multitenant mode (each worker process serves one tenant)
- Add comment on []byte assumption in formatSQLValue
- Log backup path at INFO before starting (not just after)
- Add unit tests for buildDuckLakeAttachStmt, formatSQLValue, quoteIdent, versionLessThan, and duckLakeMigrationNeeded

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
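To illustrate why the string comparison had to go: as strings, "0.10" sorts before "0.4", so a 0.10 catalog would look older than 0.4. The sketch below compares dotted versions component by component as integers. The name versionLessThan is borrowed from the commit message, but the signature and the `parseVersion` helper are assumptions, not the PR's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a dotted version such as "0.10" into integer components.
func parseVersion(v string) ([]int, error) {
	parts := strings.Split(strings.TrimSpace(v), ".")
	out := make([]int, len(parts))
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return nil, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// versionLessThan reports whether a is numerically older than b.
// Missing trailing components count as 0, so "0.4" equals "0.4.0".
func versionLessThan(a, b string) (bool, error) {
	av, err := parseVersion(a)
	if err != nil {
		return false, err
	}
	bv, err := parseVersion(b)
	if err != nil {
		return false, err
	}
	for i := 0; i < len(av) || i < len(bv); i++ {
		var x, y int
		if i < len(av) {
			x = av[i]
		}
		if i < len(bv) {
			y = bv[i]
		}
		if x != y {
			return x < y, nil
		}
	}
	return false, nil
}
```

With this sketch, versionLessThan("0.3", "0.4") is true and versionLessThan("0.10", "0.4") is false, which is the case the plain string comparison got wrong.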
Handle return values from rows.Close(), fmt.Fprintf(), fmt.Fprintln(), and dataRows.Close() to satisfy the errcheck linter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
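The file-handling points from the last two commits (checked writes, fsync before close, and a closed flag to avoid a double close) can be combined in one sketch. `writeBackupFile` is a hypothetical helper that takes pre-rendered statements; the PR's real writer streams rows from PostgreSQL and may be structured differently.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// writeBackupFile (hypothetical) writes one SQL statement per line, then
// flushes, fsyncs, and closes the file, checking every returned error.
func writeBackupFile(path string, stmts []string) error {
	f, err := os.Create(path)
	if err != nil {
		return fmt.Errorf("create backup: %w", err)
	}
	closed := false
	defer func() {
		if !closed {
			// Error path only: release the descriptor; the earlier error wins.
			_ = f.Close()
		}
	}()

	w := bufio.NewWriter(f)
	for _, s := range stmts {
		if _, err := fmt.Fprintln(w, s); err != nil {
			return fmt.Errorf("write backup: %w", err)
		}
	}
	if err := w.Flush(); err != nil {
		return fmt.Errorf("flush backup: %w", err)
	}
	// fsync so the backup is on disk before any migration starts.
	if err := f.Sync(); err != nil {
		return fmt.Errorf("sync backup: %w", err)
	}
	closed = true // the deferred func must not close a second time
	if err := f.Close(); err != nil {
		return fmt.Errorf("close backup: %w", err)
	}
	return nil
}
```

The explicit Close on the success path matters because a close error can still signal lost data; the deferred close only covers early returns.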
Summary
- When the DuckLake spec version is older than expected (e.g. 0.3→0.4), dumps all `ducklake_*` tables to a SQL backup file, then attaches with `AUTOMATIC_MIGRATION TRUE`
- Centralizes ATTACH statement construction, previously duplicated in 3 places (`server.go`, `checkpoint.go`, `querylog.go`)

Context

DuckLake 0.4 (shipping with DuckDB 1.5.x) adds `SET SORTED BY` support, which we need for optimizing Parquet row-group pruning on large tables. The 0.3→0.4 migration is irreversible (drops columns, restructures schema_versions), so we need a backup before upgrading.

Since `pg_dump` is not installed on duckgres instances, the backup is implemented entirely in Go: it connects to the metadata PostgreSQL via pgx (already a dependency) and writes CREATE TABLE + INSERT statements.

Test plan

- `go build` compiles cleanly
- `go test ./server/...` passes
- `go vet` clean
- `AUTOMATIC_MIGRATION TRUE` successfully upgrades to 0.4 (requires DuckDB 1.5.x driver upgrade in a follow-up)

🤖 Generated with Claude Code
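As a rough sketch of the INSERT half of a pg_dump-free backup, the code below renders one row as a SQL statement with double-quoted identifiers and escaped literals. The names quoteIdent and formatSQLValue come from the commit messages, but their signatures here are assumptions, and `quoteLiteral` and `buildInsert` are hypothetical helpers. Treating []byte as text is also an assumption of this sketch. In the real backup the row values would presumably come from pgx (for example `rows.Values()`); here they are plain Go values so the example runs without a database.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// quoteIdent double-quotes an identifier so reserved words like "key" are safe.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// quoteLiteral single-quotes a string, doubling embedded quotes.
// Assumes PostgreSQL's standard_conforming_strings is on (backslashes are literal).
func quoteLiteral(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// formatSQLValue renders a Go value as a SQL literal.
// NaN and infinite floats are not handled in this sketch.
func formatSQLValue(v any) string {
	switch x := v.(type) {
	case nil:
		return "NULL"
	case bool:
		if x {
			return "TRUE"
		}
		return "FALSE"
	case int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64:
		return fmt.Sprintf("%d", x)
	case float32, float64:
		return fmt.Sprintf("%v", x)
	case []byte:
		// Assumption: byte slices hold text, not arbitrary binary data.
		return quoteLiteral(string(x))
	case time.Time:
		return quoteLiteral(x.UTC().Format(time.RFC3339Nano))
	case string:
		return quoteLiteral(x)
	default:
		return quoteLiteral(fmt.Sprint(x))
	}
}

// buildInsert (hypothetical) renders one row as a complete INSERT statement.
func buildInsert(table string, cols []string, row []any) string {
	quoted := make([]string, len(cols))
	for i, c := range cols {
		quoted[i] = quoteIdent(c)
	}
	vals := make([]string, len(row))
	for i, v := range row {
		vals[i] = formatSQLValue(v)
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s);",
		quoteIdent(table), strings.Join(quoted, ", "), strings.Join(vals, ", "))
}
```

For a ducklake_metadata row with columns key and value, `buildInsert("ducklake_metadata", []string{"key", "value"}, []any{"version", "0.3"})` yields `INSERT INTO "ducklake_metadata" ("key", "value") VALUES ('version', '0.3');`.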