feat(tidb): catalog TiDB extensions — fields, wiring, completion, tests (PR3b)#100
Merged
vsai12 merged 6 commits intofeat/tidb-catalog-scaffoldfrom Apr 21, 2026
Merged
Conversation
PR3b core: adds TiDB-specific table/column/constraint metadata and wires the parser's TiDB options through to the catalog. Collation: - tidb/catalog/catalog.go New() default collation → utf8mb4_bin (TiDB v8.5 default; MySQL 8.0 uses utf8mb4_0900_ai_ci) - Hardcoded, no setter — YAGNI until a customer requests configurability Schema extensions: - Table: ShardRowIDBits, PreSplitRegions, AutoIDCache, AutoRandomBase, PlacementPolicy, TTLColumn/Interval/Enable/JobInterval, TiFlashReplica - Column: AutoRandom, AutoRandomShardBits, AutoRandomRangeBits - Constraint: Clustered *bool (nil=unset, true=CLUSTERED, false=NONCLUSTERED) Option wiring (tablecmds.go + altercmds.go): - CREATE TABLE + ALTER TABLE extract shard_row_id_bits, pre_split_regions, auto_id_cache, auto_random_base, placement policy, ttl, ttl_enable, ttl_job_interval - AUTO_RANDOM attribute populates column fields - CLUSTERED/NONCLUSTERED on table-level PK AND inline-PK path (hoisted from ColumnConstraint.Clustered onto synthesized Constraint.Clustered) - ALTER TABLE SET TIFLASH REPLICA sets Table.TiFlashReplica - ALTER TABLE REMOVE TTL clears all TTL fields TTL extraction: - Added public parser.ParseExpr for re-parsing the raw TTL expression string stored by PR2 on TableOption.Value - extractTTLParts walks the AST to extract (TTLColumn, TTLInterval), handling `<col> + INTERVAL <n> <unit>` and `DATE_ADD(<col>, INTERVAL <n> <unit>)` shapes; unrecognized shapes return a validation error rather than silently truncating Walkthrough tests (wt_tidb_1_test.go): - 1.1 AUTO_RANDOM + SHARD_ROW_ID_BITS - 1.2 PLACEMENT POLICY (table) - 1.3 TTL + TTL_ENABLE - 1.4 ALTER TABLE SET TIFLASH REPLICA - 1.5 ALTER TABLE REMOVE TTL (clears all TTL fields) - 1.6 table-level CLUSTERED PK - 1.6b inline CLUSTERED PK (hoist path — distinct code path) - 1.7 multi-feature interaction (catches cross-option state leakage) - 1.8 TTL with DATE_ADD shape (regression coverage for PR2's comma bug) All tests pass. go vet clean. MySQL-inherited tests still pass except for four SHOW-CREATE-TABLE-fidelity tests that assert literal utf8mb4_0900_ai_ci; those are annotated in a follow-up commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ences After flipping the catalog's default collation from utf8mb4_0900_ai_ci to utf8mb4_bin (TiDB's v8.5 default), four SHOW-CREATE-TABLE fidelity tests now fail because they assert exact MySQL-8.0 output. The behavioral difference is expected, not a bug in the fork. Annotated with t.Skip and a concrete reason: - show_test.go/TestShowCreateTableBasic — literal "utf8mb4_0900_ai_ci" - show_test.go/TestShowCreateTableDefaults — "varchar(50) DEFAULT ..." now shows per-column COLLATE utf8mb4_bin - show_test.go/TestShowCreateTableNotNullNoDefault — same root cause - wt_13_2_test.go (two subtests) — same root cause Future: a TiDB-specific SHOW CREATE TABLE fidelity test belongs in wt_tidb_*_test.go, asserting against TiDB's actual output format. Pre-triage (utf8mb4_0900_ai_ci + DEFINER/SQL SECURITY/ALGORITHM/TRIGGER/ FULLTEXT/SPATIAL) flagged far more files than actually broke — MySQL-only DDL tests all still pass on TiDB because the TiDB parser accepts MySQL syntax and the catalog handles it identically. Only the collation- dependent SHOW-CREATE fidelity assertions were sensitive to the flip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the parser's completion-mode keyword emission so TiDB-specific options surface in the candidate list at the correct grammar positions. No new rule-to-resolver plumbing is needed — the keywords were already in the lexer (PR2), they just weren't being offered by the completion walker. Grammar positions updated in tidb/parser: - create_table.go parseTableOption() now emits SHARD_ROW_ID_BITS, PRE_SPLIT_REGIONS, AUTO_ID_CACHE, AUTO_RANDOM_BASE, TTL, TTL_ENABLE, TTL_JOB_INTERVAL, PLACEMENT alongside MySQL's ENGINE/DEFAULT/etc. - alter_table.go (two positions: ALTER TABLE command-list after the table name, and parseAlterTableCmd) now emits SET and REMOVE to anchor SET TIFLASH REPLICA and REMOVE TTL. Deliberately NOT added: - TiKV is NOT added to the ENGINE= candidate list. TiDB parses ENGINE= for MySQL compatibility but ignores the value; suggesting TiKV would mislead users. A pinned negative test enforces this. - AUTO_RANDOM is not added to the column_constraint candidate list. The existing column-constraint path surfaces AUTO_INCREMENT alongside the full set of column constraint tokens; AUTO_RANDOM would need a separate completion anchor and there's no column_constraint-level completion call site in the fork. Revisit if users report missing AUTO_RANDOM completion. - CLUSTERED/NONCLUSTERED on PK — no PK-modifier completion anchor exists. Same deferral reasoning. - charset/collation lists are untouched. MySQL's list is a superset of TiDB's practical set; removing entries risks breaking valid MySQL- compat DDL that TiDB accepts. Tests (tidb/completion/tidb_keywords_test.go): - TestTiDBKeywords_TableOptions — at least one TiDB table option appears after a CREATE TABLE column list - TestTiDBKeywords_AlterTable — SET or REMOVE appears in ALTER TABLE command position - TestTiDBKeywords_NoTiKVEngine — pinned negative test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-validates that omni's catalog and a real TiDB instance accept (or reject) the same TiDB-specific DDL in lockstep. Per-case flow: 1. Run optional setup SQL on TiDB. 2. Run the test SQL on TiDB — if TiDB rejects, fail. 3. Run setup+test on a fresh omni Catalog via Exec() — if catalog rejects, fail. 4. Clean up on TiDB. Lockstep over information_schema comparison: deep property checks belong in a future diff/migration PR. This test's goal is to catch drift between parser/catalog acceptance and real TiDB acceptance — which is exactly what the first run caught: three of the original cases combined options that TiDB actually rejects. Cases covered: - auto_random on PK - shard_row_id_bits (no explicit PK — TiDB rejects combo w/ PK-as-row-id) - ttl create - tiflash replica 0 (replica>0 requires TiFlash servers, unhermetic) - remove ttl alter - clustered pk - multi-feature interaction Deferred: - CREATE PLACEMENT POLICY DDL — catalog does not model policies as first-class objects yet; the test skips when catalog setup fails on unmodeled DDL rather than hard-coding a behavior difference. Skipped in `-short` mode. Uses the same sync.Once testcontainers pattern as tidb/parser/tidb_container_test.go. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…opy Clustered
Follow-up fixes from peer review of PR3b.
P0 — DB-level PLACEMENT POLICY was silently dropped end-to-end despite
being in the PR3 scope (handoff line 57, plan Task 3.6 Step 2). Wired
at three layers:
- tidb/catalog/database.go: Database.PlacementPolicy field
- tidb/catalog/dbcmds.go: createDatabase + alterDatabase recognize
the "placement policy" option
- tidb/parser/create_database.go: completion anchor added to
parseDatabaseOption() so PLACEMENT shows
up as a candidate inside CREATE/ALTER
DATABASE option lists
Coverage: TestWTTiDB_2_1_DatabasePlacementPolicy + a completion test
TestTiDBKeywords_CreateDatabasePlacement.
P1a — Constraint.Clustered *bool was shallow-copied in cloneTable,
leaving the rollback path in execAlterTable (multi-command ALTER)
vulnerable to future write-through-pointer mutations leaking between
the snapshot and live table. Pattern-matched to the existing
Column.Default and Column.Generated deep-copy handling in the same
function. Today's code never writes *con.Clustered = ..., so the
leak was theoretical — but the asymmetry with the other *-typed
fields was a footgun for future contributors.
P1b — ALTER TABLE ADD PRIMARY KEY (...) CLUSTERED silently dropped
the flag at both sites:
- altercmds.go:179-185 (inline column ADD COLUMN ... PRIMARY KEY
CLUSTERED) — now hoists cc.Clustered onto the synthesized table-
level constraint, mirroring the CREATE path in tablecmds.go
- altercmds.go:408-414 (table-level ADD PRIMARY KEY (...) CLUSTERED)
— now propagates con.Clustered, mirroring the CREATE path
Coverage: TestWTTiDB_2_2_AlterAddPKClustered,
TestWTTiDB_2_2b_AlterAddPKNonClustered (tests the &false value, not
just &true), TestWTTiDB_2_3_AlterAddColumnInlinePKClustered.
Verification:
- go build ./tidb/... clean
- go vet ./tidb/... clean
- go test ./tidb/... -short -count=1 all passing
- go test ./tidb/catalog/ -run TestTiDBCatalogContainer -count=1
still passes against real TiDB v8.5.5 (7/7 cases)
Not fixed in this commit (P2 — defer to follow-up PRs):
- AUTO_RANDOM missing from column-constraint completion anchor
- CLUSTERED/NONCLUSTERED missing from PK-modifier completion anchor
- Lenient validation for AUTO_RANDOM on non-BIGINT, TTL_ENABLE
accepting garbage values, REMOVE TTL on no-TTL table (TiDB itself
validates at execution time; catalog leniency is consistent with
the MySQL catalog's MVP posture)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nt panic
TTL option capture at create_table.go:1407 sliced p.lexer.input
directly with absolute Loc bounds. When parseSingle runs a non-leading
segment of a multi-statement input (baseOffset > 0), the bounds exceed
the segment-local lexer.input length and the runtime panics:
runtime error: slice bounds out of range [:105] with length 80
Any migration script of the form
CREATE DATABASE d; CREATE TABLE t (...) TTL = col + INTERVAL 1 YEAR;
crashed the parser.
Fix: replace the raw slice with p.inputText(exprStart, exprEnd). That
helper already subtracts baseOffset before indexing, which is why
every other body-capture site in the parser uses it (create_view,
trigger, alter_misc, create_function). The TTL site was the sole
outlier.
Why we missed it:
- All TTL walkthrough and container tests feed a single statement,
so baseOffset is always 0 and the raw slice is fine.
- The container cross-validation sends SQL through TiDB in one go,
not through omni's multi-statement splitter.
Regression test: tidb_ttl_multistmt_test.go parses
CREATE DATABASE dummy; CREATE TABLE t (...) TTL = ...;
and asserts the TTL option captures the expression substring.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rebelice
approved these changes
Apr 21, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Builds on PR3a (#99). Adds the TiDB-specific behavior on top of the mechanical fork: collation flip, new catalog fields, option wiring, completion keywords, walkthrough tests, TiDB container tests, and skip-annotations for four MySQL-only tests.
Based on:
feat/tidb-catalog-scaffold(PR #99). Merge PR3a first, then rebase this onto main.What's in the diff
Four logical commits:
6ee0fb4— feat(tidb): add TiDB-specific catalog fields and walkthrough tests (326 +/-2)utf8mb4_bintablecmds.go+altercmds.goextract TiDB options; ATSetTiFlashReplica + ATRemoveTTL handlersparser.ParseExpr+extractTTLPartsAST walker for TTL (no string parsing)wt_tidb_1_test.go(7 feature + 1 multi-feature interaction + 1 DATE_ADD regression)c1cc0ca— test(tidb): annotate MySQL-specific tests with TiDB behavioral differences (5 lines)show_test.go, one 2-subtest case inwt_13_2_test.got.Skip+ a concrete reasone32d182— feat(tidb): add TiDB keywords to completion candidates (105/-3)parseTableOption()collection point2e7718e— test(tidb): add catalog container tests against real TiDB v8.5.5 (219)pingcap/tidb:v8.5.5-short; uses samesync.Oncepattern astidb/parser/tidb_container_test.goTest plan
go build ./tidb/...cleango vet ./tidb/...cleango test ./tidb/... -short -count=1— all passinggo test ./tidb/catalog/ -run TestWTTiDB -count=1— 9/9 passinggo test ./tidb/completion/ -run TestTiDBKeywords -count=1— 3/3 passinggo test ./tidb/catalog/ -run TestTiDBCatalogContainer -count=1— 7/7 passing against real TiDB v8.5.5tidb/parser/tidb_container_test.gostill passKnown issue (pre-existing, not a PR3b regression)
go test ./tidb/catalog/ -count=1(full suite with containers, no-short) hangs at 10 minutes due to inherited MySQL container tests. The same hang reproduces onmysql/catalogon main (180s timeout hit with an identical test subset). This is a testcontainers-go / Docker resource issue when starting many containers per test binary. Workarounds:-short(fast unit tests only)-run TestWTTiDB|TestTiDBCatalogContainerNot introduced by this PR — confirmed pre-existing on
main.Deliberately deferred (future PRs)
mysql/semantic/forkPlan reference
Full plan:
plans/2026-04-10-omni-tidb-implementation.md— PR3 section (tasks 3.5-3.10).🤖 Generated with Claude Code