feat(tidb): catalog TiDB extensions — fields, wiring, completion, tests (PR3b) by vsai12 · Pull Request #100 · bytebase/omni

vsai12 · 2026-04-20T19:46:01Z

Summary

Builds on PR3a (#99). Adds the TiDB-specific behavior on top of the mechanical fork: collation flip, new catalog fields, option wiring, completion keywords, walkthrough tests, TiDB container tests, and skip-annotations for four MySQL-only tests.

Based on: feat/tidb-catalog-scaffold (PR #99). Merge PR3a first, then rebase this onto main.

What's in the diff

Four logical commits:

6ee0fb4 — feat(tidb): add TiDB-specific catalog fields and walkthrough tests (326 +/-2)
- Collation default flipped to utf8mb4_bin
- Table/Column/Constraint schema extensions
- tablecmds.go + altercmds.go extract TiDB options; ATSetTiFlashReplica + ATRemoveTTL handlers
- Inline CLUSTERED PK hoisted onto synthesized table-level constraint (separate code path from table-level CLUSTERED)
- Public parser.ParseExpr + extractTTLParts AST walker for TTL (no string parsing)
- 9 walkthrough tests in wt_tidb_1_test.go (7 feature + 1 multi-feature interaction + 1 DATE_ADD regression)
c1cc0ca — test(tidb): annotate MySQL-specific tests with TiDB behavioral differences (5 lines)
- Only 4 tests broke on the collation flip: 3 in show_test.go, one 2-subtest case in wt_13_2_test.go
- All 4 annotated with t.Skip + a concrete reason
- Pre-triage grep predicted 30+ files; actual failure surface was 4. The MySQL-only DDL tests (DEFINER/SQL SECURITY/ALGORITHM/TRIGGER/FULLTEXT) all still pass because TiDB parser accepts MySQL syntax and the catalog handles it identically.
e32d182 — feat(tidb): add TiDB keywords to completion candidates (105/-3)
- 8 table-option keywords added at parseTableOption() collection point
- SET + REMOVE added at two ALTER TABLE collection points
- Pinned negative test: TiKV must NOT appear in ENGINE= completion
2e7718e — test(tidb): add catalog container tests against real TiDB v8.5.5 (219)
- 7 cross-validation cases running lockstep against pingcap/tidb:v8.5.5
- Skipped in -short; uses same sync.Once pattern as tidb/parser/tidb_container_test.go

Test plan

go build ./tidb/... clean
go vet ./tidb/... clean
go test ./tidb/... -short -count=1 — all passing
go test ./tidb/catalog/ -run TestWTTiDB -count=1 — 9/9 passing
go test ./tidb/completion/ -run TestTiDBKeywords -count=1 — 3/3 passing
go test ./tidb/catalog/ -run TestTiDBCatalogContainer -count=1 — 7/7 passing against real TiDB v8.5.5
Key TiDB cases in tidb/parser/tidb_container_test.go still pass
CI green on merge

Known issue (pre-existing, not a PR3b regression)

go test ./tidb/catalog/ -count=1 (full suite with containers, no -short) hangs at 10 minutes due to inherited MySQL container tests. The same hang reproduces on mysql/catalog on main (180s timeout hit with an identical test subset). This is a testcontainers-go / Docker resource issue when starting many containers per test binary. Workarounds:

Run with -short (fast unit tests only)
Run TiDB-specific cases in isolation: -run TestWTTiDB|TestTiDBCatalogContainer
Run individual inherited MySQL container tests in isolation (each passes cleanly on its own)

Not introduced by this PR — confirmed pre-existing on main.

Deliberately deferred (future PRs)

DB-level SET TIFLASH REPLICA (parser/catalog/completion/tests all excluded per plan)
CREATE/ALTER/DROP PLACEMENT POLICY DDL
FLASHBACK TABLE/DATABASE/CLUSTER
mysql/semantic/ fork
Diff + migration system (port from pg/catalog)

Plan reference

Full plan: plans/2026-04-10-omni-tidb-implementation.md — PR3 section (tasks 3.5-3.10).

🤖 Generated with Claude Code

PR3b core: adds TiDB-specific table/column/constraint metadata and wires the parser's TiDB options through to the catalog. Collation: - tidb/catalog/catalog.go New() default collation → utf8mb4_bin (TiDB v8.5 default; MySQL 8.0 uses utf8mb4_0900_ai_ci) - Hardcoded, no setter — YAGNI until a customer requests configurability Schema extensions: - Table: ShardRowIDBits, PreSplitRegions, AutoIDCache, AutoRandomBase, PlacementPolicy, TTLColumn/Interval/Enable/JobInterval, TiFlashReplica - Column: AutoRandom, AutoRandomShardBits, AutoRandomRangeBits - Constraint: Clustered *bool (nil=unset, true=CLUSTERED, false=NONCLUSTERED) Option wiring (tablecmds.go + altercmds.go): - CREATE TABLE + ALTER TABLE extract shard_row_id_bits, pre_split_regions, auto_id_cache, auto_random_base, placement policy, ttl, ttl_enable, ttl_job_interval - AUTO_RANDOM attribute populates column fields - CLUSTERED/NONCLUSTERED on table-level PK AND inline-PK path (hoisted from ColumnConstraint.Clustered onto synthesized Constraint.Clustered) - ALTER TABLE SET TIFLASH REPLICA sets Table.TiFlashReplica - ALTER TABLE REMOVE TTL clears all TTL fields TTL extraction: - Added public parser.ParseExpr for re-parsing the raw TTL expression string stored by PR2 on TableOption.Value - extractTTLParts walks the AST to extract (TTLColumn, TTLInterval), handling `<col> + INTERVAL <n> <unit>` and `DATE_ADD(<col>, INTERVAL <n> <unit>)` shapes; unrecognized shapes return a validation error rather than silently truncating Walkthrough tests (wt_tidb_1_test.go): - 1.1 AUTO_RANDOM + SHARD_ROW_ID_BITS - 1.2 PLACEMENT POLICY (table) - 1.3 TTL + TTL_ENABLE - 1.4 ALTER TABLE SET TIFLASH REPLICA - 1.5 ALTER TABLE REMOVE TTL (clears all TTL fields) - 1.6 table-level CLUSTERED PK - 1.6b inline CLUSTERED PK (hoist path — distinct code path) - 1.7 multi-feature interaction (catches cross-option state leakage) - 1.8 TTL with DATE_ADD shape (regression coverage for PR2's comma bug) All tests pass. go vet clean. MySQL-inherited tests still pass except for four SHOW-CREATE-TABLE-fidelity tests that assert literal utf8mb4_0900_ai_ci; those are annotated in a follow-up commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ences After flipping the catalog's default collation from utf8mb4_0900_ai_ci to utf8mb4_bin (TiDB's v8.5 default), four SHOW-CREATE-TABLE fidelity tests now fail because they assert exact MySQL-8.0 output. The behavioral difference is expected, not a bug in the fork. Annotated with t.Skip and a concrete reason: - show_test.go/TestShowCreateTableBasic — literal "utf8mb4_0900_ai_ci" - show_test.go/TestShowCreateTableDefaults — "varchar(50) DEFAULT ..." now shows per-column COLLATE utf8mb4_bin - show_test.go/TestShowCreateTableNotNullNoDefault — same root cause - wt_13_2_test.go (two subtests) — same root cause Future: a TiDB-specific SHOW CREATE TABLE fidelity test belongs in wt_tidb_*_test.go, asserting against TiDB's actual output format. Pre-triage (utf8mb4_0900_ai_ci + DEFINER/SQL SECURITY/ALGORITHM/TRIGGER/ FULLTEXT/SPATIAL) flagged far more files than actually broke — MySQL-only DDL tests all still pass on TiDB because the TiDB parser accepts MySQL syntax and the catalog handles it identically. Only the collation- dependent SHOW-CREATE fidelity assertions were sensitive to the flip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Extends the parser's completion-mode keyword emission so TiDB-specific options surface in the candidate list at the correct grammar positions. No new rule-to-resolver plumbing is needed — the keywords were already in the lexer (PR2), they just weren't being offered by the completion walker. Grammar positions updated in tidb/parser: - create_table.go parseTableOption() now emits SHARD_ROW_ID_BITS, PRE_SPLIT_REGIONS, AUTO_ID_CACHE, AUTO_RANDOM_BASE, TTL, TTL_ENABLE, TTL_JOB_INTERVAL, PLACEMENT alongside MySQL's ENGINE/DEFAULT/etc. - alter_table.go (two positions: ALTER TABLE command-list after the table name, and parseAlterTableCmd) now emits SET and REMOVE to anchor SET TIFLASH REPLICA and REMOVE TTL. Deliberately NOT added: - TiKV is NOT added to the ENGINE= candidate list. TiDB parses ENGINE= for MySQL compatibility but ignores the value; suggesting TiKV would mislead users. A pinned negative test enforces this. - AUTO_RANDOM is not added to the column_constraint candidate list. The existing column-constraint path surfaces AUTO_INCREMENT alongside the full set of column constraint tokens; AUTO_RANDOM would need a separate completion anchor and there's no column_constraint-level completion call site in the fork. Revisit if users report missing AUTO_RANDOM completion. - CLUSTERED/NONCLUSTERED on PK — no PK-modifier completion anchor exists. Same deferral reasoning. - charset/collation lists are untouched. MySQL's list is a superset of TiDB's practical set; removing entries risks breaking valid MySQL- compat DDL that TiDB accepts. Tests (tidb/completion/tidb_keywords_test.go): - TestTiDBKeywords_TableOptions — at least one TiDB table option appears after a CREATE TABLE column list - TestTiDBKeywords_AlterTable — SET or REMOVE appears in ALTER TABLE command position - TestTiDBKeywords_NoTiKVEngine — pinned negative test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Cross-validates that omni's catalog and a real TiDB instance accept (or reject) the same TiDB-specific DDL in lockstep. Per-case flow: 1. Run optional setup SQL on TiDB. 2. Run the test SQL on TiDB — if TiDB rejects, fail. 3. Run setup+test on a fresh omni Catalog via Exec() — if catalog rejects, fail. 4. Clean up on TiDB. Lockstep over information_schema comparison: deep property checks belong in a future diff/migration PR. This test's goal is to catch drift between parser/catalog acceptance and real TiDB acceptance — which is exactly what the first run caught: three of the original cases combined options that TiDB actually rejects. Cases covered: - auto_random on PK - shard_row_id_bits (no explicit PK — TiDB rejects combo w/ PK-as-row-id) - ttl create - tiflash replica 0 (replica>0 requires TiFlash servers, unhermetic) - remove ttl alter - clustered pk - multi-feature interaction Deferred: - CREATE PLACEMENT POLICY DDL — catalog does not model policies as first-class objects yet; the test skips when catalog setup fails on unmodeled DDL rather than hard-coding a behavior difference. Skipped in `-short` mode. Uses the same sync.Once testcontainers pattern as tidb/parser/tidb_container_test.go. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…opy Clustered Follow-up fixes from peer review of PR3b. P0 — DB-level PLACEMENT POLICY was silently dropped end-to-end despite being in the PR3 scope (handoff line 57, plan Task 3.6 Step 2). Wired at three layers: - tidb/catalog/database.go: Database.PlacementPolicy field - tidb/catalog/dbcmds.go: createDatabase + alterDatabase recognize the "placement policy" option - tidb/parser/create_database.go: completion anchor added to parseDatabaseOption() so PLACEMENT shows up as a candidate inside CREATE/ALTER DATABASE option lists Coverage: TestWTTiDB_2_1_DatabasePlacementPolicy + a completion test TestTiDBKeywords_CreateDatabasePlacement. P1a — Constraint.Clustered *bool was shallow-copied in cloneTable, leaving the rollback path in execAlterTable (multi-command ALTER) vulnerable to future write-through-pointer mutations leaking between the snapshot and live table. Pattern-matched to the existing Column.Default and Column.Generated deep-copy handling in the same function. Today's code never writes *con.Clustered = ..., so the leak was theoretical — but the asymmetry with the other *-typed fields was a footgun for future contributors. P1b — ALTER TABLE ADD PRIMARY KEY (...) CLUSTERED silently dropped the flag at both sites: - altercmds.go:179-185 (inline column ADD COLUMN ... PRIMARY KEY CLUSTERED) — now hoists cc.Clustered onto the synthesized table- level constraint, mirroring the CREATE path in tablecmds.go - altercmds.go:408-414 (table-level ADD PRIMARY KEY (...) CLUSTERED) — now propagates con.Clustered, mirroring the CREATE path Coverage: TestWTTiDB_2_2_AlterAddPKClustered, TestWTTiDB_2_2b_AlterAddPKNonClustered (tests the &false value, not just &true), TestWTTiDB_2_3_AlterAddColumnInlinePKClustered. Verification: - go build ./tidb/... clean - go vet ./tidb/... clean - go test ./tidb/... -short -count=1 all passing - go test ./tidb/catalog/ -run TestTiDBCatalogContainer -count=1 still passes against real TiDB v8.5.5 (7/7 cases) Not fixed in this commit (P2 — defer to follow-up PRs): - AUTO_RANDOM missing from column-constraint completion anchor - CLUSTERED/NONCLUSTERED missing from PK-modifier completion anchor - Lenient validation for AUTO_RANDOM on non-BIGINT, TTL_ENABLE accepting garbage values, REMOVE TTL on no-TTL table (TiDB itself validates at execution time; catalog leniency is consistent with the MySQL catalog's MVP posture) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nt panic TTL option capture at create_table.go:1407 sliced p.lexer.input directly with absolute Loc bounds. When parseSingle runs a non-leading segment of a multi-statement input (baseOffset > 0), the bounds exceed the segment-local lexer.input length and the runtime panics: runtime error: slice bounds out of range [:105] with length 80 Any migration script of the form CREATE DATABASE d; CREATE TABLE t (...) TTL = col + INTERVAL 1 YEAR; crashed the parser. Fix: replace the raw slice with p.inputText(exprStart, exprEnd). That helper already subtracts baseOffset before indexing, which is why every other body-capture site in the parser uses it (create_view, trigger, alter_misc, create_function). The TTL site was the sole outlier. Why we missed it: - All TTL walkthrough and container tests feed a single statement, so baseOffset is always 0 and the raw slice is fine. - The container cross-validation sends SQL through TiDB in one go, not through omni's multi-statement splitter. Regression test: tidb_ttl_multistmt_test.go parses CREATE DATABASE dummy; CREATE TABLE t (...) TTL = ...; and asserts the TTL option captures the expression substring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vsai12 and others added 5 commits April 20, 2026 12:13

vsai12 requested review from rebelice April 20, 2026 22:53

rebelice approved these changes Apr 21, 2026

View reviewed changes

vsai12 merged commit 082dc39 into feat/tidb-catalog-scaffold Apr 21, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(tidb): catalog TiDB extensions — fields, wiring, completion, tests (PR3b)#100

feat(tidb): catalog TiDB extensions — fields, wiring, completion, tests (PR3b)#100
vsai12 merged 6 commits intofeat/tidb-catalog-scaffoldfrom
feat/tidb-catalog-extensions

vsai12 commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

vsai12 commented Apr 20, 2026

Summary

What's in the diff

Test plan

Known issue (pre-existing, not a PR3b regression)

Deliberately deferred (future PRs)

Plan reference

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants