Skip to content

feat(tidb): catalog TiDB extensions — fields, wiring, completion, tests (PR3b)#100

Merged
vsai12 merged 6 commits intofeat/tidb-catalog-scaffoldfrom
feat/tidb-catalog-extensions
Apr 21, 2026
Merged

feat(tidb): catalog TiDB extensions — fields, wiring, completion, tests (PR3b)#100
vsai12 merged 6 commits intofeat/tidb-catalog-scaffoldfrom
feat/tidb-catalog-extensions

Conversation

@vsai12
Copy link
Copy Markdown
Contributor

@vsai12 vsai12 commented Apr 20, 2026

Summary

Builds on PR3a (#99). Adds the TiDB-specific behavior on top of the mechanical fork: collation flip, new catalog fields, option wiring, completion keywords, walkthrough tests, TiDB container tests, and skip-annotations for four MySQL-only tests.

Based on: feat/tidb-catalog-scaffold (PR #99). Merge PR3a first, then rebase this onto main.

What's in the diff

Four logical commits:

  1. 6ee0fb4feat(tidb): add TiDB-specific catalog fields and walkthrough tests (326 +/-2)

    • Collation default flipped to utf8mb4_bin
    • Table/Column/Constraint schema extensions
    • tablecmds.go + altercmds.go extract TiDB options; ATSetTiFlashReplica + ATRemoveTTL handlers
    • Inline CLUSTERED PK hoisted onto synthesized table-level constraint (separate code path from table-level CLUSTERED)
    • Public parser.ParseExpr + extractTTLParts AST walker for TTL (no string parsing)
    • 9 walkthrough tests in wt_tidb_1_test.go (7 feature + 1 multi-feature interaction + 1 DATE_ADD regression)
  2. c1cc0catest(tidb): annotate MySQL-specific tests with TiDB behavioral differences (5 lines)

    • Only 4 tests broke on the collation flip: 3 in show_test.go, one 2-subtest case in wt_13_2_test.go
    • All 4 annotated with t.Skip + a concrete reason
    • Pre-triage grep predicted 30+ files; actual failure surface was 4. The MySQL-only DDL tests (DEFINER/SQL SECURITY/ALGORITHM/TRIGGER/FULLTEXT) all still pass because TiDB parser accepts MySQL syntax and the catalog handles it identically.
  3. e32d182feat(tidb): add TiDB keywords to completion candidates (105/-3)

    • 8 table-option keywords added at parseTableOption() collection point
    • SET + REMOVE added at two ALTER TABLE collection points
    • Pinned negative test: TiKV must NOT appear in ENGINE= completion
  4. 2e7718etest(tidb): add catalog container tests against real TiDB v8.5.5 (219)

    • 7 cross-validation cases running lockstep against pingcap/tidb:v8.5.5
    • Skipped in -short; uses same sync.Once pattern as tidb/parser/tidb_container_test.go

Test plan

  • go build ./tidb/... clean
  • go vet ./tidb/... clean
  • go test ./tidb/... -short -count=1 — all passing
  • go test ./tidb/catalog/ -run TestWTTiDB -count=1 — 9/9 passing
  • go test ./tidb/completion/ -run TestTiDBKeywords -count=1 — 3/3 passing
  • go test ./tidb/catalog/ -run TestTiDBCatalogContainer -count=1 — 7/7 passing against real TiDB v8.5.5
  • Key TiDB cases in tidb/parser/tidb_container_test.go still pass
  • CI green on merge

Known issue (pre-existing, not a PR3b regression)

go test ./tidb/catalog/ -count=1 (full suite with containers, no -short) hangs at 10 minutes due to inherited MySQL container tests. The same hang reproduces on mysql/catalog on main (180s timeout hit with an identical test subset). This is a testcontainers-go / Docker resource issue when starting many containers per test binary. Workarounds:

  • Run with -short (fast unit tests only)
  • Run TiDB-specific cases in isolation: -run TestWTTiDB|TestTiDBCatalogContainer
  • Run individual inherited MySQL container tests in isolation (each passes cleanly on its own)

Not introduced by this PR — confirmed pre-existing on main.

Deliberately deferred (future PRs)

  • DB-level SET TIFLASH REPLICA (parser/catalog/completion/tests all excluded per plan)
  • CREATE/ALTER/DROP PLACEMENT POLICY DDL
  • FLASHBACK TABLE/DATABASE/CLUSTER
  • mysql/semantic/ fork
  • Diff + migration system (port from pg/catalog)

Plan reference

Full plan: plans/2026-04-10-omni-tidb-implementation.md — PR3 section (tasks 3.5-3.10).

🤖 Generated with Claude Code

vsai12 and others added 5 commits April 20, 2026 12:13
PR3b core: adds TiDB-specific table/column/constraint metadata and wires
the parser's TiDB options through to the catalog.

Collation:
- tidb/catalog/catalog.go New() default collation → utf8mb4_bin
  (TiDB v8.5 default; MySQL 8.0 uses utf8mb4_0900_ai_ci)
- Hardcoded, no setter — YAGNI until a customer requests configurability

Schema extensions:
- Table: ShardRowIDBits, PreSplitRegions, AutoIDCache, AutoRandomBase,
  PlacementPolicy, TTLColumn/Interval/Enable/JobInterval, TiFlashReplica
- Column: AutoRandom, AutoRandomShardBits, AutoRandomRangeBits
- Constraint: Clustered *bool (nil=unset, true=CLUSTERED, false=NONCLUSTERED)

Option wiring (tablecmds.go + altercmds.go):
- CREATE TABLE + ALTER TABLE extract shard_row_id_bits, pre_split_regions,
  auto_id_cache, auto_random_base, placement policy, ttl, ttl_enable,
  ttl_job_interval
- AUTO_RANDOM attribute populates column fields
- CLUSTERED/NONCLUSTERED on table-level PK AND inline-PK path (hoisted
  from ColumnConstraint.Clustered onto synthesized Constraint.Clustered)
- ALTER TABLE SET TIFLASH REPLICA sets Table.TiFlashReplica
- ALTER TABLE REMOVE TTL clears all TTL fields

TTL extraction:
- Added public parser.ParseExpr for re-parsing the raw TTL expression
  string stored by PR2 on TableOption.Value
- extractTTLParts walks the AST to extract (TTLColumn, TTLInterval),
  handling `<col> + INTERVAL <n> <unit>` and `DATE_ADD(<col>, INTERVAL
  <n> <unit>)` shapes; unrecognized shapes return a validation error
  rather than silently truncating

Walkthrough tests (wt_tidb_1_test.go):
- 1.1 AUTO_RANDOM + SHARD_ROW_ID_BITS
- 1.2 PLACEMENT POLICY (table)
- 1.3 TTL + TTL_ENABLE
- 1.4 ALTER TABLE SET TIFLASH REPLICA
- 1.5 ALTER TABLE REMOVE TTL (clears all TTL fields)
- 1.6  table-level CLUSTERED PK
- 1.6b inline CLUSTERED PK (hoist path — distinct code path)
- 1.7  multi-feature interaction (catches cross-option state leakage)
- 1.8  TTL with DATE_ADD shape (regression coverage for PR2's comma bug)

All tests pass. go vet clean. MySQL-inherited tests still pass except for
four SHOW-CREATE-TABLE-fidelity tests that assert literal utf8mb4_0900_ai_ci;
those are annotated in a follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ences

After flipping the catalog's default collation from utf8mb4_0900_ai_ci to
utf8mb4_bin (TiDB's v8.5 default), four SHOW-CREATE-TABLE fidelity tests
now fail because they assert exact MySQL-8.0 output. The behavioral
difference is expected, not a bug in the fork.

Annotated with t.Skip and a concrete reason:
- show_test.go/TestShowCreateTableBasic — literal "utf8mb4_0900_ai_ci"
- show_test.go/TestShowCreateTableDefaults — "varchar(50) DEFAULT ..."
  now shows per-column COLLATE utf8mb4_bin
- show_test.go/TestShowCreateTableNotNullNoDefault — same root cause
- wt_13_2_test.go (two subtests) — same root cause

Future: a TiDB-specific SHOW CREATE TABLE fidelity test belongs in
wt_tidb_*_test.go, asserting against TiDB's actual output format.

Pre-triage (utf8mb4_0900_ai_ci + DEFINER/SQL SECURITY/ALGORITHM/TRIGGER/
FULLTEXT/SPATIAL) flagged far more files than actually broke — MySQL-only
DDL tests all still pass on TiDB because the TiDB parser accepts MySQL
syntax and the catalog handles it identically. Only the collation-
dependent SHOW-CREATE fidelity assertions were sensitive to the flip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the parser's completion-mode keyword emission so TiDB-specific
options surface in the candidate list at the correct grammar positions.
No new rule-to-resolver plumbing is needed — the keywords were already
in the lexer (PR2), they just weren't being offered by the completion
walker.

Grammar positions updated in tidb/parser:

- create_table.go parseTableOption() now emits SHARD_ROW_ID_BITS,
  PRE_SPLIT_REGIONS, AUTO_ID_CACHE, AUTO_RANDOM_BASE, TTL, TTL_ENABLE,
  TTL_JOB_INTERVAL, PLACEMENT alongside MySQL's ENGINE/DEFAULT/etc.
- alter_table.go (two positions: ALTER TABLE command-list after the
  table name, and parseAlterTableCmd) now emits SET and REMOVE to
  anchor SET TIFLASH REPLICA and REMOVE TTL.

Deliberately NOT added:

- TiKV is NOT added to the ENGINE= candidate list. TiDB parses ENGINE=
  for MySQL compatibility but ignores the value; suggesting TiKV would
  mislead users. A pinned negative test enforces this.
- AUTO_RANDOM is not added to the column_constraint candidate list.
  The existing column-constraint path surfaces AUTO_INCREMENT alongside
  the full set of column constraint tokens; AUTO_RANDOM would need a
  separate completion anchor and there's no column_constraint-level
  completion call site in the fork. Revisit if users report missing
  AUTO_RANDOM completion.
- CLUSTERED/NONCLUSTERED on PK — no PK-modifier completion anchor
  exists. Same deferral reasoning.
- charset/collation lists are untouched. MySQL's list is a superset of
  TiDB's practical set; removing entries risks breaking valid MySQL-
  compat DDL that TiDB accepts.

Tests (tidb/completion/tidb_keywords_test.go):
- TestTiDBKeywords_TableOptions — at least one TiDB table option
  appears after a CREATE TABLE column list
- TestTiDBKeywords_AlterTable — SET or REMOVE appears in ALTER TABLE
  command position
- TestTiDBKeywords_NoTiKVEngine — pinned negative test

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-validates that omni's catalog and a real TiDB instance accept (or
reject) the same TiDB-specific DDL in lockstep. Per-case flow:

1. Run optional setup SQL on TiDB.
2. Run the test SQL on TiDB — if TiDB rejects, fail.
3. Run setup+test on a fresh omni Catalog via Exec() — if catalog
   rejects, fail.
4. Clean up on TiDB.

Lockstep over information_schema comparison: deep property checks
belong in a future diff/migration PR. This test's goal is to catch
drift between parser/catalog acceptance and real TiDB acceptance —
which is exactly what the first run caught: three of the original
cases combined options that TiDB actually rejects.

Cases covered:
- auto_random on PK
- shard_row_id_bits (no explicit PK — TiDB rejects combo w/ PK-as-row-id)
- ttl create
- tiflash replica 0 (replica>0 requires TiFlash servers, unhermetic)
- remove ttl alter
- clustered pk
- multi-feature interaction

Deferred:
- CREATE PLACEMENT POLICY DDL — catalog does not model policies as
  first-class objects yet; the test skips when catalog setup fails
  on unmodeled DDL rather than hard-coding a behavior difference.

Skipped in `-short` mode. Uses the same sync.Once testcontainers
pattern as tidb/parser/tidb_container_test.go.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…opy Clustered

Follow-up fixes from peer review of PR3b.

P0 — DB-level PLACEMENT POLICY was silently dropped end-to-end despite
being in the PR3 scope (handoff line 57, plan Task 3.6 Step 2). Wired
at three layers:

- tidb/catalog/database.go: Database.PlacementPolicy field
- tidb/catalog/dbcmds.go:   createDatabase + alterDatabase recognize
                            the "placement policy" option
- tidb/parser/create_database.go: completion anchor added to
                            parseDatabaseOption() so PLACEMENT shows
                            up as a candidate inside CREATE/ALTER
                            DATABASE option lists

Coverage: TestWTTiDB_2_1_DatabasePlacementPolicy + a completion test
TestTiDBKeywords_CreateDatabasePlacement.

P1a — Constraint.Clustered *bool was shallow-copied in cloneTable,
leaving the rollback path in execAlterTable (multi-command ALTER)
vulnerable to future write-through-pointer mutations leaking between
the snapshot and live table. Pattern-matched to the existing
Column.Default and Column.Generated deep-copy handling in the same
function. Today's code never writes *con.Clustered = ..., so the
leak was theoretical — but the asymmetry with the other *-typed
fields was a footgun for future contributors.

P1b — ALTER TABLE ADD PRIMARY KEY (...) CLUSTERED silently dropped
the flag at both sites:

- altercmds.go:179-185 (inline column ADD COLUMN ... PRIMARY KEY
  CLUSTERED) — now hoists cc.Clustered onto the synthesized table-
  level constraint, mirroring the CREATE path in tablecmds.go
- altercmds.go:408-414 (table-level ADD PRIMARY KEY (...) CLUSTERED)
  — now propagates con.Clustered, mirroring the CREATE path

Coverage: TestWTTiDB_2_2_AlterAddPKClustered,
TestWTTiDB_2_2b_AlterAddPKNonClustered (tests the &false value, not
just &true), TestWTTiDB_2_3_AlterAddColumnInlinePKClustered.

Verification:
- go build ./tidb/... clean
- go vet ./tidb/... clean
- go test ./tidb/... -short -count=1 all passing
- go test ./tidb/catalog/ -run TestTiDBCatalogContainer -count=1
  still passes against real TiDB v8.5.5 (7/7 cases)

Not fixed in this commit (P2 — defer to follow-up PRs):
- AUTO_RANDOM missing from column-constraint completion anchor
- CLUSTERED/NONCLUSTERED missing from PK-modifier completion anchor
- Lenient validation for AUTO_RANDOM on non-BIGINT, TTL_ENABLE
  accepting garbage values, REMOVE TTL on no-TTL table (TiDB itself
  validates at execution time; catalog leniency is consistent with
  the MySQL catalog's MVP posture)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vsai12 vsai12 requested review from rebelice April 20, 2026 22:53
…nt panic

TTL option capture at create_table.go:1407 sliced p.lexer.input
directly with absolute Loc bounds. When parseSingle runs a non-leading
segment of a multi-statement input (baseOffset > 0), the bounds exceed
the segment-local lexer.input length and the runtime panics:

    runtime error: slice bounds out of range [:105] with length 80

Any migration script of the form
  CREATE DATABASE d; CREATE TABLE t (...) TTL = col + INTERVAL 1 YEAR;
crashed the parser.

Fix: replace the raw slice with p.inputText(exprStart, exprEnd). That
helper already subtracts baseOffset before indexing, which is why
every other body-capture site in the parser uses it (create_view,
trigger, alter_misc, create_function). The TTL site was the sole
outlier.

Why we missed it:
- All TTL walkthrough and container tests feed a single statement,
  so baseOffset is always 0 and the raw slice is fine.
- The container cross-validation sends SQL through TiDB in one go,
  not through omni's multi-statement splitter.

Regression test: tidb_ttl_multistmt_test.go parses
  CREATE DATABASE dummy; CREATE TABLE t (...) TTL = ...;
and asserts the TTL option captures the expression substring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vsai12 vsai12 merged commit 082dc39 into feat/tidb-catalog-scaffold Apr 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants