Worktree semantic checks #66

Merged
cs01 merged 9 commits into main from worktree-semantic-checks
Feb 26, 2026

cs01 commented Feb 26, 2026

Semantic analysis passes and native codegen hardening

Summary

Adds pre-codegen semantic analysis passes that catch errors that would otherwise compile to silently wrong native code, and fixes latent codegen bugs that caused memory corruption and LLVM UB-driven code pruning.

Changes

New: Semantic analysis passes (src/semantic/)

  • Closure mutation checker (closure-mutation-checker.ts) — ChadScript closures capture by value. Mutating a captured variable after the closure is created produces silently wrong results at runtime (the closure sees the old value). This pass walks the AST, tracks which variables are captured by arrow functions, and emits a compile error if any are reassigned after capture.

  • Union type checker (union-type-checker.ts) — The existing inline union check catches param: string | number but misses type alias unions like type Mixed = string | number. When a type alias union has members with different LLVM representations (e.g., i8* vs double), codegen emits the alias name literally as the LLVM param type, defaulting to i8* — causing a segfault if the caller passes a double. This pass resolves aliases and rejects mixed-representation unions at compile time.
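
The closure mutation check can be illustrated on a toy AST. This is a minimal sketch, not the code in closure-mutation-checker.ts: the `Stmt` shape and `checkClosureMutations` are hypothetical names, and the real pass walks the full ChadScript AST rather than a flat statement list.

```typescript
// Toy AST (hypothetical): not ChadScript's real node types.
type Stmt =
  | { kind: "let"; name: string }
  | { kind: "assign"; name: string }
  | { kind: "arrow"; captures: string[] }; // arrow function reading these outer variables

// Returns one message per reassignment that occurs after the variable was captured.
function checkClosureMutations(body: Stmt[]): string[] {
  const captured = new Set<string>();
  const errors: string[] = [];
  for (const stmt of body) {
    if (stmt.kind === "arrow") {
      // From this point on, the closure holds a copy of each captured value.
      for (const name of stmt.captures) captured.add(name);
    } else if (stmt.kind === "assign" && captured.has(stmt.name)) {
      errors.push(`'${stmt.name}' is reassigned after being captured by a closure`);
    }
  }
  return errors;
}

// Models: let count = 0; const f = () => count; count = 1;
const bad: Stmt[] = [
  { kind: "let", name: "count" },
  { kind: "arrow", captures: ["count"] },
  { kind: "assign", name: "count" },
];

// Models: let count = 0; count = 1; const f = () => count;
const ok: Stmt[] = [
  { kind: "let", name: "count" },
  { kind: "assign", name: "count" },
  { kind: "arrow", captures: ["count"] },
];

const badErrors = checkClosureMutations(bad); // reassignment after capture is reported
const okErrors = checkClosureMutations(ok);   // reassignment before capture is allowed
```

The two inputs loosely mirror the intent of the closure-capture-mutation-error.ts and closure-capture-by-value-ok.ts fixtures, whose actual contents are not shown here.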

Both passes are called from LLVMGenerator.generateParts() before IR generation begins.
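
The union check described above can be sketched in the same spirit. Everything here is an assumption made for illustration: types are modeled as plain strings, `LLVM_REPR`, `unionMembers`, and `checkUnion` are hypothetical names, and only the string/number to i8*/double pairing comes from the description.

```typescript
// Hypothetical mapping from a type name to the LLVM representation codegen would use.
const LLVM_REPR: Record<string, string> = {
  string: "i8*",
  number: "double",
};

// Expands aliases recursively and splits unions into their member type names.
function unionMembers(
  type: string,
  aliases: Map<string, string>,
  seen: Set<string> = new Set<string>(),
): string[] {
  return type
    .split("|")
    .map((member) => member.trim())
    .flatMap((member) => {
      const target = aliases.get(member);
      // Stop at non-aliases and at aliases already being expanded (cycle guard).
      if (target === undefined || seen.has(member)) return [member];
      return unionMembers(target, aliases, new Set(seen).add(member));
    });
}

// Returns an error message if the members need different LLVM representations.
// Members with no known representation are ignored in this simplified model.
function checkUnion(type: string, aliases: Map<string, string>): string | null {
  const reprs = new Set<string>();
  for (const member of unionMembers(type, aliases)) {
    const repr = LLVM_REPR[member];
    if (repr !== undefined) reprs.add(repr);
  }
  return reprs.size > 1
    ? `union '${type}' mixes LLVM representations: ${Array.from(reprs).join(", ")}`
    : null;
}

const aliases = new Map<string, string>([
  ["Mixed", "string | number"],
  ["Name", "string"],
]);

const mixedError = checkUnion("Mixed", aliases); // rejected: i8* vs double
const nameError = checkUnion("Name", aliases);   // accepted: a single representation
```

Resolving the alias before comparing representations is the step the inline check lacked: `Mixed` looks like a single named type until it is expanded.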

Fix: Native self-hosting crashes

  • scanExprForCaptures — Object and Map expression cases used inline as { key, value } type assertions that didn't match the real AST struct layout. Changed to use the named ObjectProperty and MapEntry types from ast/types.ts for correct native GEP offsets.

  • Capture/destructuredNames casts — Added explicit casts to avoid union-type codegen issues in the native compiler where the type is used as a parameter.

Fix: Missing setUsesJson in JSON array index codegen

  • IndexAccessGenerator emitted call @csyyjson_arr_get for JSON array indexing without calling ctx.setUsesJson(true). This meant the declare for the yyjson C bridge functions was never emitted, causing an LLVM "undefined value" error.

  • Consolidated setUsesJson calls to fix a Stage 1 self-hosting crash where the flag was set too late.
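
A rough sketch of the feature-flag pattern behind this fix: the generator that emits the call is also responsible for setting the flag that gates the matching declaration. The `Ctx` class, `genJsonArrayIndex`, and the LLVM signature shown for csyyjson_arr_get are assumptions made for illustration; only `setUsesJson` and the function name come from the description.

```typescript
// Hypothetical codegen context, reduced to the one flag discussed here.
class Ctx {
  private usesJson = false;
  private readonly body: string[] = [];

  setUsesJson(value: boolean): void {
    this.usesJson = value;
  }

  emit(line: string): void {
    this.body.push(line);
  }

  // Declarations are emitted only for features flagged during codegen.
  finish(): string {
    // Assumed signature, for illustration only.
    const decls = this.usesJson ? ["declare i8* @csyyjson_arr_get(i8*, i64)"] : [];
    return [...decls, ...this.body].join("\n");
  }
}

// Emits a JSON array index and flags the feature at the same place.
function genJsonArrayIndex(ctx: Ctx, arr: string, idx: string): string {
  ctx.setUsesJson(true); // without this, the call below has no matching declare
  const result = "%elem";
  ctx.emit(`${result} = call i8* @csyyjson_arr_get(i8* ${arr}, i64 ${idx})`);
  return result;
}

const ctx = new Ctx();
genJsonArrayIndex(ctx, "%arr", "0");
const ir = ctx.finish();

const plainCtx = new Ctx();
plainCtx.emit("ret void");
const plainIr = plainCtx.finish(); // no JSON use, so no declare is emitted
```

Setting the flag next to the emitted call keeps the two from drifting apart, which is the failure mode described in the first bullet.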

Fix: LLVMGenerator.reset() missing 4 field resets

The reset() override was missing 4 fields that BaseGenerator.reset() resets:

  • allocaCounter — alloca naming drifts across function compilations
  • allocaInstructions — safety reset (normally cleared after hoisting)
  • actualClassTypes — stale class-to-interface mappings can cause wrong GEP offsets (UB source)
  • currentDebugLocId — stale debug metadata
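
A simplified sketch of the sync requirement follows. It assumes the override re-lists the base fields rather than delegating to super.reset(), which the description suggests but does not state. The class bodies, the `output` field, and the initial values are hypothetical; only the four field names and `reset()` come from the PR.

```typescript
// Simplified stand-ins for the real generator classes.
class BaseGenerator {
  allocaCounter = 0;
  allocaInstructions: string[] = [];
  actualClassTypes = new Map<string, string>();
  currentDebugLocId = -1; // assumed sentinel for "no debug location"

  reset(): void {
    this.allocaCounter = 0;
    this.allocaInstructions = [];
    this.actualClassTypes = new Map<string, string>();
    this.currentDebugLocId = -1;
  }
}

class LLVMGenerator extends BaseGenerator {
  output: string[] = []; // hypothetical subclass-only state

  // Must reset every field BaseGenerator.reset() resets, plus its own state.
  // Omitting any of the four base fields reintroduces the stale-state bugs.
  override reset(): void {
    this.allocaCounter = 0;
    this.allocaInstructions = [];
    this.actualClassTypes = new Map<string, string>();
    this.currentDebugLocId = -1;
    this.output = [];
  }
}

// Dirty every field, reset, then confirm nothing leaks into the next compilation.
const gen = new LLVMGenerator();
gen.allocaCounter = 7;
gen.allocaInstructions.push("%x = alloca double");
gen.actualClassTypes.set("Shape", "Circle");
gen.currentDebugLocId = 42;
gen.output.push("ret void");
gen.reset();

const isClean =
  gen.allocaCounter === 0 &&
  gen.allocaInstructions.length === 0 &&
  gen.actualClassTypes.size === 0 &&
  gen.currentDebugLocId === -1 &&
  gen.output.length === 0;
```

A dirty-then-reset check like the one at the end is one way to catch a field that was added to the base class but not to the override.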

Fix: Expression orchestrator null pointer generation → hard error

The orchestrator (orchestrator.ts) silently generated inttoptr i64 0 to i8* (null) for expression types it didn't recognize. When these nulls flowed into global variable initialization, LLVM -O2 detected the null dereference as undefined behavior and was free to prune entire code paths — cascading to delete initialization of unrelated globals like BUILTIN_TYPES (crash: mov 0x8,%eax loading StringArray.length from null+8).

Both fallback paths (empty type, unsupported type) now call ctx.emitError(), which returns never, so compilation stops immediately, before any broken IR is emitted.
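
The shape of this change can be sketched as follows. The `Expr` type, the `CompileError` class, and `generateExpression` are hypothetical; the only detail taken from the description is that `emitError()` returns `never`, modeled here by throwing.

```typescript
// Hypothetical expression node; the real orchestrator handles many more kinds.
type Expr = { type: string; value?: number };

class CompileError extends Error {}

const ctx = {
  // Returning `never` tells the type checker that control does not continue.
  emitError(message: string): never {
    throw new CompileError(message);
  },
};

function generateExpression(expr: Expr): string {
  if (expr.type === "") {
    return ctx.emitError("expression has an empty type");
  }
  switch (expr.type) {
    case "number":
      // Stands in for real IR emission.
      return `number(${expr.value ?? 0})`;
    default:
      // A silent null fallback here is what the fix removes.
      return ctx.emitError(`unsupported expression type: ${expr.type}`);
  }
}

const okResult = generateExpression({ type: "number", value: 3 });

let message = "";
try {
  generateExpression({ type: "jsx" });
} catch (err) {
  message = (err as Error).message;
}
```

Because `emitError` is typed as `never`, both fallback branches satisfy the `string` return type without producing a placeholder value.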

Docs: rules.md updates

  • Added src/semantic/ to Key Directories table
  • Added "Semantic Analysis Passes" section with instructions for adding new passes
  • Added "LLVMGenerator.reset() Sync Requirement" rule
  • Added "Expression Orchestrator — No Silent Nulls" rule
  • Added codegen quick rule #8 about feature flags for gated extern calls

Files changed

File | Change
src/semantic/closure-mutation-checker.ts | New: closure post-capture mutation checker
src/semantic/union-type-checker.ts | New: mixed-representation union type checker
src/codegen/llvm-generator.ts | Call semantic passes; fix reset() (4 missing fields)
src/codegen/expressions/orchestrator.ts | Replace null fallbacks with hard emitError calls
src/codegen/expressions/access/index.ts | Add missing setUsesJson for JSON array index
.claude/rules.md | Document semantic passes, reset sync, orchestrator rule
tests/fixtures/closures/closure-capture-by-value-ok.ts | New test: valid capture-then-use
tests/fixtures/closures/closure-capture-mutation-error.ts | New test: post-capture mutation error
tests/fixtures/types/union-non-nullable-error.ts | New test: non-nullable union error
tests/fixtures/types/union-type-alias-error.ts | New test: type alias union error

Test plan

  • npm run build — TypeScript compilation passes
  • npm test — 248/248 tests pass (includes 4 new fixture tests)
  • bash scripts/self-hosting.sh — full 3-stage self-hosting passes (Stage 0 → Stage 1 → Stage 2)

cs01 merged commit 3e1a90e into main Feb 26, 2026
12 checks passed
cs01 deleted the worktree-semantic-checks branch February 26, 2026 20:34