Three-stage scan pipeline + schema simplification #68
Open
nahiyankhan wants to merge 6 commits into main
…st-map

Land the bucket pipeline and consolidate to a four-tool topology. A scan now runs in three stages, topology (map.md) → objective (bucket.json) → subjective (expression.md), all owned by ghost-expression.

ghost.bucket/v1: new artifact that catalogues every concrete design value with structured specs, occurrence counts, and deterministic content-hashed IDs. Schema, lint, merge, and fix-ids primitives live in @ghost/core.

New ghost-expression verbs:
- inventory [path]: raw repo signals JSON (migrated from ghost-map).
- lint [file]: auto-detects expression.md / map.md / bucket.json.
- bucket merge: deterministic concat with id-based dedup, idempotent.
- bucket fix-ids: recompute row IDs from content; lets surveyor agents author rows with empty id fields and finalize in one pass.
- scan-status [dir]: report per-stage state and recommended next stage.

New skill recipes (ghost-expression bundle):
- map.md: topology stage (migrated from ghost-map skill recipe).
- survey.md: objective stage. LLM-driven extraction with dialect-specific grep strategies, exhaustiveness discipline, saturation predicate.
- scan.md: meta-recipe orchestrating topology → survey → profile.

Refactored: profile.md is now strictly the subjective interpretation stage (reads bucket.json as ground truth; cannot fabricate values).

ghost-map package deleted. ghost.map/v1 schema/types moved to @ghost/core; inventory + lint moved to ghost-expression. ghost-fleet now imports map types from @ghost/core. CLAUDE.md, README.md, and docs IA updated to reflect the four-tool topology.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
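For reviewers who want a concrete picture of "content-hashed IDs", "fix-ids", and "id-based dedup", here is a minimal sketch. It is not the @ghost/core implementation: the `Row` shape, the choice of FNV-1a as the hash, the decision to hash only `kind` and `value`, and the keep-first dedup policy are all assumptions made for illustration.

```typescript
// Hypothetical row shape; the real ghost.bucket/v1 schema lives in @ghost/core.
interface Row {
  id: string; // may be "" while an agent is still authoring rows
  kind: string; // e.g. "color", "spacing"
  value: string; // the concrete design value
  occurrences: number;
}

// FNV-1a (32-bit) over UTF-16 code units. A stand-in hash, chosen only
// because it needs no dependencies; the real hash is not stated in this PR.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Assumption: the ID depends on content only (kind + value),
// never on the current id or the occurrence count.
function rowId(row: Row): string {
  return fnv1a(JSON.stringify([row.kind, row.value]));
}

// "fix-ids": recompute every ID from content in one pass.
function fixIds(rows: Row[]): Row[] {
  return rows.map((row) => ({ ...row, id: rowId(row) }));
}

// "merge": concatenate in argument order, keep the first row seen per id.
function mergeBuckets(...buckets: Row[][]): Row[] {
  const seen = new Set<string>();
  const merged: Row[] = [];
  for (const bucket of buckets) {
    for (const row of bucket) {
      if (seen.has(row.id)) continue;
      seen.add(row.id);
      merged.push(row);
    }
  }
  return merged;
}

const surveyA = fixIds([{ id: "", kind: "color", value: "#ff0000", occurrences: 3 }]);
const surveyB = fixIds([
  { id: "", kind: "color", value: "#ff0000", occurrences: 1 },
  { id: "", kind: "spacing", value: "8px", occurrences: 12 },
]);
const mergedRows = mergeBuckets(surveyA, surveyB);
```

Because IDs are a pure function of content, two surveyors that record the same value independently produce the same ID, and merging a bucket with itself returns it unchanged, which is the idempotence the commit describes.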
…veness

Self-distance reported 17.5% on freshly authored expressions because loadExpression never backfilled `oklch` on hex-only palette colors. comparePalette then treated every color as fully unmatched and contributed distance 1.0 per color even when comparing an expression to itself.

Two-layer fix:
- loadExpression now backfills oklch deterministically (parseColorToOklch is pure: same hex → same oklch), so re-parsing produces identical in-memory shapes.
- comparePalette resolves oklch on-the-fly when missing and falls back to hex-string equality for non-parseable values (CSS vars, opaque refs). Defensive against third-party producers that emit hex-only.

Survey recipe tightened with a load-bearing exhaustiveness rule. Triggered by a dogfood scan that produced ~10% recall on components (6 rows for a 97-component package). The rule is repo-agnostic: the agent identifies the canonical signal in this repo, enumerates exhaustively, and cross-checks counts. Recipe explicitly forbids sampling and placeholder/glob library names. Coverage check (step 8) is now a real gate: exhaustiveness must match independent counts before declaring saturation.

Pinned attempt-1 artifacts under dogfood/ghost-ui/attempt-1/ with a structured NOTES.md documenting the failure modes. Future attempts go alongside so we can track improvement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
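The two-layer fix can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the `PaletteColor` shape, `backfillOklch`, and `colorDistance` are hypothetical names, the body of `parseColorToOklch` is my own (it handles only `#rrggbb` and uses the published OKLab matrices), and the distance metric (Euclidean in OKLab, clamped to 1) is a guess at something reasonable.

```typescript
interface Oklch { l: number; c: number; h: number }

// Hypothetical palette entry: a raw value plus an optional resolved oklch.
interface PaletteColor { value: string; oklch?: Oklch }

// Pure conversion: the same hex always yields the same oklch.
// Returns undefined for anything that is not #rrggbb (CSS vars, opaque refs).
function parseColorToOklch(value: string): Oklch | undefined {
  const hex = value.trim();
  if (!/^#[0-9a-f]{6}$/i.test(hex)) return undefined;
  const n = parseInt(hex.slice(1), 16);
  const toLinear = (c: number): number =>
    c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  const r = toLinear(((n >> 16) & 255) / 255);
  const g = toLinear(((n >> 8) & 255) / 255);
  const b = toLinear((n & 255) / 255);
  const l_ = Math.cbrt(0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b);
  const m_ = Math.cbrt(0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b);
  const s_ = Math.cbrt(0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b);
  const L = 0.2104542553 * l_ + 0.793617785 * m_ - 0.0040720468 * s_;
  const A = 1.9779984951 * l_ - 2.428592205 * m_ + 0.4505937099 * s_;
  const B = 0.0259040371 * l_ + 0.7827717662 * m_ - 0.808675766 * s_;
  const h = (Math.atan2(B, A) * 180) / Math.PI;
  return { l: L, c: Math.hypot(A, B), h: h < 0 ? h + 360 : h };
}

// Layer 1: backfill at load time so re-parsing gives identical shapes.
function backfillOklch(colors: PaletteColor[]): PaletteColor[] {
  return colors.map((color) => {
    const oklch = color.oklch ?? parseColorToOklch(color.value);
    return oklch ? { ...color, oklch } : color;
  });
}

// Layer 2: resolve on the fly when oklch is missing; fall back to
// string equality when either side cannot be parsed.
function colorDistance(a: PaletteColor, b: PaletteColor): number {
  const ca = a.oklch ?? parseColorToOklch(a.value);
  const cb = b.oklch ?? parseColorToOklch(b.value);
  if (!ca || !cb) return a.value === b.value ? 0 : 1;
  const rad = Math.PI / 180;
  const dl = ca.l - cb.l;
  const da = ca.c * Math.cos(ca.h * rad) - cb.c * Math.cos(cb.h * rad);
  const db = ca.c * Math.sin(ca.h * rad) - cb.c * Math.sin(cb.h * rad);
  return Math.min(1, Math.hypot(dl, da, db));
}
```

With either layer in place, comparing a hex-only color to itself yields 0 instead of 1.0, which is the behavior the self-distance check relies on.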
Recall delta vs attempt 1:
- values: 22 → 190 (~9x)
- tokens: 11 → 238 (22x, ~99% of 240 declared in main.css)
- components: 6 → 97 (16x, 100% of registry:ui items)
- libraries: 6 → 42 (7x, every design-surface dep enumerated)

Decision count: 7 → 11. New decisions added (chart-strategy, surface-hierarchy, font-sourcing, density, interactive-patterns) and renames adjusted toward pattern-naming (token-architecture → theming-architecture, shadow-hierarchy → elevation, with the count corrected from 4 tiers to 7).

Self-distance: 0.0% (was 17.5%). Confirms the oklch backfill fix.

Agent followed the recipe by writing build-bucket.mjs (pinned alongside artifacts), which walks main.css for tokens, registry.json for components, and package.json for libraries. Reproducible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
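A recall figure like "238 of 240 declared tokens" implies an independent count to compare against. The sketch below shows one way such a coverage check could work for CSS custom properties. It is not build-bucket.mjs: `declaredTokens` and `checkCoverage` are hypothetical helpers, and the regex is a simplification that ignores comments and nested syntax.

```typescript
// Collect the unique custom-property names declared in a CSS string.
// The leading group requires start-of-string, whitespace, ';' or '{' before
// the name, so BEM-style selectors like `.btn--primary:hover` are not counted.
function declaredTokens(css: string): string[] {
  const pattern = /(?:^|[\s;{])(--[A-Za-z0-9_-]+)\s*:/g;
  const names = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(css)) !== null) {
    names.add(match[1]);
  }
  return Array.from(names);
}

interface Coverage {
  declared: number;
  extracted: number;
  recall: number; // fraction of declared names that were extracted
  missing: string[];
}

// Compare what the survey extracted against the independent count.
function checkCoverage(declared: string[], extracted: string[]): Coverage {
  const have = new Set(extracted);
  const missing = declared.filter((name) => !have.has(name));
  const recall =
    declared.length === 0 ? 1 : (declared.length - missing.length) / declared.length;
  return { declared: declared.length, extracted: extracted.length, recall, missing };
}

const sampleCss = `:root { --bg: #fff; --fg: #111; }
.dark { --bg: #000; --fg: #eee; --ring: #09f; }
.btn--primary:hover { color: var(--fg); }`;

const declaredNames = declaredTokens(sampleCss);
const partial = checkCoverage(declaredNames, ["--bg", "--fg"]);
```

Treating a non-empty `missing` list as a hard failure is one way to make the coverage step a gate rather than a report.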
External libraries (icon sets, primitive collections, motion libs, charting) no longer have a top-level bucket section. Whether a system uses Radix or hand-rolls primitives doesn't change what its design language *is*; what matters surfaces elsewhere: font families show up as typography tokens, and load-bearing library choices (icon family, font sourcing) belong in interpreter prose under the relevant decision dimension.

Bucket sections are now: values, tokens, components.

Removed from @ghost/core exports: LibraryRow, LibraryRowSchema, libraryRowId. Lint, merge, and fix-ids no longer touch a libraries section.

Skill recipes (survey.md, profile.md) updated: survey.md no longer instructs the agent to enumerate libraries; profile.md tells the agent that load-bearing library choices land in prose, not as structured rows.

zod schema stays non-strict, so historical buckets that still carry a libraries field continue to lint clean (the field is simply ignored).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pology

Six documentation files were carrying stale framing from the five-tool era or hadn't been updated for the bucket pipeline / libraries-cut.

- packages/ghost-expression/README.md: full rewrite. Was framed around "four verbs" (lint/describe/diff/emit) with no mention of the scan pipeline. Now leads with the three-stage table (topology → objective → subjective), enumerates all seven verbs (inventory, scan-status, lint, describe, diff, bucket, emit), and points at all four skill recipes (scan, map, survey, profile).
- packages/ghost-drift/README.md: dropped "five-tool decomposition" and the ghost-map reference. The "Authoring expression.md?" sidebar now surfaces inventory + scan-status + bucket merge alongside the existing lint/describe/diff/emit verbs.
- CLAUDE.md / AGENTS.md (symlinked): bucket description no longer mentions "and libraries", scan-status added to the verbs table, scan recipe added to the workflows list.
- apps/docs/src/content/docs/cli-reference.mdx: added scan-status and bucket sections under ghost-expression. Updated overview to seventeen verbs and added the scan recipe to the skill-recipes table.
- apps/docs/src/content/docs/getting-started.mdx: tools table grew to include scan-status, added a three-stage diagram, replaced the single-step "Profile Your First System" with a stage-aware "Scan Your First System" walkthrough.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
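Since the docs now lead with the three-stage table and scan-status, a small model of "per-stage state and recommended next stage" may help readers. This is a sketch only: the real scan-status output format is not shown in this PR, and treating "artifact file exists" as "stage done" is an assumption (the actual verb may also lint the artifact).

```typescript
type Stage = "topology" | "objective" | "subjective";

interface StageState { stage: Stage; artifact: string; done: boolean }
interface ScanStatus { stages: StageState[]; next: Stage | null }

// Stage order and artifact names as described in this PR.
const PIPELINE: { stage: Stage; artifact: string }[] = [
  { stage: "topology", artifact: "map.md" },
  { stage: "objective", artifact: "bucket.json" },
  { stage: "subjective", artifact: "expression.md" },
];

// Assumption: a stage counts as done when its artifact is present.
// The earliest unfinished stage is recommended, even if a later
// artifact already exists, because later stages read earlier ones.
function scanStatus(presentFiles: string[]): ScanStatus {
  const stages = PIPELINE.map(({ stage, artifact }) => ({
    stage,
    artifact,
    done: presentFiles.indexOf(artifact) !== -1,
  }));
  const firstMissing = stages.filter((s) => !s.done)[0];
  return { stages, next: firstMissing ? firstMissing.stage : null };
}

const afterTopology = scanStatus(["map.md"]);
```

Here `afterTopology.next` is `"objective"`, meaning the survey stage should run next to produce bucket.json.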
…nents

Slot → token bindings either fall out of decisions[] (pattern consequences) or live in bucket.json components[] (exhaustive catalog). The hybrid roles[] slot was filling neither cleanly, didn't scale to systems with many components, and the schema never committed on whether it was exemplary or exhaustive.

Removes roles[] from:
- the zod schema (.strict() now rejects it)
- the Expression type
- lint (broken-role-reference rule, slug-binding propagation, and the references.ts token resolver)
- the profile/scan/schema/verify/review recipes
- expression.template.md
- the docs site
- every fixture (arcade, market, ghost-ui, fleet members, .scratch)

unused-palette is simplified to check decision evidence/prose only.

Also: ignore .scratch/ for dogfood scans, and ship the ghost-ui expression-fidelity test bundles (arcade + market) as new fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
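The two schemas in this PR now treat unknown fields differently: the bucket schema stays non-strict (a legacy libraries field is ignored) while the expression schema is strict (roles is rejected). The dependency-free sketch below imitates that contrast; in zod itself, a plain `z.object(...)` strips unknown keys on parse and `.strict()` rejects them. `parseKnownKeys` and the key lists are hypothetical, and the expression keys shown are only a subset picked for illustration.

```typescript
type ParseResult =
  | { ok: true; data: Record<string, unknown> }
  | { ok: false; unknownKeys: string[] };

// Keep only allowed keys. In strict mode, any unknown key fails the parse;
// in non-strict mode, unknown keys are silently dropped.
function parseKnownKeys(
  input: Record<string, unknown>,
  allowed: string[],
  strict: boolean,
): ParseResult {
  const unknownKeys = Object.keys(input).filter((key) => allowed.indexOf(key) === -1);
  if (strict && unknownKeys.length > 0) return { ok: false, unknownKeys };
  const data: Record<string, unknown> = {};
  for (const key of allowed) {
    if (key in input) data[key] = input[key];
  }
  return { ok: true, data };
}

// Bucket sections named in this PR; non-strict, so legacy files still pass.
const BUCKET_KEYS = ["values", "tokens", "components"];
// Hypothetical subset of expression frontmatter keys; strict.
const EXPRESSION_KEYS = ["palette", "decisions"];

const legacyBucket = parseKnownKeys(
  { values: [], tokens: [], components: [], libraries: [] },
  BUCKET_KEYS,
  false,
);
const legacyExpression = parseKnownKeys(
  { palette: [], decisions: [], roles: [] },
  EXPRESSION_KEYS,
  true,
);
```

In this model `legacyBucket` parses with `libraries` dropped, while `legacyExpression` fails and reports `roles` as the offending key, which matches the migration step of deleting the `roles:` section.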
Summary
- `map.md` (topology) → `bucket.json` (objective values) → `expression.md` (subjective interpretation), all owned by `ghost-expression`. `ghost-map` is folded back in; the four-tool topology is now `ghost-expression` / `ghost-drift` / `ghost-fleet` / `ghost-ui`.
- Removed `bucket.libraries[]` (icon/primitive choices aren't part of the design language; load-bearing library calls now surface as prose evidence) and dropped expression `roles[]` (slot bindings either fall out of decisions or live in `bucket.components[]`; neither pattern needs a parallel structure that doesn't scale).
- Fixed self-distance in `ghost-drift compare`, aligned READMEs and the docs site with the new topology, and shipped the ghost-ui expression-fidelity test bundles (arcade + market).

Breaking changes
- `ghost-expression` is bumped major in this PR's changeset because `roles[]` is no longer a valid frontmatter field: `.strict()` rejects it. Existing `expression.md` files carrying `roles:` must drop the section to migrate. All in-repo fixtures have been updated.
- `ghost-map` is gone; its verbs (inventory, the topology recipe) live under `ghost-expression` now.
- `bucket.json` no longer has a `libraries[]` section. Historical buckets that still carry one lint clean, but the field is ignored.

Test plan
- `pnpm build` clean
- `pnpm check` clean (biome + typecheck + file-size + cli-manifest)
- `pnpm test`: 291 tests across 30 files, all passing locally
- Lint every in-repo `expression.md` (ghost-ui, fleet members, arcade/market test bundles) against the new schema
- Run the full scan (`map` → `survey` → `profile`) and confirm it produces a clean `expression.md`
- Confirm `ghost-drift compare` self-distance is 0 on every fixture

🤖 Generated with Claude Code