Skip to content

Proposal: compress command templates ~23% (same semantics, fewer tokens per invocation) #2943

@Xopoko

Description

@Xopoko

Problem

The command templates in templates/commands/ are rendered into every agent integration (Claude skills, Copilot prompts, Codex skills, …) and their full body is loaded into the model's context window on every command invocation. They are written generously — repeated boilerplate, rationale prose, multi-example lists — which costs real tokens on each /speckit.* run:

Template Tokens before Tokens after Reduction
analyze.md 2,282 1,854 19%
checklist.md 4,460 3,183 29%
clarify.md 3,638 2,850 22%
constitution.md 1,764 1,310 26%
implement.md 2,810 2,282 19%
plan.md 1,635 1,207 26%
specify.md 3,824 3,024 21%
tasks.md 2,326 1,834 21%
taskstoissues.md 964 697 28%
Total 23,703 18,241 23%

(o200k_base tokenizer; template bodies as rendered into each integration.)

Across the nine commands that is roughly 5,500 tokens of context per full specify→implement cycle, paid by every Spec Kit user on every feature, multiplied across the rendered integrations.

Proposal

Editorial compression of the nine command templates, ~23% smaller on average, under hard invariants — this is a wording change, not a behavior change:

  • frontmatter (including handoffs:) byte-identical;
  • every fenced block containing commands or literal output templates (including the extension-hook EXECUTE_COMMAND: blocks) byte-identical;
  • all placeholder tokens ($ARGUMENTS, {curly}, bracketed placeholders) and .specify/ paths preserved;
  • every numbered step and every MUST/NEVER/"do not" rule preserved with full semantics — only redundant prose, restatements, and surplus examples were cut (at least one example per concept retained).

Each compressed template was checked by a deterministic invariant script (frontmatter equality, placeholder/path/fenced-block inventory diff against the original) plus an adversarial line-by-line semantic review hunting for dropped or weakened rules.

Evidence

We run these compressed templates (rendered as Claude/Codex skills) in a production multi-repo SDD workspace. A full dogfood pass — specify feature creation → spec → plan → tasks → cross-artifact checks → implementation gates — behaved identically to the verbose originals: all scripts, hook output contracts, and artifact shapes unchanged.

uv run pytest passes on the compressed branch.

Suggested path

Since CONTRIBUTING.md asks for discussion before large template changes, this issue is that discussion. As a concrete starting point I've opened a small proof-of-concept PR compressing only templates/commands/checklist.md (the largest template): (opening immediately after this issue; link will follow in a comment). The remaining eight are ready on a branch and I'm happy to submit them in whatever granularity the maintainers prefer — per template, or as one reviewed batch — or to adjust the compression level if you'd rather keep specific sections verbose.

AI disclosure

Per the AI contributions policy: the compression drafts were produced with LLM assistance under the hard invariants above; the invariant checker is a deterministic script, and I personally reviewed and tested the results (pytest + a rendered end-to-end workflow run in a real workspace, details above).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions