Problem
The command templates in templates/commands/ are rendered into every agent integration (Claude skills, Copilot prompts, Codex skills, …) and their full body is loaded into the model's context window on every command invocation. They are written generously — repeated boilerplate, rationale prose, multi-example lists — which costs real tokens on each /speckit.* run:
| Template |
Tokens before |
Tokens after |
Reduction |
analyze.md |
2,282 |
1,854 |
19% |
checklist.md |
4,460 |
3,183 |
29% |
clarify.md |
3,638 |
2,850 |
22% |
constitution.md |
1,764 |
1,310 |
26% |
implement.md |
2,810 |
2,282 |
19% |
plan.md |
1,635 |
1,207 |
26% |
specify.md |
3,824 |
3,024 |
21% |
tasks.md |
2,326 |
1,834 |
21% |
taskstoissues.md |
964 |
697 |
28% |
| Total |
23,703 |
18,241 |
23% |
(o200k_base tokenizer; template bodies as rendered into each integration.)
Across the nine commands that is roughly 5,500 tokens of context per full specify→implement cycle, paid by every Spec Kit user on every feature, multiplied across the rendered integrations.
Proposal
Editorial compression of the nine command templates, ~23% smaller on average, under hard invariants — this is a wording change, not a behavior change:
- frontmatter (including
handoffs:) byte-identical;
- every fenced block containing commands or literal output templates (including the extension-hook
EXECUTE_COMMAND: blocks) byte-identical;
- all placeholder tokens (
$ARGUMENTS, {curly}, bracketed placeholders) and .specify/ paths preserved;
- every numbered step and every MUST/NEVER/"do not" rule preserved with full semantics — only redundant prose, restatements, and surplus examples were cut (at least one example per concept retained).
Each compressed template was checked by a deterministic invariant script (frontmatter equality, placeholder/path/fenced-block inventory diff against the original) plus an adversarial line-by-line semantic review hunting for dropped or weakened rules.
Evidence
We run these compressed templates (rendered as Claude/Codex skills) in a production multi-repo SDD workspace. A full dogfood pass — specify feature creation → spec → plan → tasks → cross-artifact checks → implementation gates — behaved identically to the verbose originals: all scripts, hook output contracts, and artifact shapes unchanged.
uv run pytest passes on the compressed branch.
Suggested path
Since CONTRIBUTING.md asks for discussion before large template changes, this issue is that discussion. As a concrete starting point I've opened a small proof-of-concept PR compressing only templates/commands/checklist.md (the largest template): (opening immediately after this issue; link will follow in a comment). The remaining eight are ready on a branch and I'm happy to submit them in whatever granularity the maintainers prefer — per template, or as one reviewed batch — or to adjust the compression level if you'd rather keep specific sections verbose.
AI disclosure
Per the AI contributions policy: the compression drafts were produced with LLM assistance under the hard invariants above; the invariant checker is a deterministic script, and I personally reviewed and tested the results (pytest + a rendered end-to-end workflow run in a real workspace, details above).
Problem
The command templates in
templates/commands/are rendered into every agent integration (Claude skills, Copilot prompts, Codex skills, …) and their full body is loaded into the model's context window on every command invocation. They are written generously — repeated boilerplate, rationale prose, multi-example lists — which costs real tokens on each/speckit.*run:analyze.mdchecklist.mdclarify.mdconstitution.mdimplement.mdplan.mdspecify.mdtasks.mdtaskstoissues.md(o200k_base tokenizer; template bodies as rendered into each integration.)
Across the nine commands that is roughly 5,500 tokens of context per full specify→implement cycle, paid by every Spec Kit user on every feature, multiplied across the rendered integrations.
Proposal
Editorial compression of the nine command templates, ~23% smaller on average, under hard invariants — this is a wording change, not a behavior change:
handoffs:) byte-identical;EXECUTE_COMMAND:blocks) byte-identical;$ARGUMENTS,{curly}, bracketed placeholders) and.specify/paths preserved;Each compressed template was checked by a deterministic invariant script (frontmatter equality, placeholder/path/fenced-block inventory diff against the original) plus an adversarial line-by-line semantic review hunting for dropped or weakened rules.
Evidence
We run these compressed templates (rendered as Claude/Codex skills) in a production multi-repo SDD workspace. A full dogfood pass —
specifyfeature creation → spec → plan → tasks → cross-artifact checks → implementation gates — behaved identically to the verbose originals: all scripts, hook output contracts, and artifact shapes unchanged.uv run pytestpasses on the compressed branch.Suggested path
Since CONTRIBUTING.md asks for discussion before large template changes, this issue is that discussion. As a concrete starting point I've opened a small proof-of-concept PR compressing only
templates/commands/checklist.md(the largest template): (opening immediately after this issue; link will follow in a comment). The remaining eight are ready on a branch and I'm happy to submit them in whatever granularity the maintainers prefer — per template, or as one reviewed batch — or to adjust the compression level if you'd rather keep specific sections verbose.AI disclosure
Per the AI contributions policy: the compression drafts were produced with LLM assistance under the hard invariants above; the invariant checker is a deterministic script, and I personally reviewed and tested the results (pytest + a rendered end-to-end workflow run in a real workspace, details above).