feat(cli): bundle agentv-dev skills and add agentv skills subcommand #1226

Closed
christso wants to merge 11 commits into main from feat/1224-bundled-skills

christso (Collaborator) commented May 7, 2026

Closes #1224

Summary

  • Adds agentv skills subcommand with list, get, and path subcommands
  • Bundles all six agentv-dev skills into the npm package (apps/cli/skills/ → dist/skills/ at build time)
  • Converts marketplace plugin SKILL.md files to discovery stubs pointing to agentv skills get <name>
  • Updates installation docs so the canonical setup is npm install -g agentv alone

Changes

CLI subcommand (apps/cli/src/commands/skills/index.ts):

  • agentv skills list — list available skill names (--json)
  • agentv skills get <name> — print SKILL.md content (--full includes references/, --json)
  • agentv skills get --all — print all skills
  • agentv skills path [<name>] — print resolved skills directory

Resolution walks upward from the module file, validating each candidate by SKILL.md presence to avoid false matches on the src/commands/skills/ directory itself. It prefers dist/skills/ (production layout) over bare skills/ (source layout).
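The upward walk can be sketched as follows. This is a minimal illustration, not the code in this PR: the function names echo findSkillsDir and isValidSkillsDir from the review, but their bodies, the candidate list, and the rule "a skills directory has at least one subdirectory containing SKILL.md" are assumptions.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readdirSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Assumed validation rule: a directory only counts as a skills dir if at
// least one immediate subdirectory contains a SKILL.md file.
function isValidSkillsDir(dir: string): boolean {
  if (!existsSync(dir) || !statSync(dir).isDirectory()) return false;
  return readdirSync(dir, { withFileTypes: true }).some(
    (entry) => entry.isDirectory() && existsSync(join(dir, entry.name, "SKILL.md")),
  );
}

// Walk upward from startDir; at each ancestor try the candidates in order,
// so dist/skills wins over bare skills when both exist at the same level.
function findSkillsDir(
  startDir: string,
  candidates: string[] = ["dist/skills", "skills"],
): string | undefined {
  let current = startDir;
  for (;;) {
    for (const candidate of candidates) {
      const dir = join(current, candidate);
      if (isValidSkillsDir(dir)) return dir;
    }
    const parent = dirname(current);
    if (parent === current) return undefined; // reached the filesystem root
    current = parent;
  }
}

// Usage: a throwaway tree that mimics apps/cli with a built dist/skills.
const root = mkdtempSync(join(tmpdir(), "skills-walk-"));
const moduleDir = join(root, "apps", "cli", "src", "commands", "skills");
const bundled = join(root, "apps", "cli", "dist", "skills");
mkdirSync(moduleDir, { recursive: true });
mkdirSync(join(bundled, "agentv-bench"), { recursive: true });
writeFileSync(join(bundled, "agentv-bench", "SKILL.md"), "---\nname: agentv-bench\n---\n");

const found = findSkillsDir(moduleDir);
console.log(found);
```

In this fixture the walk passes src/commands/skills/ (which matches the name skills but holds no SKILL.md) and stops at apps/cli/dist/skills, which is the false-match case the validation step is meant to rule out.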

Build pipeline (apps/cli/tsup.config.ts):

  • onSuccess copies apps/cli/skills/ → dist/skills/ alongside templates and studio assets
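A clean-then-copy step of this kind might look like the sketch below. The helper name copySkills and the fixture paths are hypothetical; only rmSync and cpSync from node:fs are real APIs, and the actual tsup.config.ts in this PR may be structured differently.

```typescript
import { cpSync, existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: remove any stale output first, then copy the whole
// skills tree so the bundle never carries files deleted from the source.
function copySkills(srcSkillsDir: string, destSkillsDir: string): void {
  rmSync(destSkillsDir, { recursive: true, force: true });
  cpSync(srcSkillsDir, destSkillsDir, { recursive: true });
}

// Usage against a temporary tree with a stale file in the destination.
const buildRoot = mkdtempSync(join(tmpdir(), "skills-build-"));
const srcDir = join(buildRoot, "skills");
const destDir = join(buildRoot, "dist", "skills");
mkdirSync(join(srcDir, "agentv-bench"), { recursive: true });
writeFileSync(join(srcDir, "agentv-bench", "SKILL.md"), "# agentv-bench\n");
mkdirSync(destDir, { recursive: true });
writeFileSync(join(destDir, "stale.txt"), "left over from a previous build\n");

copySkills(srcDir, destDir);
const staleRemoved = !existsSync(join(destDir, "stale.txt"));
const skillCopied = existsSync(join(destDir, "agentv-bench", "SKILL.md"));
console.log({ staleRemoved, skillCopied });
```

A function like this could be called from the onSuccess hook; cleaning before copying is what keeps renamed or deleted skills out of the published tarball.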

Skill content (apps/cli/skills/):

  • Full skill content moved here as the single source of truth (was in plugins/agentv-dev/skills/)
  • Includes all six skills: agentv-bench, agentv-eval-writer, agentv-eval-review, agentv-governance, agentv-onboarding, agentv-trace-analyst, plus their references/, scripts/, agents/ subdirectories

Plugin stubs (plugins/agentv-dev/skills/*/SKILL.md):

  • Rewritten to discovery stubs: keep the description: frontmatter (for trigger matching) and redirect agents to agentv skills get <name>

Docs (apps/web/src/content/docs/docs/getting-started/installation.mdx):

  • Canonical setup is now npm install -g agentv + agentv skills get agentv-onboarding
  • allagents plugin block moved to optional "Claude Code Plugin" section

Red/Green UAT

Before (no skills subcommand):

$ agentv skills list
error: Unknown command "skills"

After:

$ agentv skills list
agentv-bench
agentv-eval-review
agentv-eval-writer
agentv-governance
agentv-onboarding
agentv-trace-analyst

$ agentv skills get agentv-bench | head -5
---
name: agentv-bench
description: >-
  Run AgentV evaluations and optimize agents through eval-driven iteration.

$ agentv skills get agentv-bench --json | python3 -c "import json,sys; d=json.load(sys.stdin); print(d['success'], d['data'][0]['name'])"
True agentv-bench

$ agentv skills get does-not-exist; echo "Exit: $?"
Error: skill 'does-not-exist' not found
Available skills: agentv-bench, agentv-eval-review, agentv-eval-writer, agentv-governance, agentv-onboarding, agentv-trace-analyst
Exit: 1
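For consumers of the --json output, a small validating parser can make the envelope explicit. The sketch below assumes only what the UAT above demonstrates (a boolean success key and a data array whose items carry a name); every other field, the type names, and the sample string are hypothetical and are not captured CLI output.

```typescript
// Assumed envelope shape: only `success` and `data[].name` are taken from
// the UAT transcript; any further fields are left open.
interface SkillPayload {
  name: string;
  [key: string]: unknown;
}

interface SkillsEnvelope {
  success: boolean;
  data: SkillPayload[];
}

function parseSkillsJson(raw: string): SkillsEnvelope {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const { success, data } = parsed as { success?: unknown; data?: unknown };
  if (typeof success !== "boolean" || !Array.isArray(data)) {
    throw new Error("expected { success: boolean, data: [...] }");
  }
  for (const item of data) {
    if (typeof item !== "object" || item === null || typeof (item as SkillPayload).name !== "string") {
      throw new Error("every data item needs a string name");
    }
  }
  return { success, data: data as SkillPayload[] };
}

// Hand-written sample input for illustration.
const envelope = parseSkillsJson('{"success":true,"data":[{"name":"agentv-bench"}]}');
console.log(envelope.success, envelope.data[0].name);
```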

Test plan

  • bun run test — all 2346 tests pass (527 CLI + 1752 core + 67 eval)
  • 9 new unit tests covering: listSkillNames, readSkill (SKILL.md content, frontmatter with hidden:true, not-found, --full with references, --full with no extra dirs), collectDir
  • agentv skills list returns all 6 skill names
  • agentv skills get agentv-bench returns full 428-line SKILL.md
  • agentv skills get agentv-bench --json returns valid JSON with stable schema
  • agentv skills get does-not-exist exits 1 with error message
  • agentv skills path returns resolved dist/skills directory
  • Pre-push hook passed: build, typecheck, lint, tests, validate:examples all green

christso and others added 2 commits May 7, 2026 03:15
Skills are now bundled inside the CLI npm package (`apps/cli/skills/`
→ `dist/skills/` at build time), version-matched to the binary. A new
`agentv skills` subcommand serves the bundled content without any
separate plugin install step.

- `agentv skills list` — list available skill names (--json)
- `agentv skills get <name>` — print SKILL.md content (--full, --json)
- `agentv skills get --all` — print all skills
- `agentv skills path [<name>]` — print resolved skills directory

Resolution walks upward from the module file, validating by SKILL.md
presence to avoid false matches. Prefers `dist/skills/` (production
layout) over bare `skills/` (source layout).

The marketplace plugin SKILL.md files are converted to discovery stubs
that redirect agents to `agentv skills get <name>`. Full skill content
lives in `apps/cli/skills/` as the single source of truth.

Docs: update installation.mdx so the canonical setup is
`npm install -g agentv` alone; the allagents plugin step moves to an
optional "Claude Code Plugin" section.

Closes #1224

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cloudflare-workers-and-pages Bot commented May 7, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 84d3ff7
Status: ✅  Deploy successful!
Preview URL: https://6f245b50.agentv.pages.dev
Branch Preview URL: https://feat-1224-bundled-skills.agentv.pages.dev


christso (Collaborator, Author) commented May 7, 2026

Review: agentv skills subcommand

Verdict: PASS (approve — can't self-approve via API)

E2E Testing

All 6 commands tested against the built dist:

| Command | Result |
| --- | --- |
| agentv skills list | Clean plain-text list of 6 skills |
| agentv skills get agentv-bench | Full 428-line SKILL.md; excellent AI-consumable density |
| agentv skills get agentv-bench --full | SKILL.md + 7 reference files, separated by --- references/foo.md --- headers |
| agentv skills path | Returns resolved absolute path to skills dir |
| agentv skills get does-not-exist | Prints Error: skill 'does-not-exist' not found, lists available skills, exits 1 |
| agentv skills get --all | All 6 skills with === skill-name === separators |
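The separator conventions for --full and --all could be produced along these lines. This is a sketch under assumptions: the header strings follow the results listed above, but the types, function names, and exact blank-line spacing are hypothetical.

```typescript
interface ExtraFile {
  relPath: string; // e.g. "references/foo.md"
  content: string;
}

interface SkillDoc {
  name: string;
  skillMd: string;
  extras: ExtraFile[];
}

// With full=true, each extra file is appended under a "--- path ---" header.
function renderSkill(skill: SkillDoc, full: boolean): string {
  const parts = [skill.skillMd.trimEnd()];
  if (full) {
    for (const file of skill.extras) {
      parts.push(`--- ${file.relPath} ---`, file.content.trimEnd());
    }
  }
  return parts.join("\n\n");
}

// For --all, each skill is introduced by an "=== name ===" separator.
function renderAll(skills: SkillDoc[]): string {
  return skills
    .map((skill) => `=== ${skill.name} ===\n\n${renderSkill(skill, false)}`)
    .join("\n\n");
}

const sample: SkillDoc = {
  name: "agentv-bench",
  skillMd: "---\nname: agentv-bench\n---\nBody text.\n",
  extras: [{ relPath: "references/foo.md", content: "Reference text.\n" }],
};

console.log(renderSkill(sample, true));
console.log(renderAll([sample]));
```

Plain-text headers like these keep the output greppable, so an agent can split the stream back into files without a JSON parser.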

An AI agent reading only CLI output gets: what AgentV is, how to run evals, what graders exist, how to interpret results. The SKILL.md content is well-structured for AI consumption.

Code Review

All passing:

  • findSkillsDir() walk + isValidSkillsDir() validation correctly handles both source and dist layouts
  • tsup.config.ts onSuccess properly cleans (rmSync) then copies apps/cli/skills/ → dist/skills/
  • Plugin SKILL.md stubs preserve full description: frontmatter with clean redirect instructions
  • 9 unit tests with proper temp dir cleanup, covering all key paths
  • Wire format correct (success/data keys are single-word, no case issues)
  • No YAGNI violations — 3 subcommands, minimal flags, stateless filesystem reads

Non-blocking follow-ups

  1. agentv-onboarding/SKILL.md: the <skill-dir> placeholder never resolves itself. Add: "Run agentv skills path agentv-onboarding to get the skill directory path." This makes the skill self-sufficient without filesystem context.

  2. agentv-bench/SKILL.md: references agents/grader.md etc., but --full only includes references/. The skill should tell agents: "Run agentv skills path agentv-bench to locate <skill-dir>/agents/<name>.md."

  3. agentv-eval-review/SKILL.md: python scripts/lint_eval.py is a bare relative path; consider $(agentv skills path agentv-eval-review)/scripts/lint_eval.py for portability.

These are plugin content improvements, not blockers for merging. Implementation is correct and clean.

christso added 3 commits May 7, 2026 13:13
Closes #1229.

- skills get <name> --ref <file>: load a single reference without --full.
  Searches references/, templates/, agents/, then the skill root. Auto-
  appends .md if the caller passed a bare name. --ref is incompatible
  with --all and takes precedence over --full.
- readSkill --full now also collects agents/ alongside references/ and
  templates/, so agent role definitions ship together with the skill.
- Drop scripts/ and assets/ from every bundled skill. Scripts already
  duplicated CLI behavior (onboard-agentv.sh ↔ agentv init,
  trajectory.html / eval_review.html ↔ agentv studio); lint_eval.py
  is replaced by an inline structural checklist in agentv-eval-review's
  SKILL.md until a dedicated 'agentv eval lint' lands.
- Refresh the affected SKILL.md files: agentv-onboarding now invokes
  agentv init directly (no platform script), agentv-eval-review
  inlines the deterministic checks the deleted lint script performed,
  and every skill documents 'skills get --ref <file>' / 'skills path'
  for selective reference loading.
- Tests: extend the skills unit test fixture to exercise agents/ and
  bare-root files; assert findRefFile lookup order, .md auto-append,
  and miss path.
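The --ref lookup order described in this commit can be sketched as below. The name findRefFile comes from the commit message, but this body is an assumption: in particular, treating "bare name" as "no file extension" and checking candidates with existsSync are illustrative choices, not confirmed behavior.

```typescript
import { existsSync, mkdirSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { extname, join } from "node:path";

// Search order from the commit message: references/, templates/, agents/,
// then the skill root itself.
const REF_SEARCH_DIRS = ["references", "templates", "agents", "."];

function findRefFile(skillDir: string, ref: string): string | undefined {
  // Assumption: a name without an extension gets ".md" appended.
  const fileName = extname(ref) === "" ? `${ref}.md` : ref;
  for (const sub of REF_SEARCH_DIRS) {
    const candidate = join(skillDir, sub, fileName);
    if (existsSync(candidate) && statSync(candidate).isFile()) return candidate;
  }
  return undefined; // miss path: the caller decides how to report it
}

// Usage: a fixture where the same file name exists in two search dirs.
const skillDir = mkdtempSync(join(tmpdir(), "skill-ref-"));
for (const sub of ["references", "agents"]) {
  mkdirSync(join(skillDir, sub), { recursive: true });
}
writeFileSync(join(skillDir, "references", "subagent-pipeline.md"), "from references\n");
writeFileSync(join(skillDir, "agents", "subagent-pipeline.md"), "from agents\n");
writeFileSync(join(skillDir, "agents", "grader.md"), "grader role\n");
writeFileSync(join(skillDir, "notes.md"), "root-level file\n");

const bareHit = findRefFile(skillDir, "subagent-pipeline");
const agentHit = findRefFile(skillDir, "grader.md");
const rootHit = findRefFile(skillDir, "notes");
const miss = findRefFile(skillDir, "does-not-exist");
console.log({ bareHit, agentHit, rootHit, miss });
```

A production version would probably also reject names containing path separators or "..", so that --ref cannot read outside the skill directory.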
…er pattern

Skills are now sourced from <repo-root>/skills-data/ instead of
apps/cli/skills/. This mirrors agent-browser's top-level skill-data/
layout and keeps user-authored content out of the CLI workspace.

- git mv apps/cli/skills → skills-data
- tsup.config.ts: srcSkillsDir now resolves to ../../skills-data
- skills-resolver in src/commands/skills/index.ts learns a third
  candidate name (skills-data/) so dev-mode source runs
  (bun apps/cli/src/cli.ts skills …) keep working without first
  building. Order at each ancestor: dist/skills/ → skills-data/ →
  skills/ (legacy fallback).
- Build output stays at dist/skills/, so the npm tarball is unchanged.
- Verified: bun run build, dist/skills/ populated, node dist/cli.js
  skills list / get --ref / path all return expected content. Source
  mode (no dist) also resolves via skills-data/.
christso added 6 commits May 10, 2026 13:46
When pipeline input or pipeline run detects a non-CLI target (subagent-as-target
mode), print actionable next steps for the orchestrating agent:

- Dispatch executor subagents per test case
- Run code graders via pipeline grade
- Dispatch LLM grader subagents (read agents/grader.md)
- Merge scores via pipeline bench

Also point to the full procedure reference:
  agentv skills get agentv-bench --ref subagent-pipeline

This addresses the gap where agents running in subagent mode had no visibility
into what to do after pipeline input extracted the test cases.

When the agent IS the target (subagent-as-target mode), the pipeline
guidance now tells the agent to grade its own outputs against criteria
rather than dispatching separate grader subagents.

The agent already IS the LLM — it can read its own response.md,
evaluate against criteria.md, and write llm_grader_results directly.

Updated:
- pipeline input: guidance says "grade your own responses"
- pipeline run: same guidance for subagent mode
- subagent-pipeline.md: clarifies self-grading in subagent mode

Revert over-correction — the main agent should NOT grade its own outputs.
Instead it spawns grader subagents (one per test x LLM grader pair) using
agents/grader.md as their instructions.

The orchestrating agent dispatches:
1. Executor subagents (one per test case)
2. Grader subagents (one per test x LLM grader pair)
3. Runs pipeline bench to merge scores

agents/grader.md defines the full grading procedure for spawned subagents.

…instructions

The main agent reads agents/grader.md and embeds its full content as
system instructions in each grader subagent prompt. Subagents do not
self-discover the file — they need it passed to them.
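The dispatch plan from the last two commits (one grader subagent per test x LLM grader pair, each carrying the full agents/grader.md text) can be illustrated with a small planner. Everything here is hypothetical: the type, the function name, the prompt layout, and the sample test and grader names are made up for the sketch, and real dispatch would go through whatever subagent mechanism the orchestrating agent has.

```typescript
interface GraderTask {
  testId: string;
  graderName: string;
  prompt: string; // grader instructions embedded verbatim, plus the pair
}

// Build one task per (test, LLM grader) pair. The instructions are passed in
// because spawned subagents are not expected to find agents/grader.md alone.
function planGraderTasks(
  testIds: string[],
  llmGraderNames: string[],
  graderInstructions: string,
): GraderTask[] {
  const tasks: GraderTask[] = [];
  for (const testId of testIds) {
    for (const graderName of llmGraderNames) {
      const prompt = [
        graderInstructions.trim(),
        "",
        `Test case: ${testId}`,
        `LLM grader: ${graderName}`,
      ].join("\n");
      tasks.push({ testId, graderName, prompt });
    }
  }
  return tasks;
}

// Usage with placeholder content standing in for agents/grader.md.
const instructions = "You are a grader. Score the response against each criterion.";
const tasks = planGraderTasks(
  ["test-a", "test-b"],
  ["accuracy", "tone", "safety"],
  instructions,
);
console.log(tasks.length);
```

With two tests and three LLM graders the plan contains six tasks, which matches the "one per test x LLM grader pair" rule in the commit message.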
rubrics assertions are normalized to type: llm-grader with a rubrics
array by the grader parser. But writeGraderConfigs only wrote
prompt_content (empty for rubrics) and dropped the rubrics array.

Now includes the rubrics criteria array in llm_graders/<name>.json so
grader subagents can evaluate each criterion directly.
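The fix amounts to carrying the rubrics array through to the per-grader JSON file. A sketch of that idea follows; only the llm_graders/<name>.json location and the prompt_content and rubrics keys come from the commit message, while the function name, the input type, and the shape of each rubric item are assumptions.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical in-memory form of a normalized llm-grader assertion.
interface LlmGraderSpec {
  name: string;
  promptContent: string; // empty for rubrics-style assertions
  rubrics?: unknown[];   // item shape is not specified in the PR
}

function writeGraderConfigsSketch(testDir: string, graders: LlmGraderSpec[]): void {
  const outDir = join(testDir, "llm_graders");
  mkdirSync(outDir, { recursive: true });
  for (const grader of graders) {
    const config: Record<string, unknown> = { prompt_content: grader.promptContent };
    // Keep the rubrics array instead of dropping it, so a grader subagent
    // can evaluate each criterion directly.
    if (grader.rubrics && grader.rubrics.length > 0) {
      config.rubrics = grader.rubrics;
    }
    writeFileSync(join(outDir, `${grader.name}.json`), `${JSON.stringify(config, null, 2)}\n`);
  }
}

// Usage: a rubrics-only grader with an empty prompt.
const testDir = mkdtempSync(join(tmpdir(), "grader-config-"));
writeGraderConfigsSketch(testDir, [
  { name: "rubric-check", promptContent: "", rubrics: ["Cites a source", "Answers in English"] },
]);
const written = JSON.parse(readFileSync(join(testDir, "llm_graders", "rubric-check.json"), "utf8"));
console.log(written);
```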
- eval run: print TIP about pipeline when target is claude-cli/copilot-cli
- pipeline --help: description now says use this for agent targets
- pipeline run --help: hints about executor subagents for agent targets

Previously Claude would default to eval run and never discover pipeline.
Now both the top-level help and the eval run output guide toward pipeline.
christso (Collaborator, Author) commented

Superseded by #1231

@christso christso closed this May 11, 2026

Development

Successfully merging this pull request may close these issues.

feat(cli): bundle agentv-dev skills into npm package and add "agentv skills" subcommand
