diff --git a/.agents/skills/company-creator/SKILL.md b/.agents/skills/company-creator/SKILL.md new file mode 100644 index 0000000..b94b0e6 --- /dev/null +++ b/.agents/skills/company-creator/SKILL.md @@ -0,0 +1,269 @@ +--- +name: company-creator +description: > + Create agent company packages conforming to the Agent Companies specification + (agentcompanies/v1). Use when a user wants to create a new agent company from + scratch, build a company around an existing git repo or skills collection, or + scaffold a team/department of agents. Triggers on: "create a company", "make me + a company", "build a company from this repo", "set up an agent company", + "create a team of agents", "hire some agents", or when given a repo URL and + asked to turn it into a company. Do NOT use for importing an existing company + package (use the CLI import command instead) or for modifying a company that + is already running in Taskcore. +--- + +# Company Creator + +Create agent company packages that conform to the Agent Companies specification. + +Spec references: + +- Normative spec: `docs/companies/companies-spec.md` (read this before generating files) +- Web spec: https://agentcompanies.io/specification +- Protocol site: https://agentcompanies.io/ + +## Two Modes + +### Mode 1: Company From Scratch + +The user describes what they want. Interview them to flesh out the vision, then generate the package. + +### Mode 2: Company From a Repo + +The user provides a git repo URL, local path, or tweet. Analyze the repo, then create a company that wraps it. + +See [references/from-repo-guide.md](references/from-repo-guide.md) for detailed repo analysis steps. + +## Process + +### Step 1: Gather Context + +Determine which mode applies: + +- **From scratch**: What kind of company or team? What domain? What should the agents do? +- **From repo**: Clone/read the repo. Scan for existing skills, agent configs, README, source structure. + +### Step 2: Interview (Use AskUserQuestion) + +Do not skip this step. Use AskUserQuestion to align with the user before writing any files. + +**For from-scratch companies**, ask about: + +- Company purpose and domain (1-2 sentences is fine) +- What agents they need - propose a hiring plan based on what they described +- Whether this is a full company (needs a CEO) or a team/department (no CEO required) +- Any specific skills the agents should have +- How work flows through the organization (see "Workflow" below) +- Whether they want projects and starter tasks + +**For from-repo companies**, present your analysis and ask: + +- Confirm the agents you plan to create and their roles +- Whether to reference or vendor any discovered skills (default: reference) +- Any additional agents or skills beyond what the repo provides +- Company name and any customization +- Confirm the workflow you inferred from the repo (see "Workflow" below) + +**Workflow — how does work move through this company?** + +A company is not just a list of agents with skills. It's an organization that takes ideas and turns them into work products. You need to understand the workflow so each agent knows: + +- Who gives them work and in what form (a task, a branch, a question, a review request) +- What they do with it +- Who they hand off to when they're done, and what that handoff looks like +- What "done" means for their role + +**Not every company is a pipeline.** Infer the right workflow pattern from context: + +- **Pipeline** — sequential stages, each agent hands off to the next. Use when the repo/domain has a clear linear process (e.g. plan → build → review → ship → QA, or content ideation → draft → edit → publish). +- **Hub-and-spoke** — a manager delegates to specialists who report back independently. Use when agents do different kinds of work that don't feed into each other (e.g. a CEO who dispatches to a researcher, a marketer, and an analyst). +- **Collaborative** — agents work together on the same things as peers. Use for small teams where everyone contributes to the same output (e.g. a design studio, a brainstorming team). +- **On-demand** — agents are summoned as needed with no fixed flow. Use when agents are more like a toolbox of specialists the user calls directly. + +For from-scratch companies, propose a workflow pattern based on what they described and ask if it fits. + +For from-repo companies, infer the pattern from the repo's structure. If skills have a clear sequential dependency (like `plan-ceo-review → plan-eng-review → review → ship → qa`), that's a pipeline. If skills are independent capabilities, it's more likely hub-and-spoke or on-demand. State your inference in the interview so the user can confirm or adjust. + +**Key interviewing principles:** + +- Propose a concrete hiring plan. Don't ask open-ended "what agents do you want?" - suggest specific agents based on context and let the user adjust. +- Keep it lean. Most users are new to agent companies. A few agents (3-5) is typical for a startup. Don't suggest 10+ agents unless the scope demands it. +- From-scratch companies should start with a CEO who manages everyone. Teams/departments don't need one. +- Ask 2-3 focused questions per round, not 10. + +### Step 3: Read the Spec + +Before generating any files, read the normative spec: + +``` +docs/companies/companies-spec.md +``` + +Also read the quick reference: [references/companies-spec.md](references/companies-spec.md) + +And the example: [references/example-company.md](references/example-company.md) + +### Step 4: Generate the Package + +Create the directory structure and all files. Follow the spec's conventions exactly. + +**Directory structure:** + +``` +/ +├── COMPANY.md +├── agents/ +│ └── /AGENTS.md +├── teams/ +│ └── /TEAM.md (if teams are needed) +├── projects/ +│ └── /PROJECT.md (if projects are needed) +├── tasks/ +│ └── /TASK.md (if tasks are needed) +├── skills/ +│ └── /SKILL.md (if custom skills are needed) +└── .taskcore.yaml (Taskcore vendor extension) +``` + +**Rules:** + +- Slugs must be URL-safe, lowercase, hyphenated +- COMPANY.md gets `schema: agentcompanies/v1` - other files inherit it +- Agent instructions go in the AGENTS.md body, not in .taskcore.yaml +- Skills referenced by shortname in AGENTS.md resolve to `skills//SKILL.md` +- For external skills, use `sources` with `usage: referenced` (see spec section 12) +- Do not export secrets, machine-local paths, or database IDs +- Omit empty/default fields +- For companies generated from a repo, add a references footer at the bottom of COMPANY.md body: + `Generated from [repo-name](repo-url) with the company-creator skill from [Taskcore](https://github.com/khulnasoft/taskcore)` + +**Reporting structure:** + +- Every agent except the CEO should have `reportsTo` set to their manager's slug +- The CEO has `reportsTo: null` +- For teams without a CEO, the top-level agent has `reportsTo: null` + +**Writing workflow-aware agent instructions:** + +Each AGENTS.md body should include not just what the agent does, but how they fit into the organization's workflow. Include: + +1. **Where work comes from** — "You receive feature ideas from the user" or "You pick up tasks assigned to you by the CTO" +2. **What you produce** — "You produce a technical plan with architecture diagrams" or "You produce a reviewed, approved branch ready for shipping" +3. **Who you hand off to** — "When your plan is locked, hand off to the Staff Engineer for implementation" or "When review passes, hand off to the Release Engineer to ship" +4. **What triggers you** — "You are activated when a new feature idea needs product-level thinking" or "You are activated when a branch is ready for pre-landing review" + +This turns a collection of agents into an organization that actually works together. Without workflow context, agents operate in isolation — they do their job but don't know what happens before or after them. + +### Step 5: Confirm Output Location + +Ask the user where to write the package. Common options: + +- A subdirectory in the current repo +- A new directory the user specifies +- The current directory (if it's empty or they confirm) + +### Step 6: Write README.md and LICENSE + +**README.md** — every company package gets a README. It should be a nice, readable introduction that someone browsing GitHub would appreciate. Include: + +- Company name and what it does +- The workflow / how the company operates +- Org chart as a markdown list or table showing agents, titles, reporting structure, and skills +- Brief description of each agent's role +- Citations and references: link to the source repo (if from-repo), link to the Agent Companies spec (https://agentcompanies.io/specification), and link to Taskcore (https://github.com/khulnasoft/taskcore) +- A "Getting Started" section explaining how to import: `taskcore company import --from ` + +**LICENSE** — include a LICENSE file. The copyright holder is the user creating the company, not the upstream repo author (they made the skills, the user is making the company). Use the same license type as the source repo (if from-repo) or ask the user (if from-scratch). Default to MIT if unclear. + +### Step 7: Write Files and Summarize + +Write all files, then give a brief summary: + +- Company name and what it does +- Agent roster with roles and reporting structure +- Skills (custom + referenced) +- Projects and tasks if any +- The output path + +## .taskcore.yaml Guidelines + +The `.taskcore.yaml` file is the Taskcore vendor extension. It configures adapters and env inputs per agent. + +### Adapter Rules + +**Do not specify an adapter unless the repo or user context warrants it.** If you don't know what adapter the user wants, omit the adapter block entirely — Taskcore will use its default. Specifying an unknown adapter type causes an import error. + +Taskcore's supported adapter types (these are the ONLY valid values): +- `claude_local` — Claude Code CLI +- `codex_local` — Codex CLI +- `opencode_local` — OpenCode CLI +- `pi_local` — Pi CLI +- `cursor` — Cursor +- `gemini_local` — Gemini CLI +- `openclaw_gateway` — OpenClaw gateway + +Only set an adapter when: +- The repo or its skills clearly target a specific runtime (e.g. gstack is built for Claude Code, so `claude_local` is appropriate) +- The user explicitly requests a specific adapter +- The agent's role requires a specific runtime capability + +### Env Inputs Rules + +**Do not add boilerplate env variables.** Only add env inputs that the agent actually needs based on its skills or role: +- `GH_TOKEN` for agents that push code, create PRs, or interact with GitHub +- API keys only when a skill explicitly requires them +- Never set `ANTHROPIC_API_KEY` as a default empty env variable — the runtime handles this + +Example with adapter (only when warranted): +```yaml +schema: taskcore/v1 +agents: + release-engineer: + adapter: + type: claude_local + config: + model: claude-sonnet-4-6 + inputs: + env: + GH_TOKEN: + kind: secret + requirement: optional +``` + +Example — only agents with actual overrides appear: +```yaml +schema: taskcore/v1 +agents: + release-engineer: + inputs: + env: + GH_TOKEN: + kind: secret + requirement: optional +``` + +In this example, only `release-engineer` appears because it needs `GH_TOKEN`. The other agents (ceo, cto, etc.) have no overrides, so they are omitted entirely from `.taskcore.yaml`. + +## External Skill References + +When referencing skills from a GitHub repo, always use the references pattern: + +```yaml +metadata: + sources: + - kind: github-file + repo: owner/repo + path: path/to/SKILL.md + commit: + attribution: Owner or Org Name + license: + usage: referenced +``` + +Get the commit SHA with: + +```bash +git ls-remote https://github.com/owner/repo HEAD +``` + +Do NOT copy external skill content into the package unless the user explicitly asks. diff --git a/.agents/skills/company-creator/references/companies-spec.md b/.agents/skills/company-creator/references/companies-spec.md new file mode 100644 index 0000000..55c11b0 --- /dev/null +++ b/.agents/skills/company-creator/references/companies-spec.md @@ -0,0 +1,144 @@ +# Agent Companies Specification Reference + +The normative specification lives at: + +- Web: https://agentcompanies.io/specification +- Local: docs/companies/companies-spec.md + +Read the local spec file before generating any package files. The spec defines the canonical format and all frontmatter fields. Below is a quick-reference summary for common authoring tasks. + +## Package Kinds + +| File | Kind | Purpose | +| ---------- | ------- | ------------------------------------------------- | +| COMPANY.md | company | Root entrypoint, org boundary and defaults | +| TEAM.md | team | Reusable org subtree | +| AGENTS.md | agent | One role, instructions, and attached skills | +| PROJECT.md | project | Planned work grouping | +| TASK.md | task | Portable starter task | +| SKILL.md | skill | Agent Skills capability package (do not redefine) | + +## Directory Layout + +``` +company-package/ +├── COMPANY.md +├── agents/ +│ └── /AGENTS.md +├── teams/ +│ └── /TEAM.md +├── projects/ +│ └── / +│ ├── PROJECT.md +│ └── tasks/ +│ └── /TASK.md +├── tasks/ +│ └── /TASK.md +├── skills/ +│ └── /SKILL.md +├── assets/ +├── scripts/ +├── references/ +└── .taskcore.yaml (optional vendor extension) +``` + +## Common Frontmatter Fields + +```yaml +schema: agentcompanies/v1 +kind: company | team | agent | project | task +slug: url-safe-stable-identity +name: Human Readable Name +description: Short description for discovery +version: 0.1.0 +license: MIT +authors: + - name: Jane Doe +tags: [] +metadata: {} +sources: [] +``` + +- `schema` usually appears only at package root +- `kind` is optional when filename makes it obvious +- `slug` must be URL-safe and stable +- exporters should omit empty or default-valued fields + +## COMPANY.md Required Fields + +```yaml +name: Company Name +description: What this company does +slug: company-slug +schema: agentcompanies/v1 +``` + +Optional: `version`, `license`, `authors`, `goals`, `includes`, `requirements.secrets` + +## AGENTS.md Key Fields + +```yaml +name: Agent Name +title: Role Title +reportsTo: +skills: + - skill-shortname +``` + +- Body content is the agent's default instructions +- Skills resolve by shortname: `skills//SKILL.md` +- Do not export machine-specific paths or secrets + +## TEAM.md Key Fields + +```yaml +name: Team Name +description: What this team does +slug: team-slug +manager: ../agent-slug/AGENTS.md +includes: + - ../agent-slug/AGENTS.md + - ../../skills/skill-slug/SKILL.md +``` + +## PROJECT.md Key Fields + +```yaml +name: Project Name +description: What this project delivers +owner: agent-slug +``` + +## TASK.md Key Fields + +```yaml +name: Task Name +assignee: agent-slug +project: project-slug +schedule: + timezone: America/Chicago + startsAt: 2026-03-16T09:00:00-05:00 + recurrence: + frequency: weekly + interval: 1 + weekdays: [monday] + time: { hour: 9, minute: 0 } +``` + +## Source References (for external skills/content) + +```yaml +sources: + - kind: github-file + repo: owner/repo + path: path/to/SKILL.md + commit: + sha256: + attribution: Owner Name + license: MIT + usage: referenced +``` + +Usage modes: `vendored` (bytes included), `referenced` (pointer only), `mirrored` (cached locally) + +Default to `referenced` for third-party content. diff --git a/.agents/skills/company-creator/references/example-company.md b/.agents/skills/company-creator/references/example-company.md new file mode 100644 index 0000000..c88f60f --- /dev/null +++ b/.agents/skills/company-creator/references/example-company.md @@ -0,0 +1,184 @@ +# Example Company Package + +A minimal but complete example of an agent company package. + +## Directory Structure + +``` +lean-dev-shop/ +├── COMPANY.md +├── agents/ +│ ├── ceo/AGENTS.md +│ ├── cto/AGENTS.md +│ └── engineer/AGENTS.md +├── teams/ +│ └── engineering/TEAM.md +├── projects/ +│ └── q2-launch/ +│ ├── PROJECT.md +│ └── tasks/ +│ └── monday-review/TASK.md +├── tasks/ +│ └── weekly-standup/TASK.md +├── skills/ +│ └── code-review/SKILL.md +└── .taskcore.yaml +``` + +## COMPANY.md + +```markdown +--- +name: Lean Dev Shop +description: Small engineering-focused AI company that builds and ships software products +slug: lean-dev-shop +schema: agentcompanies/v1 +version: 1.0.0 +license: MIT +authors: + - name: Example Org +goals: + - Build and ship software products + - Maintain high code quality +--- + +Lean Dev Shop is a small, focused engineering company. The CEO oversees strategy and coordinates work. The CTO leads the engineering team. Engineers build and ship code. +``` + +## agents/ceo/AGENTS.md + +```markdown +--- +name: CEO +title: Chief Executive Officer +reportsTo: null +skills: + - taskcore +--- + +You are the CEO of Lean Dev Shop. You oversee company strategy, coordinate work across the team, and ensure projects ship on time. + +Your responsibilities: + +- Review and prioritize work across projects +- Coordinate with the CTO on technical decisions +- Ensure the company goals are being met +``` + +## agents/cto/AGENTS.md + +```markdown +--- +name: CTO +title: Chief Technology Officer +reportsTo: ceo +skills: + - code-review + - taskcore +--- + +You are the CTO of Lean Dev Shop. You lead the engineering team and make technical decisions. + +Your responsibilities: + +- Set technical direction and architecture +- Review code and ensure quality standards +- Mentor engineers and unblock technical challenges +``` + +## agents/engineer/AGENTS.md + +```markdown +--- +name: Engineer +title: Software Engineer +reportsTo: cto +skills: + - code-review + - taskcore +--- + +You are a software engineer at Lean Dev Shop. You write code, fix bugs, and ship features. + +Your responsibilities: + +- Implement features and fix bugs +- Write tests and documentation +- Participate in code reviews +``` + +## teams/engineering/TEAM.md + +```markdown +--- +name: Engineering +description: Product and platform engineering team +slug: engineering +schema: agentcompanies/v1 +manager: ../../agents/cto/AGENTS.md +includes: + - ../../agents/engineer/AGENTS.md + - ../../skills/code-review/SKILL.md +tags: + - engineering +--- + +The engineering team builds and maintains all software products. +``` + +## projects/q2-launch/PROJECT.md + +```markdown +--- +name: Q2 Launch +description: Ship the Q2 product launch +slug: q2-launch +owner: cto +--- + +Deliver all features planned for the Q2 launch, including the new dashboard and API improvements. +``` + +## projects/q2-launch/tasks/monday-review/TASK.md + +```markdown +--- +name: Monday Review +assignee: ceo +project: q2-launch +schedule: + timezone: America/Chicago + startsAt: 2026-03-16T09:00:00-05:00 + recurrence: + frequency: weekly + interval: 1 + weekdays: + - monday + time: + hour: 9 + minute: 0 +--- + +Review the status of Q2 Launch project. Check progress on all open tasks, identify blockers, and update priorities for the week. +``` + +## skills/code-review/SKILL.md (with external reference) + +```markdown +--- +name: code-review +description: Thorough code review skill for pull requests and diffs +metadata: + sources: + - kind: github-file + repo: anthropics/claude-code + path: skills/code-review/SKILL.md + commit: abc123def456 + sha256: 3b7e...9a + attribution: Anthropic + license: MIT + usage: referenced +--- + +Review code changes for correctness, style, and potential issues. +``` diff --git a/.agents/skills/company-creator/references/from-repo-guide.md b/.agents/skills/company-creator/references/from-repo-guide.md new file mode 100644 index 0000000..b945869 --- /dev/null +++ b/.agents/skills/company-creator/references/from-repo-guide.md @@ -0,0 +1,79 @@ +# Creating a Company From an Existing Repository + +When a user provides a git repo (URL, local path, or tweet linking to a repo), analyze it and create a company package that wraps its content. + +## Analysis Steps + +1. **Clone or read the repo** - Use `git clone` for URLs, read directly for local paths +2. **Scan for existing agent/skill files** - Look for SKILL.md, AGENTS.md, CLAUDE.md, .claude/ directories, or similar agent configuration +3. **Understand the repo's purpose** - Read README, package.json, main source files to understand what the project does +4. **Identify natural agent roles** - Based on the repo's structure and purpose, determine what agents would be useful + +## Handling Existing Skills + +Many repos already contain skills (SKILL.md files). When you find them: + +**Default behavior: use references, not copies.** + +Instead of copying skill content into your company package, create a source reference: + +```yaml +metadata: + sources: + - kind: github-file + repo: owner/repo + path: path/to/SKILL.md + commit: + attribution: + license: + usage: referenced +``` + +To get the commit SHA: +```bash +git ls-remote https://github.com/owner/repo HEAD +``` + +Only vendor (copy) skills when: +- The user explicitly asks to copy them +- The skill is very small and tightly coupled to the company +- The source repo is private or may become unavailable + +## Handling Existing Agent Configurations + +If the repo has agent configs (CLAUDE.md, .claude/ directories, codex configs, etc.): +- Use them as inspiration for AGENTS.md instructions +- Don't copy them verbatim - adapt them to the Agent Companies format +- Preserve the intent and key instructions + +## Repo-Only Skills (No Agents) + +When a repo contains only skills and no agents: +- Create agents that would naturally use those skills +- The agents should be minimal - just enough to give the skills a runtime context +- A single agent may use multiple skills from the repo +- Name agents based on the domain the skills cover + +Example: A repo with `code-review`, `testing`, and `deployment` skills might become: +- A "Lead Engineer" agent with all three skills +- Or separate "Reviewer", "QA Engineer", and "DevOps" agents if the skills are distinct enough + +## Common Repo Patterns + +### Developer Tools / CLI repos +- Create agents for the tool's primary use cases +- Reference any existing skills +- Add a project maintainer or lead agent + +### Library / Framework repos +- Create agents for development, testing, documentation +- Skills from the repo become agent capabilities + +### Full Application repos +- Map to departments: engineering, product, QA +- Create a lean team structure appropriate to the project size + +### Skills Collection repos (e.g. skills.sh repos) +- Each skill or skill group gets an agent +- Create a lightweight company or team wrapper +- Keep the agent count proportional to the skill diversity diff --git a/.agents/skills/create-agent-adapter/SKILL.md b/.agents/skills/create-agent-adapter/SKILL.md new file mode 100644 index 0000000..dcd6456 --- /dev/null +++ b/.agents/skills/create-agent-adapter/SKILL.md @@ -0,0 +1,718 @@ +--- +name: create-agent-adapter +description: > + Technical guide for creating a new Paperclip agent adapter. Use when building + a new adapter package, adding support for a new AI coding tool (e.g. a new + CLI agent, API-based agent, or custom process), or when modifying the adapter + system. Covers the required interfaces, module structure, registration points, + and conventions derived from the existing claude-local and codex-local adapters. +--- + +# Creating a Paperclip Agent Adapter + +An adapter bridges Paperclip's orchestration layer to a specific AI agent runtime (Claude Code, Codex CLI, a custom process, an HTTP endpoint, etc.). Each adapter is a self-contained package that provides implementations for **three consumers**: the server, the UI, and the CLI. + +--- + +## 1. Architecture Overview + +``` +packages/adapters// + src/ + index.ts # Shared metadata (type, label, models, agentConfigurationDoc) + server/ + index.ts # Server exports: execute, sessionCodec, parse helpers + execute.ts # Core execution logic (AdapterExecutionContext -> AdapterExecutionResult) + parse.ts # Stdout/result parsing for the agent's output format + ui/ + index.ts # UI exports: parseStdoutLine, buildConfig + parse-stdout.ts # Line-by-line stdout -> TranscriptEntry[] for the run viewer + build-config.ts # CreateConfigValues -> adapterConfig JSON for agent creation form + cli/ + index.ts # CLI exports: formatStdoutEvent + format-event.ts # Colored terminal output for `paperclipai run --watch` + package.json + tsconfig.json +``` + +Three separate registries consume adapter modules: + +| Registry | Location | Interface | +|----------|----------|-----------| +| Server | `server/src/adapters/registry.ts` | `ServerAdapterModule` | +| UI | `ui/src/adapters/registry.ts` | `UIAdapterModule` | +| CLI | `cli/src/adapters/registry.ts` | `CLIAdapterModule` | + +--- + +## 2. Shared Types (`@paperclipai/adapter-utils`) + +All adapter interfaces live in `packages/adapter-utils/src/types.ts`. Import from `@paperclipai/adapter-utils` (types) or `@paperclipai/adapter-utils/server-utils` (runtime helpers). + +### Core Interfaces + +```ts +// The execute function signature — every adapter must implement this +interface AdapterExecutionContext { + runId: string; + agent: AdapterAgent; // { id, companyId, name, adapterType, adapterConfig } + runtime: AdapterRuntime; // { sessionId, sessionParams, sessionDisplayId, taskKey } + config: Record; // The agent's adapterConfig blob + context: Record; // Runtime context (taskId, wakeReason, approvalId, etc.) + onLog: (stream: "stdout" | "stderr", chunk: string) => Promise; + onMeta?: (meta: AdapterInvocationMeta) => Promise; + authToken?: string; +} + +interface AdapterExecutionResult { + exitCode: number | null; + signal: string | null; + timedOut: boolean; + errorMessage?: string | null; + usage?: UsageSummary; // { inputTokens, outputTokens, cachedInputTokens? } + sessionId?: string | null; // Legacy — prefer sessionParams + sessionParams?: Record | null; // Opaque session state persisted between runs + sessionDisplayId?: string | null; + provider?: string | null; // "anthropic", "openai", etc. + model?: string | null; + costUsd?: number | null; + resultJson?: Record | null; + summary?: string | null; // Human-readable summary of what the agent did + clearSession?: boolean; // true = tell Paperclip to forget the stored session +} + +interface AdapterSessionCodec { + deserialize(raw: unknown): Record | null; + serialize(params: Record | null): Record | null; + getDisplayId?(params: Record | null): string | null; +} +``` + +### Module Interfaces + +```ts +// Server — registered in server/src/adapters/registry.ts +interface ServerAdapterModule { + type: string; + execute(ctx: AdapterExecutionContext): Promise; + testEnvironment(ctx: AdapterEnvironmentTestContext): Promise; + sessionCodec?: AdapterSessionCodec; + supportsLocalAgentJwt?: boolean; + models?: { id: string; label: string }[]; + agentConfigurationDoc?: string; +} + +// UI — registered in ui/src/adapters/registry.ts +interface UIAdapterModule { + type: string; + label: string; + parseStdoutLine: (line: string, ts: string) => TranscriptEntry[]; + ConfigFields: ComponentType; + buildAdapterConfig: (values: CreateConfigValues) => Record; +} + +// CLI — registered in cli/src/adapters/registry.ts +interface CLIAdapterModule { + type: string; + formatStdoutEvent: (line: string, debug: boolean) => void; +} +``` + +--- + +## 2.1 Adapter Environment Test Contract + +Every server adapter must implement `testEnvironment(...)`. This powers the board UI "Test environment" button in agent configuration. + +```ts +type AdapterEnvironmentCheckLevel = "info" | "warn" | "error"; +type AdapterEnvironmentTestStatus = "pass" | "warn" | "fail"; + +interface AdapterEnvironmentCheck { + code: string; + level: AdapterEnvironmentCheckLevel; + message: string; + detail?: string | null; + hint?: string | null; +} + +interface AdapterEnvironmentTestResult { + adapterType: string; + status: AdapterEnvironmentTestStatus; + checks: AdapterEnvironmentCheck[]; + testedAt: string; // ISO timestamp +} + +interface AdapterEnvironmentTestContext { + companyId: string; + adapterType: string; + config: Record; // runtime-resolved adapterConfig +} +``` + +Guidelines: + +- Return structured diagnostics, never throw for expected findings. +- Use `error` for invalid/unusable runtime setup (bad cwd, missing command, invalid URL). +- Use `warn` for non-blocking but important situations. +- Use `info` for successful checks and context. + +Severity policy is product-critical: warnings are not save blockers. +Example: for `claude_local`, detected `ANTHROPIC_API_KEY` must be a `warn`, not an `error`, because Claude can still run (it just uses API-key auth instead of subscription auth). + +--- + +## 3. Step-by-Step: Creating a New Adapter + +### 3.1 Create the Package + +``` +packages/adapters// + package.json + tsconfig.json + src/ + index.ts + server/index.ts + server/execute.ts + server/parse.ts + ui/index.ts + ui/parse-stdout.ts + ui/build-config.ts + cli/index.ts + cli/format-event.ts +``` + +**package.json** — must use the four-export convention: + +```json +{ + "name": "@paperclipai/adapter-", + "version": "0.0.1", + "private": true, + "type": "module", + "exports": { + ".": "./src/index.ts", + "./server": "./src/server/index.ts", + "./ui": "./src/ui/index.ts", + "./cli": "./src/cli/index.ts" + }, + "dependencies": { + "@paperclipai/adapter-utils": "workspace:*", + "picocolors": "^1.1.1" + }, + "devDependencies": { + "typescript": "^5.7.3" + } +} +``` + +### 3.2 Root `index.ts` — Adapter Metadata + +This file is imported by **all three** consumers (server, UI, CLI). Keep it dependency-free (no Node APIs, no React). + +```ts +export const type = "my_agent"; // snake_case, globally unique +export const label = "My Agent (local)"; + +export const models = [ + { id: "model-a", label: "Model A" }, + { id: "model-b", label: "Model B" }, +]; + +export const agentConfigurationDoc = `# my_agent agent configuration +...document all config fields here... +`; +``` + +**Required exports:** +- `type` — the adapter type key, stored in `agents.adapter_type` +- `label` — human-readable name for the UI +- `models` — available model options for the agent creation form +- `agentConfigurationDoc` — markdown describing all `adapterConfig` fields (used by LLM agents configuring other agents) + +**Writing `agentConfigurationDoc` as routing logic:** + +The `agentConfigurationDoc` is read by LLM agents (including Paperclip agents that create other agents). Write it as **routing logic**, not marketing copy. Include concrete "use when" and "don't use when" guidance so an LLM can decide whether this adapter is appropriate for a given task. + +```ts +export const agentConfigurationDoc = `# my_agent agent configuration + +Adapter: my_agent + +Use when: +- The agent needs to run MyAgent CLI locally on the host machine +- You need session persistence across runs (MyAgent supports thread resumption) +- The task requires MyAgent-specific tools (e.g. web search, code execution) + +Don't use when: +- You need a simple one-shot script execution (use the "process" adapter instead) +- The agent doesn't need conversational context between runs (process adapter is simpler) +- MyAgent CLI is not installed on the host + +Core fields: +- cwd (string, required): absolute working directory for the agent process +... +`; +``` + +Adding explicit negative cases improves adapter selection accuracy. One concrete anti-pattern is worth more than three paragraphs of description. + +### 3.3 Server Module + +#### `server/execute.ts` — The Core + +This is the most important file. It receives an `AdapterExecutionContext` and must return an `AdapterExecutionResult`. + +**Required behavior:** + +1. **Read config** — extract typed values from `ctx.config` using helpers (`asString`, `asNumber`, `asBoolean`, `asStringArray`, `parseObject` from `@paperclipai/adapter-utils/server-utils`) +2. **Build environment** — call `buildPaperclipEnv(agent)` then layer in `PAPERCLIP_RUN_ID`, context vars (`PAPERCLIP_TASK_ID`, `PAPERCLIP_WAKE_REASON`, `PAPERCLIP_WAKE_COMMENT_ID`, `PAPERCLIP_APPROVAL_ID`, `PAPERCLIP_APPROVAL_STATUS`, `PAPERCLIP_LINKED_ISSUE_IDS`), user env overrides, and auth token +3. **Resolve session** — check `runtime.sessionParams` / `runtime.sessionId` for an existing session; validate it's compatible (e.g. same cwd); decide whether to resume or start fresh +4. **Render prompt** — use `renderTemplate(template, data)` with the template variables: `agentId`, `companyId`, `runId`, `company`, `agent`, `run`, `context` +5. **Call onMeta** — emit adapter invocation metadata before spawning the process +6. **Spawn the process** — use `runChildProcess()` for CLI-based agents or `fetch()` for HTTP-based agents +7. **Parse output** — convert the agent's stdout into structured data (session id, usage, summary, errors) +8. **Handle session errors** — if resume fails with "unknown session", retry with a fresh session and set `clearSession: true` +9. **Return AdapterExecutionResult** — populate all fields the agent runtime supports + +**Environment variables the server always injects:** + +| Variable | Source | +|----------|--------| +| `PAPERCLIP_AGENT_ID` | `agent.id` | +| `PAPERCLIP_COMPANY_ID` | `agent.companyId` | +| `PAPERCLIP_API_URL` | Server's own URL | +| `PAPERCLIP_RUN_ID` | Current run id | +| `PAPERCLIP_TASK_ID` | `context.taskId` or `context.issueId` | +| `PAPERCLIP_WAKE_REASON` | `context.wakeReason` | +| `PAPERCLIP_WAKE_COMMENT_ID` | `context.wakeCommentId` or `context.commentId` | +| `PAPERCLIP_APPROVAL_ID` | `context.approvalId` | +| `PAPERCLIP_APPROVAL_STATUS` | `context.approvalStatus` | +| `PAPERCLIP_LINKED_ISSUE_IDS` | `context.issueIds` (comma-separated) | +| `PAPERCLIP_API_KEY` | `authToken` (if no explicit key in config) | + +#### `server/parse.ts` — Output Parser + +Parse the agent's stdout format into structured data. Must handle: + +- **Session identification** — extract session/thread ID from init events +- **Usage tracking** — extract token counts (input, output, cached) +- **Cost tracking** — extract cost if available +- **Summary extraction** — pull the agent's final text response +- **Error detection** — identify error states, extract error messages +- **Unknown session detection** — export an `isUnknownSessionError()` function for retry logic + +**Treat agent output as untrusted.** The stdout you're parsing comes from an LLM-driven process that may have executed arbitrary tool calls, fetched external content, or been influenced by prompt injection in the files it read. Parse defensively: +- Never `eval()` or dynamically execute anything from output +- Use safe extraction helpers (`asString`, `asNumber`, `parseJson`) — they return fallbacks on unexpected types +- Validate session IDs and other structured data before passing them through +- If output contains URLs, file paths, or commands, do not act on them in the adapter — just record them + +#### `server/index.ts` — Server Exports + +```ts +export { execute } from "./execute.js"; +export { testEnvironment } from "./test.js"; +export { parseMyAgentOutput, isMyAgentUnknownSessionError } from "./parse.js"; + +// Session codec — required for session persistence +export const sessionCodec: AdapterSessionCodec = { + deserialize(raw) { /* raw DB JSON -> typed params or null */ }, + serialize(params) { /* typed params -> JSON for DB storage */ }, + getDisplayId(params) { /* -> human-readable session id string */ }, +}; +``` + +#### `server/test.ts` — Environment Diagnostics + +Implement adapter-specific preflight checks used by the UI test button. + +Minimum expectations: + +1. Validate required config primitives (paths, commands, URLs, auth assumptions) +2. Return check objects with deterministic `code` values +3. Map severity consistently (`info` / `warn` / `error`) +4. Compute final status: + - `fail` if any `error` + - `warn` if no errors and at least one warning + - `pass` otherwise + +This operation should be lightweight and side-effect free. + +### 3.4 UI Module + +#### `ui/parse-stdout.ts` — Transcript Parser + +Converts individual stdout lines into `TranscriptEntry[]` for the run detail viewer. Must handle the agent's streaming output format and produce entries of these kinds: + +- `init` — model/session initialization +- `assistant` — agent text responses +- `thinking` — agent thinking/reasoning (if supported) +- `tool_call` — tool invocations with name and input +- `tool_result` — tool results with content and error flag +- `user` — user messages in the conversation +- `result` — final result with usage stats +- `stdout` — fallback for unparseable lines + +```ts +export function parseMyAgentStdoutLine(line: string, ts: string): TranscriptEntry[] { + // Parse JSON line, map to appropriate TranscriptEntry kind(s) + // Return [{ kind: "stdout", ts, text: line }] as fallback +} +``` + +#### `ui/build-config.ts` — Config Builder + +Converts the UI form's `CreateConfigValues` into the `adapterConfig` JSON blob stored on the agent. + +```ts +export function buildMyAgentConfig(v: CreateConfigValues): Record { + const ac: Record = {}; + if (v.cwd) ac.cwd = v.cwd; + if (v.promptTemplate) ac.promptTemplate = v.promptTemplate; + if (v.model) ac.model = v.model; + ac.timeoutSec = 0; + ac.graceSec = 15; + // ... adapter-specific fields + return ac; +} +``` + +#### UI Config Fields Component + +Create `ui/src/adapters//config-fields.tsx` with a React component implementing `AdapterConfigFieldsProps`. This renders adapter-specific form fields in the agent creation/edit form. + +Use the shared primitives from `ui/src/components/agent-config-primitives`: +- `Field` — labeled form field wrapper +- `ToggleField` — boolean toggle with label and hint +- `DraftInput` — text input with draft/commit behavior +- `DraftNumberInput` — number input with draft/commit behavior +- `help` — standard hint text for common fields + +The component must support both `create` mode (using `values`/`set`) and `edit` mode (using `config`/`eff`/`mark`). + +### 3.5 CLI Module + +#### `cli/format-event.ts` — Terminal Formatter + +Pretty-prints stdout lines for `paperclipai run --watch`. Use `picocolors` for coloring. + +```ts +import pc from "picocolors"; + +export function printMyAgentStreamEvent(raw: string, debug: boolean): void { + // Parse JSON line from agent stdout + // Print colored output: blue for system, green for assistant, yellow for tools + // In debug mode, print unrecognized lines in gray +} +``` + +--- + +## 4. Registration Checklist + +After creating the adapter package, register it in all three consumers: + +### 4.1 Server Registry (`server/src/adapters/registry.ts`) + +```ts +import { execute as myExecute, sessionCodec as mySessionCodec } from "@paperclipai/adapter-my-agent/server"; +import { agentConfigurationDoc as myDoc, models as myModels } from "@paperclipai/adapter-my-agent"; + +const myAgentAdapter: ServerAdapterModule = { + type: "my_agent", + execute: myExecute, + sessionCodec: mySessionCodec, + models: myModels, + supportsLocalAgentJwt: true, // true if agent can use Paperclip API + agentConfigurationDoc: myDoc, +}; + +// Add to the adaptersByType map +const adaptersByType = new Map( + [..., myAgentAdapter].map((a) => [a.type, a]), +); +``` + +### 4.2 UI Registry (`ui/src/adapters/registry.ts`) + +```ts +import { myAgentUIAdapter } from "./my-agent"; + +const adaptersByType = new Map( + [..., myAgentUIAdapter].map((a) => [a.type, a]), +); +``` + +With `ui/src/adapters/my-agent/index.ts`: + +```ts +import type { UIAdapterModule } from "../types"; +import { parseMyAgentStdoutLine } from "@paperclipai/adapter-my-agent/ui"; +import { MyAgentConfigFields } from "./config-fields"; +import { buildMyAgentConfig } from "@paperclipai/adapter-my-agent/ui"; + +export const myAgentUIAdapter: UIAdapterModule = { + type: "my_agent", + label: "My Agent", + parseStdoutLine: parseMyAgentStdoutLine, + ConfigFields: MyAgentConfigFields, + buildAdapterConfig: buildMyAgentConfig, +}; +``` + +### 4.3 CLI Registry (`cli/src/adapters/registry.ts`) + +```ts +import { printMyAgentStreamEvent } from "@paperclipai/adapter-my-agent/cli"; + +const myAgentCLIAdapter: CLIAdapterModule = { + type: "my_agent", + formatStdoutEvent: printMyAgentStreamEvent, +}; + +// Add to the adaptersByType map +``` + +--- + +## 5. Session Management — Designing for Long Runs + +Sessions allow agents to maintain conversation context across runs. The system is **codec-based** — each adapter defines how to serialize/deserialize its session state. + +**Design for long runs from the start.** Treat session reuse as the default primitive, not an optimization to add later. An agent working on an issue may be woken dozens of times — for the initial assignment, approval callbacks, re-assignments, manual nudges. Each wake should resume the existing conversation so the agent retains full context about what it has already done, what files it has read, and what decisions it has made. Starting fresh each time wastes tokens on re-reading the same files and risks contradictory decisions. + +**Key concepts:** +- `sessionParams` is an opaque `Record` stored in the DB per task +- The adapter's `sessionCodec.serialize()` converts execution result data to storable params +- `sessionCodec.deserialize()` converts stored params back for the next run +- `sessionCodec.getDisplayId()` extracts a human-readable session ID for the UI +- **cwd-aware resume**: if the session was created in a different cwd than the current config, skip resuming (prevents cross-project session contamination) +- **Unknown session retry**: if resume fails with a "session not found" error, retry with a fresh session and return `clearSession: true` so Paperclip wipes the stale session + +If the agent runtime supports any form of context compaction or conversation compression (e.g. Claude Code's automatic context management, or Codex's `previous_response_id` chaining), lean on it. Adapters that support session resume get compaction for free — the agent runtime handles context window management internally across resumes. + +**Pattern** (from both claude-local and codex-local): + +```ts +const canResumeSession = + runtimeSessionId.length > 0 && + (runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(cwd)); +const sessionId = canResumeSession ? runtimeSessionId : null; + +// ... run attempt ... + +// If resume failed with unknown session, retry fresh +if (sessionId && !proc.timedOut && exitCode !== 0 && isUnknownSessionError(output)) { + const retry = await runAttempt(null); + return toResult(retry, { clearSessionOnMissingSession: true }); +} +``` + +--- + +## 6. Server-Utils Helpers + +Import from `@paperclipai/adapter-utils/server-utils`: + +| Helper | Purpose | +|--------|---------| +| `asString(val, fallback)` | Safe string extraction | +| `asNumber(val, fallback)` | Safe number extraction | +| `asBoolean(val, fallback)` | Safe boolean extraction | +| `asStringArray(val)` | Safe string array extraction | +| `parseObject(val)` | Safe `Record` extraction | +| `parseJson(str)` | Safe JSON.parse returning `Record` or null | +| `renderTemplate(tmpl, data)` | `{{path.to.value}}` template rendering | +| `buildPaperclipEnv(agent)` | Standard `PAPERCLIP_*` env vars | +| `redactEnvForLogs(env)` | Redact sensitive keys for onMeta | +| `ensureAbsoluteDirectory(cwd)` | Validate cwd exists and is absolute | +| `ensureCommandResolvable(cmd, cwd, env)` | Validate command is in PATH | +| `ensurePathInEnv(env)` | Ensure PATH exists in env | +| `runChildProcess(runId, cmd, args, opts)` | Spawn with timeout, logging, capture | + +--- + +## 7. Conventions and Patterns + +### Naming +- Adapter type: `snake_case` (e.g. `claude_local`, `codex_local`) +- Package name: `@paperclipai/adapter-` +- Package directory: `packages/adapters//` + +### Config Parsing +- Never trust `config` values directly — always use `asString`, `asNumber`, etc. +- Provide sensible defaults for every optional field +- Document all fields in `agentConfigurationDoc` + +### Prompt Templates +- Support `promptTemplate` for every run +- Use `renderTemplate()` with the standard variable set +- Default prompt: `"You are agent {{agent.id}} ({{agent.name}}). Continue your Paperclip work."` + +### Error Handling +- Differentiate timeout vs process error vs parse failure +- Always populate `errorMessage` on failure +- Include raw stdout/stderr in `resultJson` when parsing fails +- Handle the agent CLI not being installed (command not found) + +### Logging +- Call `onLog("stdout", ...)` and `onLog("stderr", ...)` for all process output — this feeds the real-time run viewer +- Call `onMeta(...)` before spawning to record invocation details +- Use `redactEnvForLogs()` when including env in meta + +### Paperclip Skills Injection + +Paperclip ships shared skills (in the repo's top-level `skills/` directory) that agents need at runtime — things like the `paperclip` API skill and the `paperclip-create-agent` workflow skill. Each adapter is responsible for making these skills discoverable by its agent runtime **without polluting the agent's working directory**. + +**The constraint:** never copy or symlink skills into the agent's `cwd`. The cwd is the user's project checkout — writing `.claude/skills/` or any other files into it would contaminate the repo with Paperclip internals, break git status, and potentially leak into commits. + +**The pattern:** create a clean, isolated location for skills and tell the agent runtime to look there. + +**How claude-local does it:** + +1. At execution time, create a fresh tmpdir: `mkdtemp("paperclip-skills-")` +2. Inside it, create `.claude/skills/` (the directory structure Claude Code expects) +3. Symlink each skill directory from the repo's `skills/` into the tmpdir's `.claude/skills/` +4. Pass the tmpdir to Claude Code via `--add-dir ` — this makes Claude Code discover the skills as if they were registered in that directory, without touching the agent's actual cwd +5. Clean up the tmpdir in a `finally` block after the run completes + +```ts +// From claude-local execute.ts +async function buildSkillsDir(): Promise { + const tmp = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-skills-")); + const target = path.join(tmp, ".claude", "skills"); + await fs.mkdir(target, { recursive: true }); + const entries = await fs.readdir(PAPERCLIP_SKILLS_DIR, { withFileTypes: true }); + for (const entry of entries) { + if (entry.isDirectory()) { + await fs.symlink( + path.join(PAPERCLIP_SKILLS_DIR, entry.name), + path.join(target, entry.name), + ); + } + } + return tmp; +} + +// In execute(): pass --add-dir to Claude Code +const skillsDir = await buildSkillsDir(); +args.push("--add-dir", skillsDir); +// ... run process ... +// In finally: fs.rm(skillsDir, { recursive: true, force: true }) +``` + +**How codex-local does it:** + +Codex has a global personal skills directory (`$CODEX_HOME/skills` or `~/.codex/skills`). The adapter symlinks Paperclip skills there if they don't already exist. This is acceptable because it's the agent tool's own config directory, not the user's project. + +```ts +// From codex-local execute.ts +async function ensureCodexSkillsInjected(onLog) { + const skillsHome = path.join(codexHomeDir(), "skills"); + await fs.mkdir(skillsHome, { recursive: true }); + for (const entry of entries) { + const target = path.join(skillsHome, entry.name); + const existing = await fs.lstat(target).catch(() => null); + if (existing) continue; // Don't overwrite user's own skills + await fs.symlink(source, target); + } +} +``` + +**For a new adapter:** figure out how your agent runtime discovers skills/plugins, then choose the cleanest injection path: + +1. **Best: tmpdir + flag** (like claude-local) — if the runtime supports an "additional directory" flag, create a tmpdir, symlink skills in, pass the flag, clean up after. Zero side effects. +2. **Acceptable: global config dir** (like codex-local) — if the runtime has a global skills/plugins directory separate from the project, symlink there. Skip existing entries to avoid overwriting user customizations. +3. **Acceptable: env var** — if the runtime reads a skills/plugin path from an environment variable, point it at the repo's `skills/` directory directly. +4. **Last resort: prompt injection** — if the runtime has no plugin system, include skill content in the prompt template itself. This uses tokens but avoids filesystem side effects entirely. + +**Skills as loaded procedures, not prompt bloat.** The Paperclip skills (like `paperclip` and `paperclip-create-agent`) are designed as on-demand procedures: the agent sees skill metadata (name + description) in its context, but only loads the full SKILL.md content when it decides to invoke a skill. This keeps the base prompt small. When writing `agentConfigurationDoc` or prompt templates for your adapter, do not inline skill content — let the agent runtime's skill discovery do the work. The descriptions in each SKILL.md frontmatter act as routing logic: they tell the agent when to load the full skill, not what the skill contains. + +**Explicit vs. fuzzy skill invocation.** For production workflows where reliability matters (e.g. an agent that must always call the Paperclip API to report status), use explicit instructions in the prompt template: "Use the paperclip skill to report your progress." Fuzzy routing (letting the model decide based on description matching) is fine for exploratory tasks but unreliable for mandatory procedures. + +--- + +## 8. Security Considerations + +Adapters sit at the boundary between Paperclip's orchestration layer and arbitrary agent execution. This is a high-risk surface. + +### Treat Agent Output as Untrusted + +The agent process runs LLM-driven code that reads external files, fetches URLs, and executes tools. Its output may be influenced by prompt injection from the content it processes. The adapter's parse layer is a trust boundary — validate everything, execute nothing. + +### Secret Injection via Environment, Not Prompts + +Never put secrets (API keys, tokens) into prompt templates or config fields that flow through the LLM. Instead, inject them as environment variables that the agent's tools can read directly: + +- `PAPERCLIP_API_KEY` is injected by the server into the process environment, not the prompt +- User-provided secrets in `config.env` are passed as env vars, redacted in `onMeta` logs +- The `redactEnvForLogs()` helper automatically masks any key matching `/(key|token|secret|password|authorization|cookie)/i` + +This follows the "sidecar injection" pattern: the model never sees the real secret value, but the tools it invokes can read it from the environment. + +### Network Access + +If your agent runtime supports network access controls (sandboxing, allowlists), configure them in the adapter: + +- Prefer minimal allowlists over open internet access. An agent that only needs to call the Paperclip API and GitHub should not have access to arbitrary hosts. +- Skills + network = amplified risk. A skill that teaches the agent to make HTTP requests combined with unrestricted network access creates an exfiltration path. Constrain one or the other. +- If the runtime supports layered policies (org-level defaults + per-request overrides), wire the org-level policy into the adapter config and let per-agent config narrow further. + +### Process Isolation + +- CLI-based adapters inherit the server's user permissions. The `cwd` and `env` config determine what the agent process can access on the filesystem. +- `dangerouslySkipPermissions` / `dangerouslyBypassApprovalsAndSandbox` flags exist for development convenience but must be documented as dangerous in `agentConfigurationDoc`. Production deployments should not use them. +- Timeout and grace period (`timeoutSec`, `graceSec`) are safety rails — always enforce them. A runaway agent process without a timeout can consume unbounded resources. + +--- + +## 9. TranscriptEntry Kinds Reference + +The UI run viewer displays these entry kinds: + +| Kind | Fields | Usage | +|------|--------|-------| +| `init` | `model`, `sessionId` | Agent initialization | +| `assistant` | `text` | Agent text response | +| `thinking` | `text` | Agent reasoning/thinking | +| `user` | `text` | User message | +| `tool_call` | `name`, `input` | Tool invocation | +| `tool_result` | `toolUseId`, `content`, `isError` | Tool result | +| `result` | `text`, `inputTokens`, `outputTokens`, `cachedTokens`, `costUsd`, `subtype`, `isError`, `errors` | Final result with usage | +| `stderr` | `text` | Stderr output | +| `system` | `text` | System messages | +| `stdout` | `text` | Raw stdout fallback | + +--- + +## 10. Testing + +Create tests in `server/src/__tests__/-adapter.test.ts`. Test: + +1. **Output parsing** — feed sample stdout through your parser, verify structured output +2. **Unknown session detection** — verify the `isUnknownSessionError` function +3. **Config building** — verify `buildConfig` produces correct adapterConfig from form values +4. **Session codec** — verify serialize/deserialize round-trips + +--- + +## 11. Minimal Adapter Checklist + +- [ ] `packages/adapters//package.json` with four exports (`.`, `./server`, `./ui`, `./cli`) +- [ ] Root `index.ts` with `type`, `label`, `models`, `agentConfigurationDoc` +- [ ] `server/execute.ts` implementing `AdapterExecutionContext -> AdapterExecutionResult` +- [ ] `server/test.ts` implementing `AdapterEnvironmentTestContext -> AdapterEnvironmentTestResult` +- [ ] `server/parse.ts` with output parser and unknown-session detector +- [ ] `server/index.ts` exporting `execute`, `testEnvironment`, `sessionCodec`, parse helpers +- [ ] `ui/parse-stdout.ts` with `StdoutLineParser` for the run viewer +- [ ] `ui/build-config.ts` with `CreateConfigValues -> adapterConfig` builder +- [ ] `ui/src/adapters//config-fields.tsx` React component for agent form +- [ ] `ui/src/adapters//index.ts` assembling the `UIAdapterModule` +- [ ] `cli/format-event.ts` with terminal formatter +- [ ] `cli/index.ts` exporting the formatter +- [ ] Registered in `server/src/adapters/registry.ts` +- [ ] Registered in `ui/src/adapters/registry.ts` +- [ ] Registered in `cli/src/adapters/registry.ts` +- [ ] Added to workspace in root `pnpm-workspace.yaml` (if not already covered by glob) +- [ ] Tests for parsing, session codec, and config building diff --git a/.agents/skills/deal-with-security-advisory/SKILL.md b/.agents/skills/deal-with-security-advisory/SKILL.md new file mode 100644 index 0000000..0ba1223 --- /dev/null +++ b/.agents/skills/deal-with-security-advisory/SKILL.md @@ -0,0 +1,230 @@ +--- +name: deal-with-security-advisory +description: > + Handle a GitHub Security Advisory response for Paperclip, including + confidential fix development in a temporary private fork, human coordination + on advisory-thread comments, CVE request, synchronized advisory publication, + and immediate security release steps. +--- + +# Security Vulnerability Response Instructions + +## ⚠️ CRITICAL: This is a security vulnerability. Everything about this process is confidential until the advisory is published. Do not mention the vulnerability details in any public commit message, PR title, branch name, or comment. Do not push anything to a public branch. Do not discuss specifics in any public channel. Assume anything on the public repo is visible to attackers who will exploit the window between disclosure and user upgrades. + +*** + +## Context + +A security vulnerability has been reported via GitHub Security Advisory: + +* **Advisory:** {{ghsaId}} (e.g. GHSA-x8hx-rhr2-9rf7) +* **Reporter:** {{reporterHandle}} +* **Severity:** {{severity}} +* **Notes:** {{notes}} + +*** + +## Step 0: Fetch the Advisory Details + +Pull the full advisory so you understand the vulnerability before doing anything else: + +``` +gh api repos/paperclipai/paperclip/security-advisories/{{ghsaId}} + +``` + +Read the `description`, `severity`, `cvss`, and `vulnerabilities` fields. Understand the attack vector before writing code. + +## Step 1: Acknowledge the Report + +⚠️ **This step requires a human.** The advisory thread does not have a comment API. Ask the human operator to post a comment on the private advisory thread acknowledging the report. Provide them this template: + +> Thanks for the report, @{{reporterHandle}}. We've confirmed the issue and are working on a fix. We're targeting a patch release within {{timeframe}}. We'll keep you updated here. + +Give your human this template, but still continue + +Below we use `gh` tools - you do have access and credentials outside of your sandbox, so use them. + +## Step 2: Create the Temporary Private Fork + +This is where all fix development happens. Never push to the public repo. + +``` +gh api --method POST \ + repos/paperclipai/paperclip/security-advisories/{{ghsaId}}/forks + +``` + +This returns a repository object for the private fork. Save the `full_name` and `clone_url`. + +Clone it and set up your workspace: + +``` +# Clone the private fork somewhere outside ~/paperclip +git clone ~/security-patch-{{ghsaId}} +cd ~/security-patch-{{ghsaId}} +git checkout -b security-fix + +``` + +**Do not edit `~/paperclip`** — the dev server is running off the `~/paperclip` master branch and we don't want to touch it. All work happens in the private fork clone. + +**TIPS:** + +* Do not commit `pnpm-lock.yaml` — the repo has actions to manage this +* Do not use descriptive branch names that leak the vulnerability (e.g., no `fix-dns-rebinding-rce`). Use something generic like `security-fix` +* All work stays in the private fork until publication +* CI/GitHub Actions will NOT run on the temporary private fork — this is a GitHub limitation by design. You must run tests locally + +## Step 3: Develop and Validate the Fix + +Write the patch. Same content standards as any PR: + +* It must functionally work — **run tests locally** since CI won't run on the private fork +* Consider the whole codebase, not just the narrow vulnerability path. A patch that fixes one vector but opens another is worse than no patch +* Ensure backwards compatibility for the database, or be explicit about what breaks +* Make sure any UI components still look correct if the fix touches them +* The fix should be minimal and focused — don't bundle unrelated changes into a security patch. Reviewers (and the reporter) should be able to read the diff and understand exactly what changed and why + +**Specific to security fixes:** + +* Verify the fix actually closes the attack vector described in the advisory. Reproduce the vulnerability first (using the reporter's description), then confirm the patch prevents it +* Consider adjacent attack vectors — if DNS rebinding is the issue, are there other endpoints or modes with the same class of problem? +* Do not introduce new dependencies unless absolutely necessary — new deps in a security patch raise eyebrows + +Push your fix to the private fork: + +``` +git add -A +git commit -m "Fix security vulnerability" +git push origin security-fix + +``` + +## Step 4: Coordinate with the Reporter + +⚠️ **This step requires a human.** Ask the human operator to post on the advisory thread letting the reporter know the fix is ready and giving them a chance to review. Provide them this template: + +> @{{reporterHandle}} — fix is ready in the private fork if you'd like to review before we publish. Planning to release within {{timeframe}}. + +Proceed + +## Step 5: Request a CVE + +This makes vulnerability scanners (npm audit, Snyk, Dependabot) warn users to upgrade. Without it, nobody gets automated notification. + +``` +gh api --method POST \ + repos/paperclipai/paperclip/security-advisories/{{ghsaId}}/cve + +``` + +GitHub is a CVE Numbering Authority and will assign one automatically. The CVE may take a few hours to propagate after the advisory is published. + +## Step 6: Publish Everything Simultaneously + +This all happens at once — do not stagger these steps. The goal is **zero window** between the vulnerability becoming public knowledge and the fix being available. + +### 6a. Verify reporter credit before publishing + +``` +gh api repos/paperclipai/paperclip/security-advisories/{{ghsaId}} --jq '.credits' + +``` + +If the reporter is not credited, add them: + +``` +gh api --method PATCH \ + repos/paperclipai/paperclip/security-advisories/{{ghsaId}} \ + --input - << 'EOF' +{ + "credits": [ + { + "login": "{{reporterHandle}}", + "type": "reporter" + } + ] +} +EOF + +``` + +### 6b. Update the advisory with the patched version and publish + +``` +gh api --method PATCH \ + repos/paperclipai/paperclip/security-advisories/{{ghsaId}} \ + --input - << 'EOF' +{ + "state": "published", + "vulnerabilities": [ + { + "package": { + "ecosystem": "npm", + "name": "paperclip" + }, + "vulnerable_version_range": "< {{patchedVersion}}", + "patched_versions": "{{patchedVersion}}" + } + ] +} +EOF + +``` + +Publishing the advisory simultaneously: + +* Makes the GHSA public +* Merges the temporary private fork into your repo +* Triggers the CVE assignment (if requested in step 5) + +### 6c. Cut a release immediately after merge + +``` +cd ~/paperclip +git pull origin master + +gh release create v{{patchedVersion}} \ + --repo paperclipai/paperclip \ + --title "v{{patchedVersion}} — Security Release" \ + --notes "## Security Release + +This release fixes a critical security vulnerability. + +### What was fixed +{{briefDescription}} (e.g., Remote code execution via DNS rebinding in \`local_trusted\` mode) + +### Advisory +https://github.com/paperclipai/paperclip/security/advisories/{{ghsaId}} + +### Credit +Thanks to @{{reporterHandle}} for responsibly disclosing this vulnerability. + +### Action required +All users running versions prior to {{patchedVersion}} should upgrade immediately." + +``` + +## Step 7: Post-Publication Verification + +``` +# Verify the advisory is published and CVE is assigned +gh api repos/paperclipai/paperclip/security-advisories/{{ghsaId}} \ + --jq '{state: .state, cve_id: .cve_id, published_at: .published_at}' + +# Verify the release exists +gh release view v{{patchedVersion}} --repo paperclipai/paperclip + +``` + +If the CVE hasn't been assigned yet, that's normal — it can take a few hours. + +⚠️ **Human step:** Ask the human operator to post a final comment on the advisory thread confirming publication and thanking the reporter. + +Tell the human operator what you did by posting a comment to this task, including: + +* The published advisory URL: `https://github.com/paperclipai/paperclip/security/advisories/{{ghsaId}}` +* The release URL +* Whether the CVE has been assigned yet +* All URLs to any pull requests or branches diff --git a/.agents/skills/doc-maintenance/SKILL.md b/.agents/skills/doc-maintenance/SKILL.md new file mode 100644 index 0000000..a597e90 --- /dev/null +++ b/.agents/skills/doc-maintenance/SKILL.md @@ -0,0 +1,201 @@ +--- +name: doc-maintenance +description: > + Audit top-level documentation (README, SPEC, PRODUCT) against recent git + history to find drift — shipped features missing from docs or features + listed as upcoming that already landed. Proposes minimal edits, creates + a branch, and opens a PR. Use when asked to review docs for accuracy, + after major feature merges, or on a periodic schedule. +--- + +# Doc Maintenance Skill + +Detect documentation drift and fix it via PR — no rewrites, no churn. + +## When to Use + +- Periodic doc review (e.g. weekly or after releases) +- After major feature merges +- When asked "are our docs up to date?" +- When asked to audit README / SPEC / PRODUCT accuracy + +## Target Documents + +| Document | Path | What matters | +|----------|------|-------------| +| README | `README.md` | Features table, roadmap, quickstart, "what is" accuracy, "works with" table | +| SPEC | `doc/SPEC.md` | No false "not supported" claims, major model/schema accuracy | +| PRODUCT | `doc/PRODUCT.md` | Core concepts, feature list, principles accuracy | + +Out of scope: DEVELOPING.md, DATABASE.md, CLI.md, doc/plans/, skill files, +release notes. These are dev-facing or ephemeral — lower risk of user-facing +confusion. + +## Workflow + +### Step 1 — Detect what changed + +Find the last review cursor: + +```bash +# Read the last-reviewed commit SHA +CURSOR_FILE=".doc-review-cursor" +if [ -f "$CURSOR_FILE" ]; then + LAST_SHA=$(cat "$CURSOR_FILE" | head -1) +else + # First run: look back 60 days + LAST_SHA=$(git log --format="%H" --after="60 days ago" --reverse | head -1) +fi +``` + +Then gather commits since the cursor: + +```bash +git log "$LAST_SHA"..HEAD --oneline --no-merges +``` + +### Step 2 — Classify changes + +Scan commit messages and changed files. Categorize into: + +- **Feature** — new capabilities (keywords: `feat`, `add`, `implement`, `support`) +- **Breaking** — removed/renamed things (keywords: `remove`, `breaking`, `drop`, `rename`) +- **Structural** — new directories, config changes, new adapters, new CLI commands + +**Ignore:** refactors, test-only changes, CI config, dependency bumps, doc-only +changes, style/formatting commits. These don't affect doc accuracy. + +For borderline cases, check the actual diff — a commit titled "refactor: X" +that adds a new public API is a feature. + +### Step 3 — Build a change summary + +Produce a concise list like: + +``` +Since last review (, ): +- FEATURE: Plugin system merged (runtime, SDK, CLI, slots, event bridge) +- FEATURE: Project archiving added +- BREAKING: Removed legacy webhook adapter +- STRUCTURAL: New .agents/skills/ directory convention +``` + +If there are no notable changes, skip to Step 7 (update cursor and exit). + +### Step 4 — Audit each target doc + +For each target document, read it fully and cross-reference against the change +summary. Check for: + +1. **False negatives** — major shipped features not mentioned at all +2. **False positives** — features listed as "coming soon" / "roadmap" / "planned" + / "not supported" / "TBD" that already shipped +3. **Quickstart accuracy** — install commands, prereqs, and startup instructions + still correct (README only) +4. **Feature table accuracy** — does the features section reflect current + capabilities? (README only) +5. **Works-with accuracy** — are supported adapters/integrations listed correctly? + +Use `references/audit-checklist.md` as the structured checklist. +Use `references/section-map.md` to know where to look for each feature area. + +### Step 5 — Create branch and apply minimal edits + +```bash +# Create a branch for the doc updates +BRANCH="docs/maintenance-$(date +%Y%m%d)" +git checkout -b "$BRANCH" +``` + +Apply **only** the edits needed to fix drift. Rules: + +- **Minimal patches only.** Fix inaccuracies, don't rewrite sections. +- **Preserve voice and style.** Match the existing tone of each document. +- **No cosmetic changes.** Don't fix typos, reformat tables, or reorganize + sections unless they're part of a factual fix. +- **No new sections.** If a feature needs a whole new section, note it in the + PR description as a follow-up — don't add it in a maintenance pass. +- **Roadmap items:** Move shipped features out of Roadmap. Add a brief mention + in the appropriate existing section if there isn't one already. Don't add + long descriptions. + +### Step 6 — Open a PR + +Commit the changes and open a PR: + +```bash +git add README.md doc/SPEC.md doc/PRODUCT.md .doc-review-cursor +git commit -m "docs: update documentation for accuracy + +- [list each fix briefly] + +Co-Authored-By: Paperclip " + +git push -u origin "$BRANCH" + +gh pr create \ + --title "docs: periodic documentation accuracy update" \ + --body "$(cat <<'EOF' +## Summary +Automated doc maintenance pass. Fixes documentation drift detected since +last review. + +### Changes +- [list each fix] + +### Change summary (since last review) +- [list notable code changes that triggered doc updates] + +## Review notes +- Only factual accuracy fixes — no style/cosmetic changes +- Preserves existing voice and structure +- Larger doc additions (new sections, tutorials) noted as follow-ups + +🤖 Generated by doc-maintenance skill +EOF +)" +``` + +### Step 7 — Update the cursor + +After a successful audit (whether or not edits were needed), update the cursor: + +```bash +git rev-parse HEAD > .doc-review-cursor +``` + +If edits were made, this is already committed in the PR branch. If no edits +were needed, commit the cursor update to the current branch. + +## Change Classification Rules + +| Signal | Category | Doc update needed? | +|--------|----------|-------------------| +| `feat:`, `add`, `implement`, `support` in message | Feature | Yes if user-facing | +| `remove`, `drop`, `breaking`, `!:` in message | Breaking | Yes | +| New top-level directory or config file | Structural | Maybe | +| `fix:`, `bugfix` | Fix | No (unless it changes behavior described in docs) | +| `refactor:`, `chore:`, `ci:`, `test:` | Maintenance | No | +| `docs:` | Doc change | No (already handled) | +| Dependency bumps only | Maintenance | No | + +## Patch Style Guide + +- Fix the fact, not the prose +- If removing a roadmap item, don't leave a gap — remove the bullet cleanly +- If adding a feature mention, match the format of surrounding entries + (e.g. if features are in a table, add a table row) +- Keep README changes especially minimal — it shouldn't churn often +- For SPEC/PRODUCT, prefer updating existing statements over adding new ones + (e.g. change "not supported in V1" to "supported via X" rather than adding + a new section) + +## Output + +When the skill completes, report: + +- How many commits were scanned +- How many notable changes were found +- How many doc edits were made (and to which files) +- PR link (if edits were made) +- Any follow-up items that need larger doc work diff --git a/.agents/skills/doc-maintenance/references/audit-checklist.md b/.agents/skills/doc-maintenance/references/audit-checklist.md new file mode 100644 index 0000000..9c13a43 --- /dev/null +++ b/.agents/skills/doc-maintenance/references/audit-checklist.md @@ -0,0 +1,85 @@ +# Doc Maintenance Audit Checklist + +Use this checklist when auditing each target document. For each item, compare +against the change summary from git history. + +## README.md + +### Features table +- [ ] Each feature card reflects a shipped capability +- [ ] No feature cards for things that don't exist yet +- [ ] No major shipped features missing from the table + +### Roadmap +- [ ] Nothing listed as "planned" or "coming soon" that already shipped +- [ ] No removed/cancelled items still listed +- [ ] Items reflect current priorities (cross-check with recent PRs) + +### Quickstart +- [ ] `npx paperclipai onboard` command is correct +- [ ] Manual install steps are accurate (clone URL, commands) +- [ ] Prerequisites (Node version, pnpm version) are current +- [ ] Server URL and port are correct + +### "What is Paperclip" section +- [ ] High-level description is accurate +- [ ] Step table (Define goal / Hire team / Approve and run) is correct + +### "Works with" table +- [ ] All supported adapters/runtimes are listed +- [ ] No removed adapters still listed +- [ ] Logos and labels match current adapter names + +### "Paperclip is right for you if" +- [ ] Use cases are still accurate +- [ ] No claims about capabilities that don't exist + +### "Why Paperclip is special" +- [ ] Technical claims are accurate (atomic execution, governance, etc.) +- [ ] No features listed that were removed or significantly changed + +### FAQ +- [ ] Answers are still correct +- [ ] No references to removed features or outdated behavior + +### Development section +- [ ] Commands are accurate (`pnpm dev`, `pnpm build`, etc.) +- [ ] Link to DEVELOPING.md is correct + +## doc/SPEC.md + +### Company Model +- [ ] Fields match current schema +- [ ] Governance model description is accurate + +### Agent Model +- [ ] Adapter types match what's actually supported +- [ ] Agent configuration description is accurate +- [ ] No features described as "not supported" or "not V1" that shipped + +### Task Model +- [ ] Task hierarchy description is accurate +- [ ] Status values match current implementation + +### Extensions / Plugins +- [ ] If plugins are shipped, no "not in V1" or "future" language +- [ ] Plugin model description matches implementation + +### Open Questions +- [ ] Resolved questions removed or updated +- [ ] No "TBD" items that have been decided + +## doc/PRODUCT.md + +### Core Concepts +- [ ] Company, Employees, Task Management descriptions accurate +- [ ] Agent Execution modes described correctly +- [ ] No missing major concepts + +### Principles +- [ ] Principles haven't been contradicted by shipped features +- [ ] No principles referencing removed capabilities + +### User Flow +- [ ] Dream scenario still reflects actual onboarding +- [ ] Steps are achievable with current features diff --git a/.agents/skills/doc-maintenance/references/section-map.md b/.agents/skills/doc-maintenance/references/section-map.md new file mode 100644 index 0000000..4ec64f8 --- /dev/null +++ b/.agents/skills/doc-maintenance/references/section-map.md @@ -0,0 +1,22 @@ +# Section Map + +Maps feature areas to specific document sections so the skill knows where to +look when a feature ships or changes. + +| Feature Area | README Section | SPEC Section | PRODUCT Section | +|-------------|---------------|-------------|----------------| +| Plugins / Extensions | Features table, Roadmap | Extensions, Agent Model | Core Concepts | +| Adapters (new runtimes) | "Works with" table, FAQ | Agent Model, Agent Configuration | Employees & Agents, Agent Execution | +| Governance / Approvals | Features table, "Why special" | Board Governance, Board Approval Gates | Principles | +| Budget / Cost Control | Features table, "Why special" | Budget Delegation | Company (revenue & expenses) | +| Task Management | Features table | Task Model | Task Management | +| Org Chart / Hierarchy | Features table | Agent Model (reporting) | Employees & Agents | +| Multi-Company | Features table, FAQ | Company Model | Company | +| Heartbeats | Features table, FAQ | Agent Execution | Agent Execution | +| CLI Commands | Development section | — | — | +| Onboarding / Quickstart | Quickstart, FAQ | — | User Flow | +| Skills / Skill Injection | "Why special" | — | — | +| Company Templates | "Why special", Roadmap (ClipMart) | — | — | +| Mobile / UI | Features table | — | — | +| Project Archiving | — | — | — | +| OpenClaw Integration | "Works with" table, FAQ | Agent Model | Agent Execution | diff --git a/.agents/skills/pr-report/SKILL.md b/.agents/skills/pr-report/SKILL.md new file mode 100644 index 0000000..5064b67 --- /dev/null +++ b/.agents/skills/pr-report/SKILL.md @@ -0,0 +1,202 @@ +--- +name: pr-report +description: > + Review a pull request or contribution deeply, explain it tutorial-style for a + maintainer, and produce a polished report artifact such as HTML or Markdown. + Use when asked to analyze a PR, explain a contributor's design decisions, + compare it with similar systems, or prepare a merge recommendation. +--- + +# PR Report Skill + +Produce a maintainer-grade review of a PR, branch, or large contribution. + +Default posture: + +- understand the change before judging it +- explain the system as built, not just the diff +- separate architectural problems from product-scope objections +- make a concrete recommendation, not a vague impression + +## When to Use + +Use this skill when the user asks for things like: + +- "review this PR deeply" +- "explain this contribution to me" +- "make me a report or webpage for this PR" +- "compare this design to similar systems" +- "should I merge this?" + +## Outputs + +Common outputs: + +- standalone HTML report in `tmp/reports/...` +- Markdown report in `report/` or another requested folder +- short maintainer summary in chat + +If the user asks for a webpage, build a polished standalone HTML artifact with +clear sections and readable visual hierarchy. + +Resources bundled with this skill: + +- `references/style-guide.md` for visual direction and report presentation rules +- `assets/html-report-starter.html` for a reusable standalone HTML/CSS starter + +## Workflow + +### 1. Acquire and frame the target + +Work from local code when possible, not just the GitHub PR page. + +Gather: + +- target branch or worktree +- diff size and changed subsystems +- relevant repo docs, specs, and invariants +- contributor intent if it is documented in PR text or design docs + +Start by answering: what is this change *trying* to become? + +### 2. Build a mental model of the system + +Do not stop at file-by-file notes. Reconstruct the design: + +- what new runtime or contract exists +- which layers changed: db, shared types, server, UI, CLI, docs +- lifecycle: install, startup, execution, UI, failure, disablement +- trust boundary: what code runs where, under what authority + +For large contributions, include a tutorial-style section that teaches the +system from first principles. + +### 3. Review like a maintainer + +Findings come first. Order by severity. + +Prioritize: + +- behavioral regressions +- trust or security gaps +- misleading abstractions +- lifecycle and operational risks +- coupling that will be hard to unwind +- missing tests or unverifiable claims + +Always cite concrete file references when possible. + +### 4. Distinguish the objection type + +Be explicit about whether a concern is: + +- product direction +- architecture +- implementation quality +- rollout strategy +- documentation honesty + +Do not hide an architectural objection inside a scope objection. + +### 5. Compare to external precedents when needed + +If the contribution introduces a framework or platform concept, compare it to +similar open-source systems. + +When comparing: + +- prefer official docs or source +- focus on extension boundaries, context passing, trust model, and UI ownership +- extract lessons, not just similarities + +Good comparison questions: + +- Who owns lifecycle? +- Who owns UI composition? +- Is context explicit or ambient? +- Are plugins trusted code or sandboxed code? +- Are extension points named and typed? + +### 6. Make the recommendation actionable + +Do not stop at "merge" or "do not merge." + +Choose one: + +- merge as-is +- merge after specific redesign +- salvage specific pieces +- keep as design research + +If rejecting or narrowing, say what should be kept. + +Useful recommendation buckets: + +- keep the protocol/type model +- redesign the UI boundary +- narrow the initial surface area +- defer third-party execution +- ship a host-owned extension-point model first + +### 7. Build the artifact + +Suggested report structure: + +1. Executive summary +2. What the PR actually adds +3. Tutorial: how the system works +4. Strengths +5. Main findings +6. Comparisons +7. Recommendation + +For HTML reports: + +- use intentional typography and color +- make navigation easy for long reports +- favor strong section headings and small reference labels +- avoid generic dashboard styling + +Before building from scratch, read `references/style-guide.md`. +If a fast polished starter is helpful, begin from `assets/html-report-starter.html` +and replace the placeholder content with the actual report. + +### 8. Verify before handoff + +Check: + +- artifact path exists +- findings still match the actual code +- any requested forbidden strings are absent from generated output +- if tests were not run, say so explicitly + +## Review Heuristics + +### Plugin and platform work + +Watch closely for: + +- docs claiming sandboxing while runtime executes trusted host processes +- module-global state used to smuggle React context +- hidden dependence on render order +- plugins reaching into host internals instead of using explicit APIs +- "capabilities" that are really policy labels on top of fully trusted code + +### Good signs + +- typed contracts shared across layers +- explicit extension points +- host-owned lifecycle +- honest trust model +- narrow first rollout with room to grow + +## Final Response + +In chat, summarize: + +- where the report is +- your overall call +- the top one or two reasons +- whether verification or tests were skipped + +Keep the chat summary shorter than the report itself. diff --git a/.agents/skills/pr-report/assets/html-report-starter.html b/.agents/skills/pr-report/assets/html-report-starter.html new file mode 100644 index 0000000..be6f055 --- /dev/null +++ b/.agents/skills/pr-report/assets/html-report-starter.html @@ -0,0 +1,426 @@ + + + + + + PR Report Starter + + + + + + +
+ + +
+
+
Executive Summary
+

Use the hero for the clearest one-line judgment.

+

+ Replace this with the short explanation of what the contribution does, why it matters, + and what the core maintainer question is. +

+
+ Strength + Tradeoff + Risk +
+
+
+
Overall Call
+
Placeholder
+
+
+
Main Concern
+
Placeholder
+
+
+
Best Part
+
Placeholder
+
+
+
Weakest Part
+
Placeholder
+
+
+
+ Use this block for the thesis, a sharp takeaway, or a key cited point. +
+
+ +
+

Tutorial Section

+
+
+

Concept Card

+

Use cards for mental models, subsystems, or comparison slices.

+
path/to/file.ts:10
+
+
+

Second Card

+

Keep cards fairly dense. This template is about style, not fixed structure.

+
path/to/file.ts:20
+
+
+
+ +
+

Findings

+
+
High
+

Finding Title

+

Use findings for the sharpest judgment calls and risks.

+
path/to/file.ts:30
+
+
+ +
+

Recommendation

+
+
+

Path Forward

+

Use this area for merge guidance, salvage plan, or rollout advice.

+
+
+

What To Keep

+

Call out the parts worth preserving even if the whole proposal should not land.

+
+
+
+
+
+ + diff --git a/.agents/skills/pr-report/references/style-guide.md b/.agents/skills/pr-report/references/style-guide.md new file mode 100644 index 0000000..35158d1 --- /dev/null +++ b/.agents/skills/pr-report/references/style-guide.md @@ -0,0 +1,149 @@ +# PR Report Style Guide + +Use this guide when the user wants a report artifact, especially a webpage. + +## Goal + +Make the report feel like an editorial review, not an internal admin dashboard. +The page should make a long technical argument easy to scan without looking +generic or overdesigned. + +## Visual Direction + +Preferred tone: + +- editorial +- warm +- serious +- high-contrast +- handcrafted, not corporate SaaS + +Avoid: + +- default app-shell layouts +- purple gradients on white +- generic card dashboards +- cramped pages with weak hierarchy +- novelty fonts that hurt readability + +## Typography + +Recommended pattern: + +- one expressive serif or display face for major headings +- one sturdy sans-serif for body copy and UI labels + +Good combinations: + +- Newsreader + IBM Plex Sans +- Source Serif 4 + Instrument Sans +- Fraunces + Public Sans +- Libre Baskerville + Work Sans + +Rules: + +- headings should feel deliberate and large +- body copy should stay comfortable for long reading +- reference labels and badges should use smaller dense sans text + +## Layout + +Recommended structure: + +- a sticky side or top navigation for long reports +- one strong hero summary at the top +- panel or paper-like sections for each major topic +- multi-column card grids for comparisons and strengths +- single-column body text for findings and recommendations + +Use generous spacing. Long-form technical reports need breathing room. + +## Color + +Prefer muted paper-like backgrounds with one warm accent and one cool counterweight. + +Suggested token categories: + +- `--bg` +- `--paper` +- `--ink` +- `--muted` +- `--line` +- `--accent` +- `--good` +- `--warn` +- `--bad` + +The accent should highlight navigation, badges, and important labels. Do not +let accent colors dominate body text. + +## Useful UI Elements + +Include small reusable styles for: + +- summary metrics +- badges +- quotes or callouts +- finding cards +- severity labels +- reference labels +- comparison cards +- responsive two-column sections + +## Motion + +Keep motion restrained. + +Good: + +- soft fade/slide-in on first load +- hover response on nav items or cards + +Bad: + +- constant animation +- floating blobs +- decorative motion with no reading benefit + +## Content Presentation + +Even when the user wants design polish, clarity stays primary. + +Good structure for long reports: + +1. executive summary +2. what changed +3. tutorial explanation +4. strengths +5. findings +6. comparisons +7. recommendation + +The exact headings can change. The important thing is to separate explanation +from judgment. + +## References + +Reference labels should be visually quiet but easy to spot. + +Good pattern: + +- small muted text +- monospace or compact sans +- keep them close to the paragraph they support + +## Starter Usage + +If you need a fast polished base, start from: + +- `assets/html-report-starter.html` + +Customize: + +- fonts +- color tokens +- hero copy +- section ordering +- card density + +Do not preserve the placeholder sections if they do not fit the actual report. diff --git a/.agents/skills/prcheckloop/SKILL.md b/.agents/skills/prcheckloop/SKILL.md new file mode 100644 index 0000000..70d9e19 --- /dev/null +++ b/.agents/skills/prcheckloop/SKILL.md @@ -0,0 +1,209 @@ +--- +name: prcheckloop +description: > + Iteratively gets a GitHub pull request's checks green. Detects the PR for the + current branch or uses a provided PR number, waits for every check on the + latest head SHA to appear and finish, investigates failing checks, fixes + actionable code or test issues, pushes, and repeats. Escalates with a precise + blocker when failures are external, flaky, or not safely fixable. Use when a + PR still has unsuccessful checks after review fixes, including after greploop. +--- + +# PRCheckloop + +Get a GitHub PR to a fully green check state, or exit with a concrete blocker. + +## Scope + +- GitHub PRs only. If the repo is GitLab, stop and use `check-pr`. +- Focus on checks for the latest PR head SHA, not old commits. +- Focus on CI/status checks, not review comments or PR template cleanup. +- If the user also wants review-comment cleanup, pair this with `check-pr`. + +## Inputs + +- **PR number** (optional): If not provided, detect the PR for the current branch. +- **Max iterations**: default `5`. + +## Workflow + +### 1. Identify the PR + +If no PR number is provided, detect it from the current branch: + +```bash +gh pr view --json number,headRefName,headRefOid,url,isDraft +``` + +If needed, switch to the PR branch before making changes. + +Stop early if: + +- `gh` is not authenticated +- there is no PR for the branch +- the repo is not hosted on GitHub + +### 2. Track the latest head SHA + +Always work against the current PR head SHA: + +```bash +PR_JSON=$(gh pr view "$PR_NUMBER" --json number,headRefName,headRefOid,url) +HEAD_SHA=$(echo "$PR_JSON" | jq -r .headRefOid) +PR_URL=$(echo "$PR_JSON" | jq -r .url) +``` + +Ignore failing checks from older SHAs. After every push, refresh `HEAD_SHA` and +restart the inspection loop. + +### 3. Inventory checks for that SHA + +Fetch both GitHub check runs and legacy commit status contexts: + +```bash +gh api "repos/{owner}/{repo}/commits/$HEAD_SHA/check-runs?per_page=100" +gh api "repos/{owner}/{repo}/commits/$HEAD_SHA/status" +``` + +For a compact PR-level view, this GraphQL payload is useful: + +```bash +gh api graphql -f query=' +query($owner:String!, $repo:String!, $pr:Int!) { + repository(owner:$owner, name:$repo) { + pullRequest(number:$pr) { + headRefOid + url + statusCheckRollup { + contexts(first:100) { + nodes { + __typename + ... on CheckRun { name status conclusion detailsUrl workflowName } + ... on StatusContext { context state targetUrl description } + } + } + } + } + } +}' -F owner=OWNER -F repo=REPO -F pr="$PR_NUMBER" +``` + +### 4. Wait for checks to actually run + +After a new push, checks can take a moment to appear. Poll every 15-30 seconds +until one of these is true: + +- checks have appeared and every item is in a terminal state +- checks have appeared and at least one failed +- no checks appear after a reasonable wait, usually 2 minutes + +Treat these as terminal success states: + +- check runs: `SUCCESS`, `NEUTRAL`, `SKIPPED` +- status contexts: `SUCCESS` + +Treat these as pending: + +- check runs: `QUEUED`, `PENDING`, `WAITING`, `REQUESTED`, `IN_PROGRESS` +- status contexts: `PENDING` + +Treat these as failures: + +- check runs: `FAILURE`, `TIMED_OUT`, `CANCELLED`, `ACTION_REQUIRED`, `STARTUP_FAILURE`, `STALE` +- status contexts: `FAILURE`, `ERROR` + +If no checks appear for the latest SHA, inspect `.github/workflows/`, workflow +path filters, and branch protection expectations. If the missing check cannot be +caused or fixed from the repo, escalate. + +### 5. Investigate failing checks + +For GitHub Actions failures, inspect runs and failed logs for the current SHA: + +```bash +gh run list --commit "$HEAD_SHA" --json databaseId,workflowName,status,conclusion,url,headSha +gh run view --json databaseId,name,workflowName,status,conclusion,jobs,url,headSha +gh run view --log-failed +``` + +For each failing check, classify it: + +| Failure type | Action | +|---|---| +| Code/test regression | Reproduce locally, fix, and verify | +| Lint/type/build mismatch | Run the matching local command from the workflow and fix it | +| Flake or transient infra issue | Rerun once if evidence supports flakiness | +| External service/status app failure | Escalate with the details URL and owner guess | +| Missing secret/permission/branch protection issue | Escalate immediately | + +Only rerun a failed job once without code changes. Do not loop on reruns. + +### 6. Fix actionable failures + +If the failure is actionable from the checked-out code: + +1. Read the workflow or failing command to identify the real gate. +2. Reproduce locally where reasonable. +3. Make the smallest correct fix. +4. Run focused verification first, then broader verification if needed. +5. Commit in a logical commit. +6. Push before re-checking the PR. + +Do not stop at a local fix. The loop is only complete when the remote PR checks +for the new head SHA are green. + +### 7. Push and repeat + +After each fix: + +```bash +git push +sleep 5 +``` + +Then refresh the PR metadata, get the new `HEAD_SHA`, and restart from Step 3. + +Exit the loop only when: + +- all checks for the latest head SHA are green, or +- a blocker remains after reasonable repair effort, or +- the max iteration count is reached + +### 8. Escalate blockers precisely + +If you cannot get the PR green, report: + +- PR URL +- latest head SHA +- exact failing or missing check names +- details URLs +- what you already tried +- why it is blocked +- who should likely unblock it +- the next concrete action + +Good blocker examples: + +- external status app outage +- missing GitHub secret or permission +- required check name mismatch in branch protection +- persistent flake after one rerun +- failure needs credentials or infrastructure access you do not have + +## Output + +When the skill completes, report: + +- PR URL and branch +- final head SHA +- green/pending/failing check summary +- fixes made and verification run +- whether changes were pushed +- blocker summary if not fully green + +## Notes + +- This skill is intentionally narrower than `check-pr`: it is a repair loop for + PR checks. +- This skill complements `greploop`: Greptile can be perfect while CI is still + red. diff --git a/.agents/skills/release-changelog/SKILL.md b/.agents/skills/release-changelog/SKILL.md new file mode 100644 index 0000000..d17c1f6 --- /dev/null +++ b/.agents/skills/release-changelog/SKILL.md @@ -0,0 +1,192 @@ +--- +name: release-changelog +description: > + Generate the stable Paperclip release changelog at releases/vYYYY.MDD.P.md by + reading commits, changesets, and merged PR context since the last stable tag. +--- + +# Release Changelog Skill + +Generate the user-facing changelog for the **stable** Paperclip release. + +## Versioning Model + +Paperclip uses **calendar versioning (calver)**: + +- Stable releases: `YYYY.MDD.P` (e.g. `2026.318.0`) +- Canary releases: `YYYY.MDD.P-canary.N` (e.g. `2026.318.1-canary.0`) +- Git tags: `vYYYY.MDD.P` for stable, `canary/vYYYY.MDD.P-canary.N` for canary + +There are no major/minor/patch bumps. The stable version is derived from the +intended release date (UTC) plus the next same-day stable patch slot. + +Output: + +- `releases/vYYYY.MDD.P.md` + +Important rules: + +- even if there are canary releases such as `2026.318.1-canary.0`, the changelog file stays `releases/v2026.318.1.md` +- do not derive versions from semver bump types +- do not create canary changelog files + +## Step 0 — Idempotency Check + +Before generating anything, check whether the file already exists: + +```bash +ls releases/vYYYY.MDD.P.md 2>/dev/null +``` + +If it exists: + +1. read it first +2. present it to the reviewer +3. ask whether to keep it, regenerate it, or update specific sections +4. never overwrite it silently + +## Step 1 — Determine the Stable Range + +Find the last stable tag: + +```bash +git tag --list 'v*' --sort=-version:refname | head -1 +git log v{last}..HEAD --oneline --no-merges +``` + +The stable version comes from one of: + +- an explicit maintainer request +- `./scripts/release.sh stable --date YYYY-MM-DD --print-version` +- the release plan already agreed in `doc/RELEASING.md` + +Do not derive the changelog version from a canary tag or prerelease suffix. +Do not derive major/minor/patch bumps from API intent — calver uses the date and same-day stable slot. + +## Step 2 — Gather the Raw Inputs + +Collect release data from: + +1. git commits since the last stable tag +2. `.changeset/*.md` files +3. merged PRs via `gh` when available + +Useful commands: + +```bash +git log v{last}..HEAD --oneline --no-merges +git log v{last}..HEAD --format="%H %s" --no-merges +ls .changeset/*.md | grep -v README.md +gh pr list --state merged --search "merged:>={last-tag-date}" --json number,title,body,labels +``` + +## Step 3 — Detect Breaking Changes + +Look for: + +- destructive migrations +- removed or changed API fields/endpoints +- renamed or removed config keys +- `BREAKING:` or `BREAKING CHANGE:` commit signals + +Key commands: + +```bash +git diff --name-only v{last}..HEAD -- packages/db/src/migrations/ +git diff v{last}..HEAD -- packages/db/src/schema/ +git diff v{last}..HEAD -- server/src/routes/ server/src/api/ +git log v{last}..HEAD --format="%s" | rg -n 'BREAKING CHANGE|BREAKING:|^[a-z]+!:' || true +``` + +If breaking changes are detected, flag them prominently — they must appear in the +Breaking Changes section with an upgrade path. + +## Step 4 — Categorize for Users + +Use these stable changelog sections: + +- `Breaking Changes` +- `Highlights` +- `Improvements` +- `Fixes` +- `Upgrade Guide` when needed + +Exclude purely internal refactors, CI changes, and docs-only work unless they materially affect users. + +Guidelines: + +- group related commits into one user-facing entry +- write from the user perspective +- keep highlights short and concrete +- spell out upgrade actions for breaking changes + +### Inline PR and contributor attribution + +When a bullet item clearly maps to a merged pull request, add inline attribution at the +end of the entry in this format: + +``` +- **Feature name** — Description. ([#123](https://github.com/paperclipai/paperclip/pull/123), @contributor1, @contributor2) +``` + +Rules: + +- Only add a PR link when you can confidently trace the bullet to a specific merged PR. + Use merge commit messages (`Merge pull request #N from user/branch`) to map PRs. +- List the contributor(s) who authored the PR. Use GitHub usernames, not real names or emails. +- If multiple PRs contributed to a single bullet, list them all: `([#10](url), [#12](url), @user1, @user2)`. +- If you cannot determine the PR number or contributor with confidence, omit the attribution + parenthetical — do not guess. +- Core maintainer commits that don't have an external PR can omit the parenthetical. + +## Step 5 — Write the File + +Template: + +```markdown +# vYYYY.MDD.P + +> Released: YYYY-MM-DD + +## Breaking Changes + +## Highlights + +## Improvements + +## Fixes + +## Upgrade Guide + +## Contributors + +Thank you to everyone who contributed to this release! + +@username1, @username2, @username3 +``` + +Omit empty sections except `Highlights`, `Improvements`, and `Fixes`, which should usually exist. + +The `Contributors` section should always be included. List every person who authored +commits in the release range, @-mentioning them by their **GitHub username** (not their +real name or email). To find GitHub usernames: + +1. Extract usernames from merge commit messages: `git log v{last}..HEAD --oneline --merges` — the branch prefix (e.g. `from username/branch`) gives the GitHub username. +2. For noreply emails like `user@users.noreply.github.com`, the username is the part before `@`. +3. For contributors whose username is ambiguous, check `gh api users/{guess}` or the PR page. + +**Never expose contributor email addresses.** Use `@username` only. + +Exclude bot accounts (e.g. `lockfile-bot`, `dependabot`) from the list. List contributors +in alphabetical order by GitHub username (case-insensitive). + +## Step 6 — Review Before Release + +Before handing it off: + +1. confirm the heading is the stable version only +2. confirm there is no `-canary` language in the title or filename +3. confirm any breaking changes have an upgrade path +4. present the draft for human sign-off + +This skill never publishes anything. It only prepares the stable changelog artifact. diff --git a/.agents/skills/release/SKILL.md b/.agents/skills/release/SKILL.md new file mode 100644 index 0000000..8f8e7ca --- /dev/null +++ b/.agents/skills/release/SKILL.md @@ -0,0 +1,247 @@ +--- +name: release +description: > + Coordinate a full Paperclip release across engineering verification, npm, + GitHub, smoke testing, and announcement follow-up. Use when leadership asks + to ship a release, not merely to discuss versioning. +--- + +# Release Coordination Skill + +Run the full Paperclip maintainer release workflow, not just an npm publish. + +This skill coordinates: + +- stable changelog drafting via `release-changelog` +- canary verification and publish status from `master` +- Docker smoke testing via `scripts/docker-onboard-smoke.sh` +- manual stable promotion from a chosen source ref +- GitHub Release creation +- website / announcement follow-up tasks + +## Trigger + +Use this skill when leadership asks for: + +- "do a release" +- "ship the release" +- "promote this canary to stable" +- "cut the stable release" + +## Preconditions + +Before proceeding, verify all of the following: + +1. `.agents/skills/release-changelog/SKILL.md` exists and is usable. +2. The repo working tree is clean, including untracked files. +3. There is at least one canary or candidate commit since the last stable tag. +4. The candidate SHA has passed the verification gate or is about to. +5. If manifests changed, the CI-owned `pnpm-lock.yaml` refresh is already merged on `master`. +6. npm publish rights are available through GitHub trusted publishing, or through local npm auth for emergency/manual use. +7. If running through Paperclip, you have issue context for status updates and follow-up task creation. + +If any precondition fails, stop and report the blocker. + +## Inputs + +Collect these inputs up front: + +- whether the target is a canary check or a stable promotion +- the candidate `source_ref` for stable +- whether the stable run is dry-run or live +- release issue / company context for website and announcement follow-up + +## Step 0 — Release Model + +Paperclip now uses a commit-driven release model: + +1. every push to `master` publishes a canary automatically +2. canaries use `YYYY.MDD.P-canary.N` +3. stable releases use `YYYY.MDD.P` +4. the middle slot is `MDD`, where `M` is the UTC month and `DD` is the zero-padded UTC day +5. the stable patch slot increments when more than one stable ships on the same UTC date +6. stable releases are manually promoted from a chosen tested commit or canary source commit +7. only stable releases get `releases/vYYYY.MDD.P.md`, git tag `vYYYY.MDD.P`, and a GitHub Release + +Critical consequences: + +- do not use release branches as the default path +- do not derive major/minor/patch bumps +- do not create canary changelog files +- do not create canary GitHub Releases + +## Step 1 — Choose the Candidate + +For canary validation: + +- inspect the latest successful canary run on `master` +- record the canary version and source SHA + +For stable promotion: + +1. choose the tested source ref +2. confirm it is the exact SHA you want to promote +3. resolve the target stable version with `./scripts/release.sh stable --date YYYY-MM-DD --print-version` + +Useful commands: + +```bash +git tag --list 'v*' --sort=-version:refname | head -1 +git log --oneline --no-merges +npm view paperclipai@canary version +``` + +## Step 2 — Draft the Stable Changelog + +Stable changelog files live at: + +- `releases/vYYYY.MDD.P.md` + +Invoke `release-changelog` and generate or update the stable notes only. + +Rules: + +- review the draft with a human before publish +- preserve manual edits if the file already exists +- keep the filename stable-only +- do not create a canary changelog file + +## Step 3 — Verify the Candidate SHA + +Run the standard gate: + +```bash +pnpm -r typecheck +pnpm test:run +pnpm build +``` + +If the GitHub release workflow will run the publish, it can rerun this gate. Still report local status if you checked it. + +For PRs that touch release logic, the repo also runs a canary release dry-run in CI. That is a release-specific guard, not a substitute for the standard gate. + +## Step 4 — Validate the Canary + +The normal canary path is automatic from `master` via: + +- `.github/workflows/release.yml` + +Confirm: + +1. verification passed +2. npm canary publish succeeded +3. git tag `canary/vYYYY.MDD.P-canary.N` exists + +Useful checks: + +```bash +npm view paperclipai@canary version +git tag --list 'canary/v*' --sort=-version:refname | head -5 +``` + +## Step 5 — Smoke Test the Canary + +Run: + +```bash +PAPERCLIPAI_VERSION=canary ./scripts/docker-onboard-smoke.sh +``` + +Useful isolated variant: + +```bash +HOST_PORT=3232 DATA_DIR=./data/release-smoke-canary PAPERCLIPAI_VERSION=canary ./scripts/docker-onboard-smoke.sh +``` + +Confirm: + +1. install succeeds +2. onboarding completes without crashes +3. the server boots +4. the UI loads +5. basic company creation and dashboard load work + +If smoke testing fails: + +- stop the stable release +- fix the issue on `master` +- wait for the next automatic canary +- rerun smoke testing + +## Step 6 — Preview or Publish Stable + +The normal stable path is manual `workflow_dispatch` on: + +- `.github/workflows/release.yml` + +Inputs: + +- `source_ref` +- `stable_date` +- `dry_run` + +Before live stable: + +1. resolve the target stable version with `./scripts/release.sh stable --date YYYY-MM-DD --print-version` +2. ensure `releases/vYYYY.MDD.P.md` exists on the source ref +3. run the stable workflow in dry-run mode first when practical +4. then run the real stable publish + +The stable workflow: + +- re-verifies the exact source ref +- computes the next stable patch slot for the chosen UTC date +- publishes `YYYY.MDD.P` under dist-tag `latest` +- creates git tag `vYYYY.MDD.P` +- creates or updates the GitHub Release from `releases/vYYYY.MDD.P.md` + +Local emergency/manual commands: + +```bash +./scripts/release.sh stable --dry-run +./scripts/release.sh stable +git push public-gh refs/tags/vYYYY.MDD.P +./scripts/create-github-release.sh YYYY.MDD.P +``` + +## Step 7 — Finish the Other Surfaces + +Create or verify follow-up work for: + +- website changelog publishing +- launch post / social announcement +- release summary in Paperclip issue context + +These should reference the stable release, not the canary. + +## Failure Handling + +If the canary is bad: + +- publish another canary, do not ship stable + +If stable npm publish succeeds but tag push or GitHub release creation fails: + +- fix the git/GitHub issue immediately from the same release result +- do not republish the same version + +If `latest` is bad after stable publish: + +```bash +./scripts/rollback-latest.sh +``` + +Then fix forward with a new stable release. + +## Output + +When the skill completes, provide: + +- candidate SHA and tested canary version, if relevant +- stable version, if promoted +- verification status +- npm status +- smoke-test status +- git tag / GitHub Release status +- website / announcement follow-up status +- rollback recommendation if anything is still partially complete diff --git a/.gitignore b/.gitignore index 1572fbf..0425a0f 100644 --- a/.gitignore +++ b/.gitignore @@ -39,7 +39,6 @@ feedback-export-* .claude/settings.local.json .taskcore-local/ /.idea/ -/.agents/ # Doc maintenance cursor .doc-review-cursor