23 changes: 23 additions & 0 deletions .agents/skills/draft_guide/SKILL.md
@@ -49,6 +49,29 @@ When drafting a guide, check for relevant SEO and AEO data:
4. **Frame the title for non-branded search.** The page should answer the user's actual question, with Warp features as the natural solution in the guide body.
5. **Avoid keyword stuffing.** Preserve high-intent query terms only where they make the guide clearer or more discoverable. Rewrite awkward source-data phrasing into natural developer language.

## Oz CLI and GitHub Actions accuracy

When a guide includes Oz CLI commands or GitHub Actions workflows using `warpdotdev/oz-agent-action`:

- **Verify Oz CLI commands against `/reference/cli/`.** Do not infer flag names or argument formats. Use only flags documented in the CLI reference. When in doubt, link to the reference page instead of showing a command.
- **`oz-agent-action` input format**: `warp_api_key` is a `with:` input to the action, not an `env:` variable. `GITHUB_TOKEN` goes in `env:`. The correct pattern is:
```yaml
- uses: warpdotdev/oz-agent-action@v1
  with:
    skill: SKILL_NAME
    environment: YOUR_OZ_ENVIRONMENT_SLUG
    warp_api_key: ${{ secrets.WARP_API_KEY }}
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
- **Every GitHub Actions workflow example that performs write operations must include a `permissions:` block** at the workflow level, before `jobs:`. Use the minimum permissions for the task:
  - Triage (label/comment on issues): `issues: write`
  - Spec/implementation (push branches, open PRs): `contents: write`, `pull-requests: write`
  - Review (post PR comments): `pull-requests: write`

  Without an explicit `permissions:` block, write operations fail in repositories whose default `GITHUB_TOKEN` permissions are restricted to read-only.
- **`oz secret create` must not include `--value` on the command line.** The `--value` flag exposes secrets in shell history and process arguments. Show `oz secret create --name SECRET_NAME` and note that the CLI prompts for the value interactively.
- **Slash commands** (`/command-name`) in standalone code fences should use `bash` as the language identifier, consistent with the docs style guide for terminal input.
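
The `permissions:` rule above lends itself to a mechanical check. The following is a minimal Python sketch, not part of this skill or of any Warp tooling: the function name `has_workflow_level_permissions` and the sample workflow are hypothetical, and the line-based scan assumes block-style YAML in which top-level keys start in the first column.

```python
import re


def has_workflow_level_permissions(workflow_text: str) -> bool:
    """Return True if an unindented `permissions:` key appears before `jobs:`.

    Hypothetical helper: it scans lines instead of parsing YAML, so it only
    handles block-style workflows whose top-level keys start in column one.
    """
    for line in workflow_text.splitlines():
        # re.match anchors at the start of the line, so an indented
        # (job-level) `permissions:` key does not count.
        if re.match(r"permissions\s*:", line):
            return True
        if re.match(r"jobs\s*:", line):
            return False
    return False


# Hypothetical workflow example for a triage guide.
example = """\
name: triage
on:
  issues:
    types: [opened]
permissions:
  issues: write
jobs:
  triage:
    runs-on: ubuntu-latest
"""

print(has_workflow_level_permissions(example))
```

A check like this only confirms that the block exists in the right place. Whether the listed scopes are the minimum for the task still needs a human read.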

## Third-party tool accuracy

When a guide documents a third-party tool (Claude Code, Codex, OpenCode, etc.):
3 changes: 1 addition & 2 deletions .agents/templates/conceptual.md
@@ -1,11 +1,10 @@
---
title: [Feature or concept name — sentence case. Title convention: noun or "About [subject]". The title field renders as the page H1; do not add a separate H1 in the body.]
description: >-
  [1-2 sentences: what the concept/feature is + why it matters.
  Write as a standalone summary for search results. Lead with user benefit.]
---

# [Feature or concept name — sentence case. Title convention: noun or "About [subject]"]

[Opening paragraph: What this feature/concept is and its primary benefit.
1-3 sentences. Lead with what the user gains from understanding this.]

5 changes: 1 addition & 4 deletions .agents/templates/guide-page.md
@@ -1,13 +1,10 @@
---
title: [Task-oriented title in sentence case — reads like a search query. Capture the non-branded query a developer would actually search for, not "How to do X in Warp." The title field renders as the page H1; do not add a separate H1 in the body.]
description: >-
  [1-2 sentence summary of what this guide covers and what the reader will
  achieve. Keep under 160 characters for SEO.]
---

# [Task-oriented title in sentence case — reads like a search query]

[TITLE GUIDANCE: Title should describe what the reader will DO, not what the feature IS. For SEO, capture the non-branded query — write the title a developer would actually search for, not "How to do X in Warp." Good: "How to set up Claude Code". Bad: "How to set up Claude Code in Warp".]

[One sentence: what you'll accomplish by following this guide. Mention Warp by name. Include a time estimate if possible (e.g., "takes about 10 minutes").]

[AEO GUIDANCE: If this guide is based on Peec, answer-engine prompts, search-query data, or AEO goals, create an AEO brief first using `.agents/skills/aeo_brief/SKILL.md`. Use the brief to preserve high-intent vocabulary naturally, translate awkward source-data phrasing into developer-friendly docs language, and decide whether this should be a new guide or an update to an existing page.]
76 changes: 76 additions & 0 deletions src/content/docs/agent-platform/cloud-agents/software-factory.mdx

@rachaelrenk rachaelrenk Jun 18, 2026

Open to feedback on the location for this conceptual doc. It currently lives under Agents > Cloud Agents > Software Factory.

@@ -0,0 +1,76 @@
---
title: Software factory
description: >-
  A software factory uses specialized agents to take new issues through triage, spec, implementation, and review, producing pull requests for your team to merge.
sidebar:
  label: "Software factory"
---

A software factory is a development system where specialized agents take new issues through triage, spec, implementation, and review, producing pull requests for your team to merge. Instead of every developer executing every step — reading each issue, writing specs, implementing changes, reviewing output — agents execute and humans review. The team's job shifts from doing the work to defining the process and raising the quality bar over time.

Warp uses this model to build Warp itself. The [agent dashboard at build.warp.dev](https://build.warp.dev) shows the work Warp's agents are tackling across the open source repository in real time.

Read on to learn about the two loops that make up a software factory, the agent roles involved, and how to decide if this model is right for your team.

## The two loops

A software factory has two loops that work together.

**The inner loop** is the execution loop: a triage agent, spec agent, implementation agent, and reviewer agent work in sequence, turning a new issue into a pull request for human review.

**The outer loop** is the improvement loop: a scheduled agent reviews past inner-loop runs, observes where maintainers made corrections, and opens a pull request to update the skill files that drive the inner-loop agents. Over time, the factory gets better without anyone manually rewriting prompts.

The inner loop ships software. The outer loop improves the factory that ships it.

## Agent roles

A software factory delegates work to specialized agents, each with a narrow responsibility:

* **Triage agent** - Reviews new issues for clarity and completeness. Labels issues as ready to implement, needs more information, or a duplicate. Flags open questions for the reporter before implementation begins, so the backlog stays clean and actionable.
* **Spec agent** - Drafts a product spec (user stories and acceptance criteria) and a tech spec (implementation strategy, relevant code locations, and edge cases). The specs serve as the blueprint for the implementation agent and the review criteria for the reviewer agent.
* **Implementation agent** - Builds against approved specs. Uses spec context to make better architectural decisions and raises a blocked status when specs are missing or ambiguous rather than guessing.
* **Reviewer agent** - Checks the implementation against specs, code conventions, and security requirements. Posts inline review comments and validates that acceptance criteria are met before a human reviewer opens the PR.

The [`warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss) repository is the complete reference implementation with all four roles deployed in a working system.

## How it works

Each agent role is backed by a **skill**, a markdown file checked into a Git repository that defines the agent's behavior: what to check, how to classify results, what to output, and when to escalate. Skills are versioned, reviewed as code changes, and composable across repositories.

The inner loop runs when a new issue is filed:

1. A webhook or GitHub Action triggers the triage agent. The agent runs in a cloud environment, analyzes the issue, and applies labels and comments.
2. When the issue is labeled `ready-to-spec`, the spec agent creates `PRODUCT.md` and `TECH.md` in a `specs/` directory in the repository.
3. A human reviews and approves the specs.
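4. When the implementation label is applied, the implementation agent opens a PR that includes the spec files alongside the code.
5. The reviewer agent checks the PR against the specs and posts inline review comments before a human reviews it.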
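
The label-driven hand-offs in the steps above amount to a small routing table. Here is a minimal Python sketch of that idea, under stated assumptions: the event names (`issue_opened`, `issue_labeled`, `pull_request_opened`) and the `next_agent` function are hypothetical, and `ready-to-implement` is assumed to be the implementation label. Only `ready-to-spec` is named in the steps themselves.

```python
from typing import Optional

# `ready-to-spec` comes from the steps above. `ready-to-implement` is an
# assumed name for the implementation label.
LABEL_TO_AGENT = {
    "ready-to-spec": "spec",
    "ready-to-implement": "implementation",
}


def next_agent(event: str, label: Optional[str] = None) -> Optional[str]:
    """Pick the agent role to run for an event, or None if no agent applies.

    Event names are hypothetical and stand in for whatever the webhook or
    GitHub Action actually receives.
    """
    if event == "issue_opened":
        return "triage"
    if event == "issue_labeled":
        # Unknown labels route to no agent rather than raising.
        return LABEL_TO_AGENT.get(label)
    if event == "pull_request_opened":
        return "reviewer"
    return None


print(next_agent("issue_opened"))
print(next_agent("issue_labeled", "ready-to-spec"))
print(next_agent("issue_labeled", "question"))
```

Keeping the mapping in one table makes the human approval gate explicit: the implementation agent runs only once someone applies the implementation label.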

Oz orchestrates each agent as a cloud run. Every run has its own environment (repository checkout, secrets, toolchain), its own permissions, and a session link your team can use to inspect what the agent did, steer it mid-run, or hand work back to a local session.

The outer loop runs on a schedule, not in response to events.

A scheduled cloud agent collects signals from past inner-loop runs and generates a diff to the relevant skill files. That diff goes through a normal pull request review before merging. Humans decide what improves; agents propose it.

## When to use a software factory

A software factory is a strong fit when:

* Your team has a repeatable development workflow: a backlog of issues with consistent shape, and a process you can write down and teach to an agent.
* The cost of a missed edge case in agent output is recoverable. The agent opens a pull request, not a deploy, so humans stay in control of what ships.
* You want to scale throughput without scaling headcount linearly, or you want to move faster on a large backlog with a small team.

Start with one agent role. Most teams start with a triage agent, which is the simplest loop to close correctly. A well-groomed backlog immediately benefits every developer on the team. Add spec, implementation, and reviewer agents as you build confidence in each step.

## Reference implementation

[`warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss) is Warp's open-source software factory platform for GitHub-hosted repositories. It includes a Vercel webhook layer, GitHub App, and skill files for every agent role in the loop: triage, spec, implementation, review, verification, and self-improvement. Get started with the [onboarding guide](https://github.com/warpdotdev/oz-for-oss/blob/main/docs/onboarding.md).

## Related pages

* [Build a triage agent for your issue backlog](/guides/agent-workflows/build-a-triage-agent) — Start the series with the simplest agent role.
* [Write product and tech specs with agents](/guides/agent-workflows/write-product-and-tech-specs-with-agents) — Write specs that guide implementation agents.
* [Set up your software factory: from triage to PR](/guides/agent-workflows/set-up-a-software-factory) — Connect the four roles into a working loop.
* [Run a software factory in the cloud](/guides/agent-workflows/run-a-software-factory-in-the-cloud) — Move the loop off your laptop with Oz.
* [Build a self-improving agent](/guides/agent-workflows/build-a-self-improving-agent) — Add the outer improvement loop.
* [Skills](/agent-platform/capabilities/skills) — How skill files work in Warp and Oz.
* [Cloud agents overview](/agent-platform/cloud-agents/overview) — Setting up cloud agents on Oz.
* [Deployment patterns](/agent-platform/cloud-agents/deployment-patterns) — Common architectures for cloud agent deployment.
@@ -0,0 +1,124 @@
---
title: Build a self-improving agent
description: >-
  Build an outer improvement loop where a scheduled agent reviews past runs, learns from team corrections, and proposes skill file updates, so your factory gets better over time.
sidebar:
  label: "Build a self-improving agent"
tags:
  - "agents"
  - "software-factory"
  - "cloud-agents"
  - "schedules"
---

A self-improving agent is the outer loop of a software factory: it watches how maintainers correct the inner-loop agents — relabeled issues, edited comments, changed code — and opens a pull request to improve the skill files that drive those agents. Every correction from a teammate becomes a proposed improvement to the factory.

This guide shows how to build the outer loop for a triage agent. The same pattern applies to any agent role.

:::note
The outer loop proposes improvements; it doesn't apply them silently. Every change to a skill file goes through a normal pull request review before merging. The team decides what improves; the agent proposes it.
:::

## Prerequisites

* A working inner loop with at least one agent running ([set up your software factory](/guides/agent-workflows/set-up-a-software-factory))
* A Warp account ([sign up at warp.dev](https://www.warp.dev))
* An Oz cloud environment with access to your repository ([create one](/agent-platform/cloud-agents/environments))

## Why principles beat rules

Most agent skills start as a list of rules: "if the reporter doesn't include a reproduction step, add `needs-info`." Rules overfit. They break when a situation arrives that the rules didn't anticipate, and a longer rule list makes the skill harder to maintain.

Effective skill files describe **principles**: durable ideas about how to approach a class of situation. A principle — "confirm that a bug report includes enough context for a fresh contributor to reproduce the issue" — transfers to situations the original rules didn't cover. When the outer loop proposes a skill update, it should always aim for a principle that generalizes, not a new rule that handles one more edge case.

## 1. Collect correction signals

The outer loop needs signals that the inner loop got something wrong. For a triage agent, the most useful signals are:

* **Maintainer relabels** - A maintainer changes a label the triage agent applied (for example, `needs-info` → `ready-to-implement`).
* **Maintainer reopens** - A maintainer reopens an issue the triage agent closed.
* **Follow-up comments** - A maintainer explains in a comment why the triage agent's assessment was wrong.

GitHub captures all of these in the issue timeline and audit log.

The [`update-triage`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-triage/SKILL.md) skill in `warpdotdev/oz-for-oss` includes a Python helper script (`aggregate_triage_feedback.py`) that collects these signals for the past N days. Copy and adapt it for your repository.
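
To make the relabel and reopen signals concrete, here is a minimal Python sketch that scans one issue's timeline events. It is not the `aggregate_triage_feedback.py` script, whose internals are not shown here. It assumes events shaped like GitHub's issue timeline payloads (`event`, `actor.login`, `label.name`) that have already been fetched in chronological order. The `find_corrections` name and the `triage-bot` login are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def find_corrections(events: Iterable[dict], agent_login: str) -> list[dict]:
    """Return maintainer corrections found in one issue's timeline.

    A correction is a non-agent actor removing a label the agent applied,
    or reopening an issue the agent closed.
    """
    agent_labels: set[str] = set()
    closed_by_agent = False
    corrections: list[dict] = []

    for ev in events:
        kind = ev.get("event")
        actor = (ev.get("actor") or {}).get("login")
        label = (ev.get("label") or {}).get("name")

        if kind == "labeled" and actor == agent_login:
            agent_labels.add(label)
        elif kind == "unlabeled" and actor != agent_login and label in agent_labels:
            corrections.append({"type": "relabel", "removed": label, "by": actor})
            agent_labels.discard(label)
        elif kind == "closed":
            closed_by_agent = actor == agent_login
        elif kind == "reopened" and closed_by_agent and actor != agent_login:
            corrections.append({"type": "reopen", "by": actor})
            closed_by_agent = False

    return corrections


# Hypothetical timeline: the agent asks for more info and closes the issue,
# then a maintainer relabels and reopens it.
sample = [
    {"event": "labeled", "actor": {"login": "triage-bot"}, "label": {"name": "needs-info"}},
    {"event": "unlabeled", "actor": {"login": "alice"}, "label": {"name": "needs-info"}},
    {"event": "labeled", "actor": {"login": "alice"}, "label": {"name": "ready-to-implement"}},
    {"event": "closed", "actor": {"login": "triage-bot"}},
    {"event": "reopened", "actor": {"login": "alice"}},
]

for correction in find_corrections(sample, agent_login="triage-bot"):
    print(correction)
```

Follow-up comments are left out of this sketch because deciding whether a comment disagrees with the agent takes judgment, which is a better fit for the update skill than for a filter.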

## 2. Create the update skill

1. Create `.agents/skills/update-triage/SKILL.md` with the following content as a starting point:

```markdown
---
name: update-triage
description: Review recent maintainer corrections to the triage agent's output and propose skill file updates that generalize from those corrections.
---

# Update triage skill

You are reviewing recent maintainer corrections to the triage agent's output.

## What to look for

For each correction:
1. Identify what the triage agent did (the original label or comment).
2. Identify what the maintainer did instead (the correction).
3. Ask: why would the triage agent have gotten this wrong?
4. Ask: what principle would prevent this class of error in the future?

## What to update

Update only the triage-issue-local companion skill for this repository.
Write principles, not rules. Do not update the shared triage-issue skill.

Commit changes to a branch named oz-agent/update-triage.
Do not push directly to main. Open a pull request for human review.
```

2. For the complete implementation — including the correction-collection script, write-surface restrictions, and PR-opening workflow — use the [`update-triage` skill from `warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-triage/SKILL.md) as your reference.

## 3. Schedule the outer loop

Weekly is a good starting cadence: it processes the previous week's corrections and opens PRs for review at the start of the week.

1. Create a scheduled cloud agent from the Oz CLI:

```bash
oz schedule create \
  --name update-triage-weekly \
  --skill update-triage \
  --environment YOUR_ENVIRONMENT_SLUG \
  --cron "0 9 * * 1"
```

Or, from the Oz web app: open **Agents** > **Schedules**, click **New schedule**, and set the skill, environment, and cron expression.

2. Replace `YOUR_ENVIRONMENT_SLUG` with the slug of your Oz environment.

See [Scheduled agents](/agent-platform/cloud-agents/triggers/scheduled-agents) for the full reference.
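
In standard cron syntax, `0 9 * * 1` fires at minute 0 of hour 9 on day-of-week 1, which is every Monday at 09:00. The Python sketch below works out the next firing time for that one expression as a sanity check. It is illustrative only: `next_monday_9am` is a hypothetical helper, not part of the Oz CLI, and it ignores time zones, so confirm which zone the scheduler uses in the reference above.

```python
from datetime import datetime, timedelta


def next_monday_9am(now: datetime) -> datetime:
    """Next firing time of cron `0 9 * * 1` strictly after `now`.

    Works on naive datetimes. The scheduler's time zone is not modeled here.
    """
    candidate = now.replace(hour=9, minute=0, second=0, microsecond=0)
    # datetime.weekday() returns 0 for Monday.
    days_ahead = (0 - now.weekday()) % 7
    candidate += timedelta(days=days_ahead)
    if candidate <= now:
        # Already at or past this Monday's 09:00, so use next week's.
        candidate += timedelta(days=7)
    return candidate


# Wednesday 2026-06-17 at 10:00 rolls forward to Monday 2026-06-22 at 09:00.
print(next_monday_9am(datetime(2026, 6, 17, 10, 0)))
```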

## 4. Review the proposed skill changes

Each time the outer loop runs, it opens a pull request with proposed changes to the `triage-issue-local` companion skill. Review that PR the same way you would review any other code change:

* Does the proposed principle generalize correctly, or is it another specific rule in disguise?
* Is the change scoped to the companion skill rather than the shared base skill?
* Does it improve the agent's reasoning, or does it just patch one edge case?

Merge the PR when the improvement is sound. Close it if the proposal is too narrow or incorrect.

Over time, the companion skill accumulates a clear description of how your team thinks about issue triage — not as a list of rules, but as a set of principles that a new maintainer or a new agent can learn from.

## Productivity tips

* **Give feedback in the right place** — The outer loop collects signals from GitHub. If you want the triage agent to learn from a correction, add a comment to the issue explaining why the original label was wrong. That comment becomes signal for the next outer-loop run.
* **Apply the pattern to other agent roles** — The [`update-pr-review`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-pr-review/SKILL.md) and [`update-dedupe`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-dedupe/SKILL.md) skills from `warpdotdev/oz-for-oss` apply the same outer-loop pattern to the PR reviewer and deduplication agents.
* **Don't skip the PR review** — The value of the outer loop is compounding improvement, not speed. A bad skill change that merges silently creates more work than it saves. Every proposed change should get a review.

## Next steps

* [What is a software factory?](/agent-platform/cloud-agents/software-factory) — How the outer improvement loop fits into the full factory model.
* [Set up your software factory: from triage to PR](/guides/agent-workflows/set-up-a-software-factory) — The inner loop the outer loop improves.
* [Run a software factory in the cloud](/guides/agent-workflows/run-a-software-factory-in-the-cloud) — Move the loop to Oz for team-wide visibility.
* [Scheduled agents](/agent-platform/cloud-agents/triggers/scheduled-agents) — Full reference for running cloud agents on a cadence.
* [`warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss) — The complete reference implementation including all outer-loop skills.
* [Skills](/agent-platform/capabilities/skills) — How skill files work in Warp and Oz.