docs: add software factory concept page and 5 guides #236
Open: rachaelrenk wants to merge 12 commits into main from docs/software-factory
b67a883 docs: add software factory concept page and guides (rachaelrenk)
6c767e6 Merge branch 'main' into docs/software-factory (rachaelrenk)
104c0ec fix: normalize sidebar.ts indentation for deployment-patterns, softwa… (rachaelrenk)
1366a65 fix: address oz-for-oss bot review comments (rachaelrenk)
449f140 Merge branch 'main' into docs/software-factory (rachaelrenk)
fbd37e7 docs(software-factory): address concept page review feedback (rachaelrenk)
389a15d Rename and retitle software factory guides; remove 'How to' from all … (rachaelrenk)
fa78648 Polish software-factory.mdx: tighten phrasing, add numbered steps to … (rachaelrenk)
c1fa3dc Address review comments on build-a-triage-agent and software-factory (rachaelrenk)
0dde269 Merge branch 'main' into docs/software-factory (rachaelrenk)
19be545 Reformat prerequisites sections to match screenshot style across all … (rachaelrenk)
d294e68 Remove unnecessary redirect: chain-a-software-factory was never publi… (rachaelrenk)
76 changes: 76 additions & 0 deletions
src/content/docs/agent-platform/cloud-agents/software-factory.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| --- | ||
| title: Software factory | ||
| description: >- | ||
| A software factory uses specialized agents to take new issues through triage, spec, implementation, and review, producing pull requests for your team to merge. | ||
| sidebar: | ||
| label: "Software factory" | ||
| --- | ||
|
|
||
| A software factory is a development system where specialized agents take new issues through triage, spec, implementation, and review, producing pull requests for your team to merge. Instead of every developer executing every step — reading each issue, writing specs, implementing changes, reviewing output — agents execute and humans review. The team's job shifts from doing the work to defining the process and raising the quality bar over time. | ||
|
|
||
| Warp uses this model to build Warp itself. The [agent dashboard at build.warp.dev](https://build.warp.dev) shows the work Warp's agents are tackling across the open source repository in real time. | ||
|
|
||
| Read on to learn about the two loops that make up a software factory, the agent roles involved, and how to decide if this model is right for your team. | ||
|
|
||
| ## The two loops | ||
|
|
||
| A software factory has two loops that work together. | ||
|
|
||
| **The inner loop** is the execution loop: a triage agent, spec agent, implementation agent, and reviewer agent work in sequence, turning a new issue into a pull request for human review. | ||
|
|
||
| **The outer loop** is the improvement loop: a scheduled agent reviews past inner-loop runs, observes where maintainers made corrections, and opens a pull request to update the skill files that drive the inner-loop agents. Over time, the factory gets better without anyone manually rewriting prompts. | ||
|
|
||
| The inner loop ships software. The outer loop improves the factory that ships it. | ||
|
|
||
| ## Agent roles | ||
|
|
||
| A software factory delegates work to specialized agents, each with a narrow responsibility: | ||
|
|
||
| * **Triage agent** - Reviews new issues for clarity and completeness. Labels each issue as ready to implement, needing more information, or a duplicate. Flags open questions for the reporter before implementation begins, so the backlog stays clean and actionable. | ||
| * **Spec agent** - Drafts a product spec (user stories and acceptance criteria) and a tech spec (implementation strategy, relevant code locations, and edge cases). The specs serve as the blueprint for the implementation agent and the review criteria for the reviewer agent. | ||
| * **Implementation agent** - Builds against approved specs. Uses spec context to make better architectural decisions and raises a blocked status when specs are missing or ambiguous rather than guessing. | ||
| * **Reviewer agent** - Checks the implementation against specs, code conventions, and security requirements. Posts inline review comments and validates that acceptance criteria are met before a human reviewer opens the PR. | ||
|
|
||
| The [`warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss) repository is the complete reference implementation with all four roles deployed in a working system. | ||
|
|
||
| ## How it works | ||
|
|
||
| Each agent role is backed by a **skill**, a markdown file checked into a Git repository that defines the agent's behavior: what to check, how to classify results, what to output, and when to escalate. Skills are versioned, reviewed as code changes, and composable across repositories. | ||
|
|
||
| The inner loop runs when a new issue is filed: | ||
|
|
||
| 1. A webhook or GitHub Action triggers the triage agent. The agent runs in a cloud environment, analyzes the issue, and applies labels and comments. | ||
| 2. When the issue is labeled `ready-to-spec`, the spec agent creates `PRODUCT.md` and `TECH.md` in a `specs/` directory in the repository. | ||
| 3. A human reviews and approves the specs. | ||
| 4. When the implementation label is applied, the implementation agent opens a PR that includes the spec files alongside the code. | ||
|
|
||
| Oz orchestrates each agent as a cloud run. Every run has its own environment (repository checkout, secrets, toolchain), its own permissions, and a session link your team can use to inspect what the agent did, steer it mid-run, or hand work back to a local session. | ||
|
|
||
| The outer loop runs on a schedule, not in response to events. | ||
|
|
||
| A scheduled cloud agent collects signals from past inner-loop runs and generates a diff to the relevant skill files. That diff goes through a normal pull request review before merging. Humans decide what improves; agents propose it. | ||
|
|
||
| ## When to use a software factory | ||
|
|
||
| A software factory is a strong fit when: | ||
|
|
||
| * Your team has a repeatable development workflow: a backlog of issues with consistent shape, and a process you can write down and teach to an agent. | ||
| * The cost of a missed edge case in agent output is recoverable. The agent opens a pull request rather than deploying anything, so humans stay in control of what ships. | ||
| * You want to scale throughput without scaling headcount linearly, or you want to move faster on a large backlog with a small team. | ||
|
|
||
| Start with one agent role. Most teams start with a triage agent, which is the simplest loop to close correctly. A well-groomed backlog immediately benefits every developer on the team. Add spec, implementation, and reviewer agents as you build confidence in each step. | ||
|
|
||
| ## Reference implementation | ||
|
|
||
| [`warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss) is Warp's open-source software factory platform for GitHub-hosted repositories. It includes a Vercel webhook layer, GitHub App, and skill files for every agent role in the loop: triage, spec, implementation, review, verification, and self-improvement. Get started with the [onboarding guide](https://github.com/warpdotdev/oz-for-oss/blob/main/docs/onboarding.md). | ||
|
|
||
| ## Related pages | ||
|
|
||
| * [Build a triage agent for your issue backlog](/guides/agent-workflows/build-a-triage-agent) — Start the series with the simplest agent role. | ||
| * [Write product and tech specs with agents](/guides/agent-workflows/write-product-and-tech-specs-with-agents) — Write specs that guide implementation agents. | ||
| * [Set up your software factory: from triage to PR](/guides/agent-workflows/set-up-a-software-factory) — Connect the four roles into a working loop. | ||
| * [Run a software factory in the cloud](/guides/agent-workflows/run-a-software-factory-in-the-cloud) — Move the loop off your laptop with Oz. | ||
| * [Build a self-improving agent](/guides/agent-workflows/build-a-self-improving-agent) — Add the outer improvement loop. | ||
| * [Skills](/agent-platform/capabilities/skills) — How skill files work in Warp and Oz. | ||
| * [Cloud agents overview](/agent-platform/cloud-agents/overview) — Setting up cloud agents on Oz. | ||
| * [Deployment patterns](/agent-platform/cloud-agents/deployment-patterns) — Common architectures for cloud agent deployment. |
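
The label-driven hand-offs described under "How it works" in `software-factory.mdx` can be sketched as a small routing function. This is a minimal illustration under stated assumptions, not the `warpdotdev/oz-for-oss` implementation: the simplified event shape, the `next_agent` name, and the `ready-to-implement` label are hypothetical (the page names only `ready-to-spec` and refers to "the implementation label" without naming it).

```python
from typing import Optional

# Maps a label to the agent role it triggers.
# `ready-to-spec` comes from the page; `ready-to-implement` is an assumed
# name for the otherwise unnamed implementation label.
LABEL_TO_ROLE = {
    "ready-to-spec": "spec",
    "ready-to-implement": "implementation",  # assumption
}


def next_agent(event: dict) -> Optional[str]:
    """Return the agent role to run for a simplified issue event, or None.

    The event shape ({"action": ..., "label": ...}) is hypothetical and
    much smaller than a real webhook payload.
    """
    action = event.get("action")
    if action == "opened":
        # A newly filed issue always goes to triage first.
        return "triage"
    if action == "labeled":
        # Only labels in the routing table start an agent run.
        return LABEL_TO_ROLE.get(event.get("label", ""))
    return None
```

Keeping the routing in a plain lookup table means that adding a role only requires a new label entry, which fits the page's advice to start with one agent role and add others over time.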
124 changes: 124 additions & 0 deletions
src/content/docs/guides/agent-workflows/build-a-self-improving-agent.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,124 @@ | ||
| --- | ||
| title: Build a self-improving agent | ||
| description: >- | ||
| Build an outer improvement loop where a scheduled agent reviews past runs, learns from team corrections, and proposes skill file updates — so your factory gets better over time. | ||
| sidebar: | ||
| label: "Build a self-improving agent" | ||
| tags: | ||
| - "agents" | ||
| - "software-factory" | ||
| - "cloud-agents" | ||
| - "schedules" | ||
| --- | ||
|
|
||
| A self-improving agent is the outer loop of a software factory: it watches how maintainers correct the inner-loop agents — relabeled issues, edited comments, changed code — and opens a pull request to improve the skill files that drive those agents. Every correction from a teammate becomes a proposed improvement to the factory. | ||
|
|
||
| This guide shows how to build the outer loop for a triage agent. The same pattern applies to any agent role. | ||
|
|
||
| :::note | ||
| The outer loop proposes improvements; it doesn't apply them silently. Every change to a skill file goes through a normal pull request review before merging. The team decides what improves; the agent proposes it. | ||
| ::: | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| * A working inner loop with at least one agent running ([set up your software factory](/guides/agent-workflows/set-up-a-software-factory)) | ||
| * A Warp account ([sign up at warp.dev](https://www.warp.dev)) | ||
| * An Oz cloud environment with access to your repository ([create one](/agent-platform/cloud-agents/environments)) | ||
|
|
||
| ## Why principles beat rules | ||
|
|
||
| Most agent skills start as a list of rules: "if the reporter doesn't include a reproduction step, add `needs-info`." Rules overfit. They break when a situation arrives that the rules didn't anticipate, and a longer rule list makes the skill harder to maintain. | ||
|
|
||
| Effective skill files describe **principles**: durable ideas about how to approach a class of situation. A principle — "confirm that a bug report includes enough context for a fresh contributor to reproduce the issue" — transfers to situations the original rules didn't cover. When the outer loop proposes a skill update, it should always aim for a principle that generalizes, not a new rule that handles one more edge case. | ||
|
|
||
| ## 1. Collect correction signals | ||
|
|
||
| The outer loop needs signals that the inner loop got something wrong. For a triage agent, the most useful signals are: | ||
|
|
||
| * **Maintainer relabels** - A maintainer changes a label the triage agent applied (for example, `needs-info` → `ready-to-implement`). | ||
| * **Maintainer reopens** - A maintainer reopens an issue the triage agent closed. | ||
| * **Follow-up comments** - A maintainer explains in a comment why the triage agent's assessment was wrong. | ||
|
|
||
| GitHub records all of these in the issue timeline: label changes, reopen events, and comments each appear as timeline events. | ||
|
|
||
| The [`update-triage`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-triage/SKILL.md) skill in `warpdotdev/oz-for-oss` includes a Python helper script (`aggregate_triage_feedback.py`) that collects these signals for the past N days. Copy and adapt it for your repository. | ||
|
|
||
| ## 2. Create the update skill | ||
|
|
||
| 1. Create `.agents/skills/update-triage/SKILL.md` with the following content as a starting point: | ||
|
|
||
| ```markdown | ||
| --- | ||
| name: update-triage | ||
| description: Review recent maintainer corrections to the triage agent's output and propose skill file updates that generalize from those corrections. | ||
| --- | ||
|
|
||
| # Update triage skill | ||
|
|
||
| You are reviewing recent maintainer corrections to the triage agent's output. | ||
|
|
||
| ## What to look for | ||
|
|
||
| For each correction: | ||
| 1. Identify what the triage agent did (the original label or comment). | ||
| 2. Identify what the maintainer did instead (the correction). | ||
| 3. Ask: why would the triage agent have gotten this wrong? | ||
| 4. Ask: what principle would prevent this class of error in the future? | ||
|
|
||
| ## What to update | ||
|
|
||
| Update only the triage-issue-local companion skill for this repository. | ||
| Write principles, not rules. Do not update the shared triage-issue skill. | ||
|
|
||
| Commit changes to a branch named oz-agent/update-triage. | ||
| Do not push directly to main. Open a pull request for human review. | ||
| ``` | ||
|
|
||
| 2. For the complete implementation — including the correction-collection script, write-surface restrictions, and PR-opening workflow — use the [`update-triage` skill from `warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-triage/SKILL.md) as your reference. | ||
|
|
||
| ## 3. Schedule the outer loop | ||
|
|
||
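| A weekly cadence is a good starting point: each run processes the previous week's corrections and opens a pull request for review at the start of the week. The cron expression `0 9 * * 1` used in this step runs every Monday at 09:00. | ||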
|
|
||
| 1. Create a scheduled cloud agent from the Oz CLI: | ||
|
|
||
| ```bash | ||
| oz schedule create \ | ||
| --name update-triage-weekly \ | ||
| --skill update-triage \ | ||
| --environment YOUR_ENVIRONMENT_SLUG \ | ||
| --cron "0 9 * * 1" | ||
| ``` | ||
|
|
||
| Or, from the Oz web app: open **Agents** > **Schedules**, click **New schedule**, and set the skill, environment, and cron expression. | ||
|
|
||
| 2. Replace `YOUR_ENVIRONMENT_SLUG` with the slug of your Oz environment. | ||
|
|
||
| See [Scheduled agents](/agent-platform/cloud-agents/triggers/scheduled-agents) for the full reference. | ||
|
|
||
| ## 4. Review the proposed skill changes | ||
|
|
||
| Each time the outer loop runs, it opens a pull request with proposed changes to the `triage-issue-local` companion skill. Review that PR the same way you would review any other code change: | ||
|
|
||
| * Does the proposed principle generalize correctly, or is it another specific rule in disguise? | ||
| * Is the change scoped to the companion skill rather than the shared base skill? | ||
| * Does it improve the agent's reasoning, or does it just patch one edge case? | ||
|
|
||
| Merge the PR when the improvement is sound. Close it if the proposal is too narrow or incorrect. | ||
|
|
||
| Over time, the companion skill accumulates a clear description of how your team thinks about issue triage — not as a list of rules, but as a set of principles that a new maintainer or a new agent can learn from. | ||
|
|
||
| ## Productivity tips | ||
|
|
||
| * **Give feedback in the right place** — The outer loop collects signals from GitHub. If you want the triage agent to learn from a correction, add a comment to the issue explaining why the original label was wrong. That comment becomes signal for the next outer-loop run. | ||
| * **Apply the pattern to other agent roles** — The [`update-pr-review`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-pr-review/SKILL.md) and [`update-dedupe`](https://github.com/warpdotdev/oz-for-oss/blob/main/.agents/skills/update-dedupe/SKILL.md) skills from `warpdotdev/oz-for-oss` apply the same outer-loop pattern to the PR reviewer and deduplication agents. | ||
| * **Don't skip the PR review** — The value of the outer loop is compounding improvement, not speed. A bad skill change that merges silently creates more work than it saves. Every proposed change should get a review. | ||
|
|
||
| ## Next steps | ||
|
|
||
| * [What is a software factory?](/agent-platform/cloud-agents/software-factory) — How the outer improvement loop fits into the full factory model. | ||
| * [Set up your software factory: from triage to PR](/guides/agent-workflows/set-up-a-software-factory) — The inner loop the outer loop improves. | ||
| * [Run a software factory in the cloud](/guides/agent-workflows/run-a-software-factory-in-the-cloud) — Move the loop to Oz for team-wide visibility. | ||
| * [Scheduled agents](/agent-platform/cloud-agents/triggers/scheduled-agents) — Full reference for running cloud agents on a cadence. | ||
| * [`warpdotdev/oz-for-oss`](https://github.com/warpdotdev/oz-for-oss) — The complete reference implementation including all outer-loop skills. | ||
| * [Skills](/agent-platform/capabilities/skills) — How skill files work in Warp and Oz. |
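
The maintainer-relabel signal from step 1 of `build-a-self-improving-agent.mdx` can be sketched as a single pass over an issue's label events. This is a hypothetical illustration, not the `aggregate_triage_feedback.py` helper from `warpdotdev/oz-for-oss`: the event dictionaries, the `triage-bot` login, and the `find_relabels` name are all assumptions made for the example.

```python
from typing import Iterable, List

AGENT_LOGIN = "triage-bot"  # hypothetical bot account name


def find_relabels(events: Iterable[dict], agent: str = AGENT_LOGIN) -> List[dict]:
    """Return corrections where a human removed a label the agent applied.

    Each event is a simplified dict with "actor", "event" ("labeled" or
    "unlabeled"), and "label" keys, ordered oldest first. This shape is an
    assumption, not GitHub's exact timeline payload.
    """
    agent_labels = set()  # labels the agent applied that are still present
    corrections = []
    for ev in events:
        actor, kind, label = ev["actor"], ev["event"], ev["label"]
        if actor == agent:
            if kind == "labeled":
                agent_labels.add(label)
            elif kind == "unlabeled":
                agent_labels.discard(label)
            continue
        if kind == "unlabeled" and label in agent_labels:
            # A human removed the agent's label: record a correction.
            agent_labels.discard(label)
            corrections.append({"removed": label, "added": None, "by": actor})
        elif (
            kind == "labeled"
            and corrections
            and corrections[-1]["added"] is None
            and corrections[-1]["by"] == actor
        ):
            # The same human then applied a replacement label.
            corrections[-1]["added"] = label
    return corrections
```

Each returned record pairs the label the agent chose with the maintainer's replacement, which is the raw material the update skill needs for its "what the triage agent did" and "what the maintainer did instead" questions.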
Open to feedback on the location for this conceptual doc. Currently in Agents > Cloud Agents > Software Factory