From 8b9042d7c417a844f67bbbc4642c3a95c0896f68 Mon Sep 17 00:00:00 2001 From: NiceCode666 Date: Thu, 30 Apr 2026 14:19:33 +0800 Subject: [PATCH 1/4] feat: add OpenClaw native plugin integration Adds the plumbing required to install AmphiLoop as a native OpenClaw plugin (Format: openclaw, not Claude Code bundle). The bundled skill exposes /amphiloop_build in any OpenClaw chat surface, delegating every code-writing step to the built-in coding-agent skill via a TODO-protocol working directory. - openclaw.plugin.json: native plugin manifest - package.json + openclaw-entry.mjs: classifies AmphiLoop as openclaw format (otherwise OpenClaw falls back to Claude Code bundle detection via .claude-plugin/plugin.json) - extensions/openclaw-skill/amphiloop-build/SKILL.md: 5-phase pipeline with worker dispatch via .amphiloop/AGENT_BRIEF.md + TODOS.md - extensions/openclaw-skill/README.md: install / verify docs - CLAUDE.md: adds OpenClaw Integration section - .gitignore: ignores docs/superpowers/ (internal planning notes) --- .gitignore | 6 +- CLAUDE.md | 15 ++ extensions/openclaw-skill/README.md | 106 +++++++++ .../openclaw-skill/amphiloop-build/SKILL.md | 210 ++++++++++++++++++ openclaw-entry.mjs | 12 + openclaw.plugin.json | 15 ++ package.json | 10 + 7 files changed, 373 insertions(+), 1 deletion(-) create mode 100644 extensions/openclaw-skill/README.md create mode 100644 extensions/openclaw-skill/amphiloop-build/SKILL.md create mode 100644 openclaw-entry.mjs create mode 100644 openclaw.plugin.json create mode 100644 package.json diff --git a/.gitignore b/.gitignore index 7a7e2df..362627e 100644 --- a/.gitignore +++ b/.gitignore @@ -24,4 +24,8 @@ __pycache__/ .env # Node (if npx skills is used locally) -node_modules/ \ No newline at end of file +node_modules/ + + +# superpowers +docs/superpowers/ \ No newline at end of file diff --git a/CLAUDE.md b/CLAUDE.md index 89a5492..2078de3 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -79,3 +79,18 @@ claude plugin install AmphiLoop | Command 
| When to Use | |---------|-------------| | **/build** | Unified entry point. Turn any task into a working bridgic-amphibious project. Accepts an optional domain flag (`/build --browser`) to inject pre-distilled context from `domain-context//`. Without a flag, auto-detects the domain from `TASK.md` (or falls back to a generic flow). Users may additionally supply their own domain references in `TASK.md`. | + +## OpenClaw Integration + +The AmphiLoop repository **is** an OpenClaw native plugin. Installing it (`openclaw plugins install --link`) auto-registers a bundled skill that exposes `/amphiloop_build ""` in any OpenClaw chat surface. + +| Aspect | How it works | +|--------|--------------| +| **Plugin install** | `openclaw plugins install /abs/path/AmphiLoop-02 --link` then `openclaw gateway restart`. Setup + verification details in `extensions/openclaw-skill/README.md`. | +| **Native classification** | Three small files at repo root — `openclaw.plugin.json` (manifest), `package.json` (with `openclaw.extensions: ["./openclaw-entry.mjs"]`), and `openclaw-entry.mjs` (no-op entry) — make OpenClaw classify AmphiLoop as **native** (`Format: openclaw`) instead of falling back to Claude Code bundle detection from `.claude-plugin/plugin.json`. | +| **Bundled skill** | The plugin manifest declares `"skills": ["./extensions/openclaw-skill"]`; OpenClaw auto-discovers `amphiloop-build/SKILL.md` under that directory. | +| **Orchestration** | The OpenClaw host model drives Phases 2–3 (config + explore) directly, reading the methodology from `agents/amphibious-*.md` via the `{baseDir}/../..` path resolution. | +| **Code generation (host ↔ coding-agent)** | Host writes `/.amphiloop/AGENT_BRIEF.md` (lists the bridgic-* SKILL.md files the worker MUST read for correct API usage) + `/.amphiloop/TODOS.md` (task list). Then opens **one** long-lived OpenClaw `coding-agent` session and sends a tiny pointer prompt. Worker reads brief, reads TODOs, completes them, ticks `[ ]` to `[x]`. 
| +| **Verify-fix loop** | Phase 5 verify failures get **appended** to the same `TODOS.md` as new `[ ] FIX-N: ...` entries; host then sends a one-line "continue" to the same long-lived worker session. Up to 3 fix rounds. | +| **Worker choice** | The skill asks the user at run start which worker to dispatch to (`claude` recommended, plus `codex` / `opencode` / `pi`). One worker per run, reused throughout. | +| **Existing files** | All Claude Code-only artifacts (`hooks/`, `.claude-plugin/`, `commands/build.md`, `scripts/hook/`) remain in place. The new `package.json` + `openclaw-entry.mjs` + `openclaw.plugin.json` are the only repo-root additions; they coexist with the Claude Code manifest without conflict. | diff --git a/extensions/openclaw-skill/README.md b/extensions/openclaw-skill/README.md new file mode 100644 index 0000000..6e9a49c --- /dev/null +++ b/extensions/openclaw-skill/README.md @@ -0,0 +1,106 @@ +# AmphiLoop OpenClaw Skill + +Drop-in OpenClaw skill that exposes AmphiLoop as the slash command `/amphiloop_build`. The skill orchestrates AmphiLoop's 5-phase pipeline inside OpenClaw and delegates every code-writing step to OpenClaw's built-in `coding-agent` skill (a worker CLI of your choice — Claude Code, Codex, OpenCode, or Pi). Host and worker communicate via shared files in the working directory (`.amphiloop/AGENT_BRIEF.md` + `.amphiloop/TODOS.md`), not by stuffing a giant prompt. + +## Install model + +The AmphiLoop repository **is** the OpenClaw plugin. Mounting the repository as a plugin (one command) automatically registers this bundled skill: + +1. Clone the AmphiLoop repo somewhere on disk. +2. `openclaw plugins install --link` — see Install below. +3. The skill resolves the AmphiLoop repo root automatically using the OpenClaw `{baseDir}` macro (`{baseDir}/../..`). Users do not need to provide an AmphiLoop path. 
+ +There is intentionally no auto-download / clawhub install path — the skill is colocated with the AmphiLoop methodology files (`agents/amphibious-*.md`, `scripts/run/*.sh`, `domain-context/*`, and the bridgic-* SDK skills under `skills/`) it needs at runtime, so they always travel together with the repo clone. + +## Dependencies + +You must have at least one of the following coding-agent worker CLIs installed and reachable on `PATH`: + +- **Claude Code** *(recommended)* — `npm install -g @anthropic-ai/claude-code` +- Codex — `npm install -g @openai/codex` +- OpenCode — see project docs +- Pi — `npm install -g @mariozechner/pi-coding-agent` + +OpenClaw's `coding-agent` skill must also be enabled in your OpenClaw config (`skills.entries.coding-agent.enabled: true`). + +## Install (recommended: as an OpenClaw plugin) + +```bash +# 1. Enable the built-in coding-agent skill we delegate to +openclaw config set skills.entries.coding-agent.enabled true --strict-json + +# 2. Install the AmphiLoop repo as a linked openclaw plugin +# (--link points at your local clone instead of copying — edits to +# SKILL.md / agents/* / scripts/* are picked up live) +openclaw plugins install /abs/path/to/AmphiLoop-02 --link + +# 3. Restart so the gateway loads the plugin +openclaw gateway restart +``` + +That registers the bundled skill `amphiloop-build` automatically — no `skills.load.extraDirs` entry needed. + +> **Note on plugin classification.** AmphiLoop ships both `openclaw.plugin.json` (OpenClaw native manifest) and `.claude-plugin/plugin.json` (the original Claude Code marker). For OpenClaw to classify the repo as a **native** plugin (not as a Claude Code bundle), it also needs `package.json` with `openclaw.extensions: ["./openclaw-entry.mjs"]` plus the tiny `openclaw-entry.mjs` no-op entry. Both files live at the AmphiLoop repo root and are committed. 
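The classification requirement in the note above can be checked mechanically before running `openclaw plugins install`. The following is a minimal sketch in Python, assuming only what the README states: the three file names at the repo root and the `openclaw.extensions` entry in `package.json`. The helper name `missing_native_plugin_pieces` is hypothetical and is not part of OpenClaw or AmphiLoop.

```python
import json
from pathlib import Path

# File names come from the README; this helper itself is a hypothetical illustration.
REQUIRED_FILES = ("openclaw.plugin.json", "package.json", "openclaw-entry.mjs")
ENTRY = "./openclaw-entry.mjs"


def missing_native_plugin_pieces(repo_root: str) -> list[str]:
    """Return human-readable problems; an empty list means every piece is present."""
    root = Path(repo_root)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]

    package_json = root / "package.json"
    if package_json.is_file():
        data = json.loads(package_json.read_text(encoding="utf-8"))
        # The README says package.json must carry openclaw.extensions
        # pointing at the no-op entry file.
        extensions = data.get("openclaw", {}).get("extensions", [])
        if ENTRY not in extensions:
            problems.append(f"package.json: openclaw.extensions does not list {ENTRY}")
    return problems
```

Calling `missing_native_plugin_pieces("/abs/path/to/AmphiLoop-02")` on a complete clone should return an empty list. It only inspects files on disk, so it says nothing about how OpenClaw itself classifies the plugin; `openclaw plugins inspect amphiloop` remains the authoritative check.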
+ +### Fallback: mount only this skill via `extraDirs` (no plugin) + +If you don't want a plugin install, you can mount just this skill directory: + +```bash +openclaw config set skills.load.extraDirs \ + '["/abs/path/to/AmphiLoop-02/extensions/openclaw-skill"]' \ + --strict-json --merge +openclaw gateway restart +``` + +The skill works the same; you just lose the `openclaw plugins enable/disable/inspect/list` lifecycle controls. + +## Verification + +```bash +# Plugin should be Format: openclaw, Status: loaded +openclaw plugins inspect amphiloop + +# Skill should be ✓ Ready +openclaw skills info amphiloop-build + +# Cross-check that coding-agent itself is also Ready +openclaw skills check 2>&1 | grep coding-agent +``` + +After both are ready, the slash command `/amphiloop_build` is live in any OpenClaw chat surface. + +## Usage + +In an OpenClaw chat: + +``` +/amphiloop_build "" +``` + +What happens next: + +1. The skill asks you to pick the coding worker for this run (`claude` / `codex` / `opencode` / `pi`). Reply with one word. +2. The skill asks for `` (where the generated project will live; offers a sensible default). +3. The skill drives Phases 2–3 (config + explore) directly using the OpenClaw host model. Outputs land at `/.bridgic/build_context.md` and `/.bridgic/exploration/exploration_report.md`. +4. **The skill writes the worker brief and TODO list** to `/.amphiloop/AGENT_BRIEF.md` and `/.amphiloop/TODOS.md`. The brief tells the worker which bridgic-* SKILL.md files to read so the API surface is correct; the TODO list is the work plan. +5. The skill opens **one** long-lived `coding-agent` session with the worker you picked, sends a tiny pointer prompt ("read AGENT_BRIEF.md, read TODOS.md, work through them"), and watches the worker tick TODOs to `[x]`. +6. Phase 5 verifies the generated project. 
If verification fails because of a code defect, the skill **appends new FIX entries to TODOS.md** and tells the worker (in the same long-lived session) to continue. Up to 3 fix rounds. +7. The skill closes the worker session and sends you a summary message with the project path and pass/fail status. + +## Design notes + +- **Communication channel is the working directory, not the prompt.** The kickoff prompt is ~200 chars and only points at `.amphiloop/AGENT_BRIEF.md` + `.amphiloop/TODOS.md`. Methodology, API references, and bug reports all flow through files. Benefit: worker isn't drowned in a giant context blob; host can monitor progress by re-reading TODOS.md; bug fixes are appended TODOs instead of fresh fix prompts. +- **Worker is forced to read the bridgic-* SKILL.md files.** The brief lists them as STEP 1 mandatory reads. Without this, the worker hallucinates APIs that don't exist in the bridgic-amphibious / bridgic-llms / bridgic-browser SDK. +- **Why a single long-lived session?** So the worker carries context from the initial generation into any follow-up fix. Restarting the worker per call would force it to re-derive everything from disk and risks stylistic drift. +- **Why ask the user for the worker?** Worker quality varies by task. Claude Code is the closest fit to AmphiLoop's coding methodology and is the recommended default; the others are available for users who prefer them. +- **Why does the skill never write code itself?** The host model (default Pi) is good at orchestration but weaker at sustained coding. All code production is routed to a worker that is purpose-built for it. +- **No write-conflict on TODOS.md.** Host writes to it only while the worker is sentinel-waiting (between turns); worker writes to it only while actively working. The sequential prompt/sentinel cycle enforces this. 
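The TODOS.md hand-off described in these notes can be illustrated with a few string helpers. This is a sketch in Python under stated assumptions: items follow the `- [ ] T1: ...` / `- [ ] FIX-N: ...` shape shown in the skill, and the function names (`open_items`, `tick`, `append_fixes`) are hypothetical. In the real flow the host and worker edit the file with their own tools, not with this code.

```python
import re

# Matches checklist lines such as "- [ ] T1: Scaffold ..." or "- [x] FIX-2: ...".
ITEM = re.compile(r"^- \[(?P<mark>[ x])\] (?P<id>[A-Za-z]+-?\d+): (?P<text>.*)$")


def open_items(todos: str) -> list[str]:
    """Return the ids of items still marked `[ ]`, top to bottom."""
    return [
        m["id"]
        for line in todos.splitlines()
        if (m := ITEM.match(line)) and m["mark"] == " "
    ]


def tick(todos: str, item_id: str) -> str:
    """Worker side: flip one item from `[ ]` to `[x]`."""
    return todos.replace(f"- [ ] {item_id}:", f"- [x] {item_id}:", 1)


def append_fixes(todos: str, summaries: list[str]) -> str:
    """Host side: append FIX entries, continuing the FIX-N numbering in the file."""
    used = [int(n) for n in re.findall(r"^- \[[ x]\] FIX-(\d+):", todos, flags=re.M)]
    start = max(used, default=0) + 1
    new_lines = [f"- [ ] FIX-{start + i}: {s}" for i, s in enumerate(summaries)]
    return todos.rstrip("\n") + "\n" + "\n".join(new_lines) + "\n"
```

Because `append_fixes` derives the next number from entries already in the file, FIX ids stay monotonic across fix rounds even after earlier entries have been ticked. The write-conflict guarantee itself comes from the sequential prompt/sentinel cycle, not from anything in this sketch.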
+ +## Reference + +- AmphiLoop skill source: `extensions/openclaw-skill/amphiloop-build/SKILL.md` +- OpenClaw native plugin manifest: `/openclaw.plugin.json`, `/package.json`, `/openclaw-entry.mjs` +- OpenClaw built-in skill we delegate to: `/skills/coding-agent/SKILL.md` +- OpenClaw slash-command docs: `/docs/tools/slash-commands.md` +- OpenClaw plugin CLI: `openclaw plugins --help` diff --git a/extensions/openclaw-skill/amphiloop-build/SKILL.md b/extensions/openclaw-skill/amphiloop-build/SKILL.md new file mode 100644 index 0000000..15187fa --- /dev/null +++ b/extensions/openclaw-skill/amphiloop-build/SKILL.md @@ -0,0 +1,210 @@ +--- +name: amphiloop-build +description: Drive AmphiLoop's 5-phase pipeline inside OpenClaw. Host orchestrates and verifies; the built-in `coding-agent` skill writes all code. Host and coding-agent communicate via shared files in the working directory (`.amphiloop/AGENT_BRIEF.md` + `.amphiloop/TODOS.md`), not by stuffing big prompts. ONE long-lived worker session for the whole run, sequential. Worker (claude/codex/opencode/pi) is chosen by the user at run start. +user-invocable: true +metadata: + openclaw: + emoji: 🌊 + requires: + anyBins: ["claude", "codex", "opencode", "pi"] + config: ["skills.entries.coding-agent.enabled"] +--- + +# AmphiLoop Build (OpenClaw) + +Turn a task description into a runnable bridgic-amphibious Python project. + +**Division of labor**: + +- **Host (you)** — the brain: reasoning, planning, verifying. You prepare a working directory with `.amphiloop/AGENT_BRIEF.md` (what the worker must read) + `.amphiloop/TODOS.md` (what the worker must do). When verify finds bugs, you append them as new TODO entries. +- **`coding-agent` skill** — the hands: a worker (`claude`/`codex`/`opencode`/`pi`) reads the brief, reads the bridgic-* SKILL.md files it points to so the API is correct, then works through TODOS.md ticking items off as it goes. + +**Communication channel** is the working directory, not the prompt. 
The prompt to the worker stays short (~200 chars: "read AGENT_BRIEF.md, read TODOS.md, work through them"). Methodology, API references, and bug reports all flow through files. This avoids context overflow, lets the host monitor progress by re-reading TODOS.md, and forces the worker to actually read the bridgic skill SKILL.md files instead of hoping its prompt mentioned them. + +**Single long-lived worker session** for the whole run (strictly sequential) so the worker carries context from initial generation into any follow-up fix. + +## Inputs / Outputs + +- **Inputs**: + - ``: free-form natural-language task description from the user + - ``: working directory the project will live under (ask the user; create it if it does not exist) +- **Output**: `//{amphi.py, main.py, log/, result/}` + +## Path resolution + +This skill ships **inside** the AmphiLoop repository at `/extensions/openclaw-skill/amphiloop-build/`. The OpenClaw `{baseDir}` macro at runtime resolves to the directory containing this SKILL.md, so the AmphiLoop repo root is always `{baseDir}/../..`. The skill reads the methodology files under `{baseDir}/../../agents/` and helper scripts under `{baseDir}/../../scripts/run/` directly via that path — **do not ask the user for an AmphiLoop path**. + +## Mandatory flow + +Execute the steps in order. Never skip; never re-order. Throughout, **do not write or edit code yourself** — every code-touching action goes through `` (opened in Step E). + +### Step A. Pick the coding worker (must ask the user) + +Send the user exactly: + +> About to start AmphiLoop build. Pick the coding worker for this run: +> `claude` (recommended) | `codex` | `opencode` | `pi` (not recommended). +> Reply with one word. + +Wait for the reply. Record it as ``. Reuse `` for the entire build run; do not switch mid-run. + +If the user replies with anything other than the four valid options, ask again rather than guessing. + +### Step B. Prepare the working directory + +1. 
Confirm `` with the user; offer a sensible default (e.g., a fresh `mktemp -d`) if they have not specified one. The AmphiLoop repo path does **not** need to be asked — it is `{baseDir}/../..` by construction. +2. Use the `write` tool to write `` verbatim into `/TASK.md`. +3. Capture the OpenClaw notification route of the current conversation: `notifyChannel`, `notifyTarget`, `notifyAccount`, `notifyReplyTo`, `notifyThreadId`. You will need them later for Step G. + +This step does not write code; do it directly. + +### Step C. Phase 2 — Config (host runs this directly) + +1. `read` the file `{baseDir}/../../agents/amphibious-config.md` to load the Phase 2 methodology. +2. Following that methodology, read `/TASK.md`, decide pipeline mode and any domain context, and `write` the result to `/.bridgic/build_context.md`. + +This produces a markdown decision record, not code. Do it directly. + +### Step D. Phase 3 — Explore (host runs this directly) + +1. `read` the file `{baseDir}/../../agents/amphibious-explore.md` to load the Phase 3 methodology. +2. Following that methodology, use `bash` to observe the target environment (running existing tools, taking notes, capturing samples). `write` the consolidated observations to `/.bridgic/exploration/exploration_report.md`. + +Writing notes is not coding — do it directly. + +**Exception**: if the exploration genuinely needs a probe script (Python / JS / shell) to be authored to make further observations possible, treat that probe-script authorship as the **first** code-writing action of the run and jump to Step E0/E now (use the probe-script as the first TODO). + +### Step E0. Prepare the work template (★ v8 ★ host writes the brief and TODO list before any coding) + +Before opening any worker session, host must write two communication files into `/.amphiloop/`. These are the entire interface between host and worker for this run. + +1. **Write `/.amphiloop/AGENT_BRIEF.md`** — a static reference brief. Use the `write` tool. 
Recommended structure: + + ```markdown + # Worker brief + + You are doing the coding for an AmphiLoop bridgic-amphibious project build. + Working directory: + + ## STEP 1 — read the bridgic API surface FIRST (mandatory before any code) + + Use your file-read tool on each of these files in order. Do not skip any. + + - {baseDir}/../../skills/bridgic-amphibious/SKILL.md + - {baseDir}/../../skills/bridgic-llms/SKILL.md + - {baseDir}/../../skills/bridgic-browser/SKILL.md ← only if the task involves browser automation; skip otherwise + - {baseDir}/../../agents/amphibious-code.md ← the coding methodology you must follow + - {baseDir}/../../domain-context//code.md ← only if a matching domain context exists + + The bridgic-* SKILL.md files define the actual class names, method signatures, and APIs you must use. Inventing API surface that is not in those files will fail. + + ## STEP 2 — read this run's context + + - /TASK.md + - /.bridgic/build_context.md + - /.bridgic/exploration/exploration_report.md + + ## STEP 3 — work through TODOS.md + + Open /.amphiloop/TODOS.md. Pick the topmost open `[ ]` item, complete it, then EDIT TODOS.md in place to change its `[ ]` to `[x]`. Save. Move to the next open item. Repeat until no open items remain. + + ## STEP 4 — when all TODOs are done + + Print exactly this line on stdout and then wait for further input. DO NOT exit: + `### AMPHI-TASK-DONE ###` + + The orchestrator may append new `[ ]` items to TODOS.md later (e.g. fixes after verification). When you receive a "continue" instruction, re-open TODOS.md, find the new open items, and resume from STEP 3. + + ## Output layout + + Final deliverable goes under `//`: + amphi.py + main.py + log/ + result/ + ``` + + When writing this file, substitute real absolute paths for `{baseDir}/../..` and `` and `` (drop lines whose source files don't exist for the current run). + +2. **Write `/.amphiloop/TODOS.md`** — the initial Phase 4 task list. Use the `write` tool. 
Derive 5–8 items by mapping the Phase 1–4 sections of `{baseDir}/../../agents/amphibious-code.md` into checkboxes. Tailor wording to the current task. A typical seed: + + ```markdown + # AmphiLoop build TODOs + + - [ ] T1: Scaffold `/` via the bridgic-amphibious CLI (per skills/bridgic-amphibious/SKILL.md). Create empty log/ and result/ dirs. + - [ ] T2: In amphi.py, define the CognitiveContext for this task following build_context.md. + - [ ] T3: In amphi.py, implement on_workflow yielding ActionCalls that mirror the Operation Sequence in exploration_report.md. + - [ ] T4: In amphi.py, implement on_agent think_units for AMPHIFLOW fallback per the methodology. + - [ ] T5: Register task tools (FunctionToolSpec) for any domain-specific operations the workflow needs. + - [ ] T6: Implement helper functions for parsing VOLATILE refs from ctx.observation. + - [ ] T7: Write main.py with LLM init (per skills/bridgic-llms/SKILL.md), tools assembly, and the agent.arun(...) call. + - [ ] T8: Run `uv run main.py` once dry to confirm it boots without import or syntax errors. + ``` + +3. Send a short progress note to the user: "Worker brief and TODO list written to `/.amphiloop/`. Opening coding-agent session next." + +### Step E. Phase 4 — Code (★ open the long-lived coding-agent session with a SHORT prompt ★) + +This is the first code-writing action of the run (unless Step D opened the session for a probe). The goal: open one worker session, capture ``, and submit a tiny pointer prompt that hands the worker over to AGENT_BRIEF.md + TODOS.md. + +1. **Invoke the `coding-agent` skill.** Tell it: + - `Worker: ` + - `Workdir: ` (so the worker starts in the right place; `cd` into it via the spawn config) + - `Mode: INTERACTIVE` — launch the worker in REPL/interactive mode, **not** a one-shot. Concretely: `claude` must be launched **without** `--print`; `codex` **without** `exec`; `pi` and `opencode` in their REPL form. 
PTY rules and exact spawn flags are coding-agent's responsibility — do not hand-roll bash here. + - `Background: yes` (coding-agent's hard rule). + - `This is a long-lived orchestrated session.` Tell coding-agent: do **not** require the worker to self-notify the user via `openclaw message send` per task. The orchestrator (this skill) will summarize at Step G. This deviation from the standard Mandatory Pattern is sanctioned by coding-agent's own contract ("if you do not have a trustworthy notification route, say so and do not claim that completion will notify the user automatically"). + - `Capture the OpenClaw process sessionId returned by bash background:true and report it back so the orchestrator can remember it as .` + +2. Once you have ``, submit the **kickoff prompt** via `process action:submit sessionId: data:`. The prompt is short and is the SAME shape every time: + + > Working directory is ``. First, read `.amphiloop/AGENT_BRIEF.md` end-to-end and follow it (it tells you which SKILL.md files to read so you know the bridgic API surface, and which context files to read for this task). Then read `.amphiloop/TODOS.md` and work through every open `[ ]` item top-to-bottom, editing TODOS.md to change `[ ]` to `[x]` as you finish each one. When all items are `[x]`, print exactly `### AMPHI-TASK-DONE ###` on its own line and wait for further input. DO NOT exit or terminate. + + Do NOT paste methodology, build_context, or exploration data into the prompt — they are reachable from AGENT_BRIEF.md. + +3. Send a short progress note to the user before submitting ("Phase 4: handing TODOs to the worker — read TODOS.md to follow along"). + +4. Monitor with `process action:log sessionId:` until the sentinel `### AMPHI-TASK-DONE ###` appears. Optionally `read` `/.amphiloop/TODOS.md` periodically to watch `[x]` count rise. + +5. **Do NOT kill the session.** Continue to Step F. + +### Step F. Phase 5 — Verify (host runs verify; bugs flow back via TODOS.md ★) + +1. 
`read` the file `{baseDir}/../../agents/amphibious-verify.md` to load the Phase 5 methodology. +2. Run `{baseDir}/../../scripts/run/monitor.sh` against the generated project via `bash` (or follow whatever execution recipe the methodology prescribes for this run). Collect the output. +3. Decide: + - **Pass** — proceed to Step G. + - **Fail, root cause is in the generated code** (logic error, missing import, wrong API call, etc.): + - Send a short progress note to the user ("Phase 5 verify failed; appending FIX TODOs (attempt N/3) and asking worker to continue"). + - **Append** one or more FIX entries to `/.amphiloop/TODOS.md` (use `read` then `write` the full new content; the worker is sentinel-waiting and not touching the file right now, so there is no write conflict). Format each entry as: + ```markdown + - [ ] FIX-N: : + + ``` + Use a stable monotonic N across attempts (FIX-1, FIX-2, ...). + - Submit a one-line continue prompt to the **same** `` via `process action:submit sessionId: data:`. The prompt is literally: + > New FIX entries appended to `.amphiloop/TODOS.md`. Re-read TODOS.md and resume from the first open `[ ]` item. Same rules as before: tick each item to `[x]` as you finish, then print `### AMPHI-TASK-DONE ###` and wait. DO NOT exit. + - Monitor with `process action:log` until the sentinel reappears. + - Re-run verification (return to Step F.2). + - **Fail, root cause is NOT code** (missing env var, missing credential, network issue, missing input data) — fix it yourself with `bash` / `write`. Do not append a FIX TODO and do not submit to the worker. Re-run verification. +4. Cap fix attempts at 3. After 3 consecutive code-fix attempts that still fail, stop the loop and proceed to Step G with a `fail` status. + +### Step G. Cleanup and report + +1. Kill the long-lived worker session: `process action:kill sessionId:`. +2. Send a final summary to the user with `openclaw message send` (use the notification route captured in Step B). 
Include: + - Pass / fail status + - Path to the generated project (`//`) + - Number of coding-agent turns used (1 for the Phase 4 prompt, plus N for fix attempts) + - If `fail`: the last failure summary so the user knows what to investigate + +## Common constraints + +- **Never write code yourself.** All code-writing — Phase 4 generation, Phase 5 fixes, Phase 3 probe scripts, anything else — must go through `process:submit` to `` and the TODO list. Do not edit `.py` / `.ts` / `.sh` files with the host's `write` or `edit` tools. (`/.amphiloop/AGENT_BRIEF.md` and `TODOS.md` are written by the host — those are protocol files, not code.) +- **All worker direction flows through TODOS.md.** Methodology, API references, and bug reports go into `/.amphiloop/AGENT_BRIEF.md` and `/.amphiloop/TODOS.md`, not into the prompt. The kickoff prompt and continue prompt are deliberately tiny pointers to those files. +- **One worker, one sessionId, for the whole run.** `` is chosen once in Step A; `` is opened once in Step E (or earlier in a Step D probe) and reused throughout. +- **Strictly sequential, no concurrent file writes.** The worker handles one prompt at a time. The host writes to TODOS.md only while the worker is sentinel-waiting; the worker writes to TODOS.md only while it is actively working. This is enforced by the sequential prompt/sentinel cycle, so there is no concurrent edit conflict on the file. +- **Sentinel discipline.** Every prompt you submit ends with the requirement to print `### AMPHI-TASK-DONE ###` so you have a deterministic completion signal. If after a generous wait the sentinel has not appeared but the expected files exist and the worker output has been quiet, treat that as completion (sentinel missed) and proceed. +- **Verify the worker actually read the brief.** After the kickoff prompt, scan `process:log` for evidence the worker called its file-read tool on the bridgic-* SKILL.md files listed in AGENT_BRIEF.md. 
If it skipped them (jumped straight to coding), inject one corrective `process:submit`: "You skipped the brief. STOP and read `.amphiloop/AGENT_BRIEF.md` STEP 1 files now before any further code." This guards against the v6 failure mode of the worker writing wrong APIs. +- **Do not re-implement coding-agent.** Do not write `claude --print '...'` / `codex exec '...'` style bash here — coding-agent's SKILL.md owns spawn details (PTY, background, flags). This skill only tells coding-agent **what** to launch and **how** to drive it via `process:submit`. +- **Progress visibility.** Send a one-line progress note before each `process:submit` so the user can follow the run. +- **Notification deviation.** Tell coding-agent up front this is a long-lived orchestrated session and the orchestrator will summarize at Step G. Do not have the worker self-notify per task. diff --git a/openclaw-entry.mjs b/openclaw-entry.mjs new file mode 100644 index 0000000..e8bf44b --- /dev/null +++ b/openclaw-entry.mjs @@ -0,0 +1,12 @@ +import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry"; + +export default definePluginEntry({ + id: "amphiloop", + register: () => { + // No-op: AmphiLoop's behavior is entirely in the bundled skill at + // extensions/openclaw-skill/amphiloop-build/SKILL.md. This entry exists only + // so OpenClaw classifies AmphiLoop as a native plugin (via package.json's + // openclaw.extensions) instead of falling through to Claude Code bundle + // detection from .claude-plugin/plugin.json. 
+ }, +}); diff --git a/openclaw.plugin.json b/openclaw.plugin.json new file mode 100644 index 0000000..3b79ad8 --- /dev/null +++ b/openclaw.plugin.json @@ -0,0 +1,15 @@ +{ + "id": "amphiloop", + "name": "AmphiLoop", + "description": "Drive AmphiLoop's 5-phase pipeline inside OpenClaw via /amphiloop_build; delegates every code-writing step to the built-in coding-agent skill via a TODO-protocol working directory.", + "version": "1.0.0", + "activation": { + "onStartup": false + }, + "skills": ["./extensions/openclaw-skill"], + "configSchema": { + "type": "object", + "additionalProperties": false, + "properties": {} + } +} diff --git a/package.json b/package.json new file mode 100644 index 0000000..1a34dfe --- /dev/null +++ b/package.json @@ -0,0 +1,10 @@ +{ + "name": "amphiloop-openclaw-plugin", + "version": "1.0.0", + "private": true, + "type": "module", + "description": "OpenClaw plugin manifest for AmphiLoop. Required so OpenClaw classifies AmphiLoop as a native plugin instead of falling back to Claude Code bundle detection (.claude-plugin/plugin.json). All actual logic lives in the bundled skill at extensions/openclaw-skill/amphiloop-build/.", + "openclaw": { + "extensions": ["./openclaw-entry.mjs"] + } +} From 2e8f73fd5ee70818e6cdef57ed04e6d747a7ac0d Mon Sep 17 00:00:00 2001 From: NiceCode666 Date: Thu, 30 Apr 2026 14:19:58 +0800 Subject: [PATCH 2/4] docs: Claude Code documentation adjustments for OpenClaw flow mirror (A1-A10) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Foundation for migrating AmphiLoop from a Claude Code-only plugin to a dual-platform plugin. Makes commands/build.md + agents/amphibious-*.md the canonical source of truth for pipeline methodology so a future PR can reduce the OpenClaw SKILL.md to a thin wrapper that delegates phase methodology to those files. 
A1: tool-neutral question wording in amphibious-config.md so OpenClaw (no AskUserQuestion equivalent) and Claude Code follow identical semantics. Forbids duplicating the question as chat text alongside the tool call.

A2: explicit canonical exploration output dir /.bridgic/explore/ in amphibious-explore.md (was implicit; OpenClaw skill had drifted to .bridgic/exploration/).

A3 + A4: build_context.md refresh ownership moved from commands/build.md (orphaned paragraph) into amphibious-explore.md and amphibious-code.md as explicit closing sections of each agent's contract.

A5: orphaned 'After Phases 3 and 4, refresh build_context.md' paragraph in commands/build.md removed (work now owned by agents).

A6: amphibious-verify.md Phase 4.2 syntax check uses 'uv run --project python -m py_compile' so it runs against the same Python interpreter the project's uv env was set up with.

A7: .claude-plugin/plugin.json registers all 3 dispatchable agents (amphibious-explore, amphibious-code, amphibious-verify). amphibious-config.md is intentionally omitted (interactive, runs inline); a frontmatter note documents that.

A8: amphibious-code.md adds a Best Practice in Phase 2.3 forbidding wait_for(time_seconds=N) for user-action waits (login, QR scan, CAPTCHA). HumanCall is the correct primitive: bridgic framework blocks the yield until the user responds.

A9: amphibious-explore.md makes the runtime mapping HUMAN: -> HumanCall explicit in the Operation Sequence section so explorers don't record wait_for fallbacks.

A10: amphibious-verify.md gains an OpenClaw addendum describing the sentinel + flag-file bridge protocol the long-lived coding-agent worker uses to surface monitor.sh exit-2 (human input required) events to the host orchestrator. No effect on Claude Code subagent execution.
--- .claude-plugin/plugin.json | 4 +++- agents/amphibious-code.md | 19 +++++++++++++++++-- agents/amphibious-config.md | 12 ++++++++---- agents/amphibious-explore.md | 15 ++++++++++++++- agents/amphibious-verify.md | 22 +++++++++++++++++++++- commands/build.md | 5 ----- 6 files changed, 63 insertions(+), 14 deletions(-) diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 53dc20c..3ef507c 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -6,7 +6,9 @@ "./commands/" ], "agents": [ - "./agents/amphibious-code.md" + "./agents/amphibious-explore.md", + "./agents/amphibious-code.md", + "./agents/amphibious-verify.md" ], "skills": [] } diff --git a/agents/amphibious-code.md b/agents/amphibious-code.md index e1fc912..d19eef0 100644 --- a/agents/amphibious-code.md +++ b/agents/amphibious-code.md @@ -156,9 +156,16 @@ An async generator that yields `ActionCall` / `AgentCall` / `HumanCall`. Transla yield HumanCall(prompt="Confirm before deleting?") # Human-only ``` -5. **Compute dynamic values at runtime.** Relative phrases in the task description ("past 7 days", "today", "last 30 days") must be computed inside the generator with `datetime` etc., not hardcoded at write time. +5. **`HumanCall` vs `wait_for` — strict separation.** Two different waits exist; do not confuse them. -6. **Keep generator-internal logic minimal.** Code between yields runs in the generator body. **If it raises, the generator is unrecoverable** — `asend()` cannot resume past an exception, so AMPHIFLOW skips per-step retry and jumps directly to full `on_agent` fallback. Keep inline code to variable assignment and pure helpers; push risky operations (network calls, parsing untrusted input) into `ActionCall`-wrapped tools where they can be retried. + - **Waiting for the UI to settle** (page render, click reaction, animation): use `yield ActionCall("wait_for", time_seconds=N)` or condition-based `wait_for(text=..., text_gone=..., selector=...)`. Time-bounded. 
+ - **Waiting for the user to act** (login, QR-code scan, CAPTCHA solve, destructive-action confirmation): use `yield HumanCall(prompt="...")`. The bridgic framework **truly blocks** that yield until a human response arrives. You do not — and must not — guess how long the user will take. + + **Forbidden**: using `wait_for(time_seconds=N)` (any N) to wait for a user action. User logins can take 5 seconds or 5 minutes; a fixed timer either fails too fast or wastes time. Any exploration step tagged `HUMAN:` MUST map to `HumanCall` in the generated code, never to `wait_for`. + +6. **Compute dynamic values at runtime.** Relative phrases in the task description ("past 7 days", "today", "last 30 days") must be computed inside the generator with `datetime` etc., not hardcoded at write time. + +7. **Keep generator-internal logic minimal.** Code between yields runs in the generator body. **If it raises, the generator is unrecoverable** — `asend()` cannot resume past an exception, so AMPHIFLOW skips per-step retry and jumps directly to full `on_agent` fallback. Keep inline code to variable assignment and pure helpers; push risky operations (network calls, parsing untrusted input) into `ActionCall`-wrapped tools where they can be retried. ### 2.4 `on_agent` — only for `AGENT` or `AMPHIFLOW` @@ -321,3 +328,11 @@ if __name__ == "__main__": 6. **No `config.py` by default.** Inline `os.getenv` in main.py. Split into a `config.py` only if env loading grows complex (many vars, validation, defaults). --- + +## Update build_context.md + +After Phase 4 completes, edit `/.bridgic/build_context.md`: +1. Replace the `## Outputs → generator_project` placeholder line `generator_project: (filled by Phase 4)` with the absolute path to `//`. +2. Refresh the `env_ready:` block: read `/pyproject.toml` and replace the content under `--- pyproject.toml ---` with its current text. This keeps Phase 5 (verify) accurate about which packages are installed. 
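The placeholder replacement in step 1 above can be sketched in Python. This is a minimal sketch under stated assumptions, not the agent's actual tooling: the helper name `fill_placeholder` and its parameters are hypothetical, and it assumes the placeholder appears in `build_context.md` exactly as `<key>: (filled by Phase N)`, which is the form quoted above.

```python
from pathlib import Path


def fill_placeholder(build_context_path, key, phase, value):
    """Replace '<key>: (filled by Phase <phase>)' with '<key>: <value>'.

    Hypothetical helper: the name and signature are illustrative only.
    """
    path = Path(build_context_path)
    placeholder = f"{key}: (filled by Phase {phase})"
    text = path.read_text(encoding="utf-8")
    if placeholder not in text:
        # Fail loudly instead of silently leaving a stale placeholder behind.
        raise ValueError(f"placeholder not found: {placeholder!r}")
    # Replace only the first occurrence so unrelated text is left untouched.
    path.write_text(text.replace(placeholder, f"{key}: {value}", 1), encoding="utf-8")
```

Raising when the placeholder is missing is a design choice for this sketch: it turns a second run, or a drifted file format, into a visible error rather than a silent no-op.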
+ +--- diff --git a/agents/amphibious-config.md b/agents/amphibious-config.md index 33b4213..27d2781 100644 --- a/agents/amphibious-config.md +++ b/agents/amphibious-config.md @@ -13,6 +13,8 @@ tools: ["AskUserQuestion", "Bash", "Read", "Write"] # Amphibious Config Agent +> **Not a dispatchable subagent.** This agent is interactive (uses `AskUserQuestion` / equivalent ask-the-user mechanism) and runs **inline** in the calling command's thread. Do not register it under `.claude-plugin/plugin.json` `agents:` — only `amphibious-explore`, `amphibious-code`, and `amphibious-verify` are dispatchable. + You are a build-pipeline configuration specialist. Your job is to interactively determine project-mode / LLM / domain-specific settings, run environment setup, and write the consolidated `build_context.md` that every later agent reads. ## Input @@ -33,7 +35,9 @@ This agent runs interactively from the very first step; there are no startup fil ## Step 1: Project Mode -Present via `AskUserQuestion`: +**Ask the user** with these exact options. Use the platform's structured-question tool if one is available (e.g. Claude Code's `AskUserQuestion`); otherwise send the question as a single message and wait for the user's reply. **Do not also emit the question as chat text alongside the tool call** — the question is sent once. + +Question: > Choose project mode: > @@ -54,13 +58,13 @@ Decide whether to set up LLM — set `llm_configured` to `yes` or `no`. ``` - Exit 0: variables present — proceed. - - Exit 1: list missing variables; create `.env`, ask the user to fill it, re-run the check; do not proceed until exit 0. + - Exit 1: use the same ask-the-user mechanism as Step 1. Tell the user the missing variables and ask whether to (a) write a `.env` skeleton for them to fill, or (b) wait while they `export` the vars in their shell. Then re-run `check-dotenv.sh` until exit 0; do not proceed until exit 0. Set `llm_configured = yes`. 
- **If `project_mode == workflow`**: analyze the task description. - - **If task contains AI-suggestive operations** (e.g. "extract key information", "analyze content", "generate a report"), ask via `AskUserQuestion`: + - **If task contains AI-suggestive operations** (e.g. "extract key information", "analyze content", "generate a report"), ask using the same mechanism as Step 1: > Your task description mentions operations that may benefit from AI/LLM capabilities (e.g. content analysis, intelligent extraction). Configure an LLM? > @@ -74,7 +78,7 @@ Decide whether to set up LLM — set `llm_configured` to `yes` or `no`. ## Step 3: Domain-specific Configuration -If `SELECTED_DOMAIN` is resolved AND `{PLUGIN_ROOT}/domain-context//config.md` exists, read that file and follow its instructions verbatim — it tells you which questions to ask the user (still via `AskUserQuestion`) and which keys to record. Capture each answer as `domain_config[] = `. +If `SELECTED_DOMAIN` is resolved AND `{PLUGIN_ROOT}/domain-context//config.md` exists, read that file and follow its instructions verbatim — it tells you which questions to ask the user (using the same ask-the-user mechanism as Step 1) and which keys to record. Capture each answer as `domain_config[] = `. If no `config.md` exists, skip this step and treat `domain_config` as empty. diff --git a/agents/amphibious-explore.md b/agents/amphibious-explore.md index 1ea373f..4536879 100644 --- a/agents/amphibious-explore.md +++ b/agents/amphibious-explore.md @@ -99,6 +99,8 @@ When you encounter a handoff during exploration, you **MUST** request human: - **Request** specific human intervention. - **Resume** exploration from the same point once the human confirms the obstacle is cleared. +Each `HUMAN:` step in the Operation Sequence will map **one-to-one** to a `HumanCall` yield in the generated code (see `amphibious-code.md` Phase 2.3). 
The framework blocks at that yield until the user responds — you do not need to estimate how long the user will take, and you must **not** record a fallback `wait_for(time_seconds=...)` to "give the user time". + Finally, record only the **minimal chain of operations** needed to achieve the goal. Exclude: - Observation commands (they happen on every step; they are not part of the plan). @@ -128,7 +130,13 @@ After exploration, run the cleanup protocol recorded in the Domain Guidance to r ## Generate Report -Write `exploration_report.md` plus all saved artifact files. The report has **up to three sections** — §1 is optional, §3 is omitted when no volatile data was captured. +Write all outputs into `/.bridgic/explore/`: +- `exploration_report.md` — the report itself +- artifact files (e.g. `list_state.txt`, `detail_state.txt`) at the same directory level + +Do not nest under any further subdirectory. The path `/.bridgic/explore/` is the canonical location read by `amphibious-code.md` Phase 3 and `amphibious-verify.md`. + +The report has **up to three sections** — §1 is optional, §3 is omitted when no volatile data was captured. ### 1. Domain Guidance @@ -206,3 +214,8 @@ A pseudocode-style list. Use indentation and control-flow keywords (`FOR`, `WHIL ### 3. Artifact Files List saved artifact paths. Each entry annotates **what extractable content** the file contains — enough for a reader to know which file documents which volatile data without opening every one. + +## Update build_context.md + +After writing the report and artifacts, edit `/.bridgic/build_context.md`: +1. Replace the `## Outputs → exploration_report` placeholder line `exploration_report: (filled by Phase 3)` with the absolute path to `exploration_report.md` (e.g. `exploration_report: /abs/path/.bridgic/explore/exploration_report.md`). 
diff --git a/agents/amphibious-verify.md b/agents/amphibious-verify.md index 446fcd6..0d24718 100644 --- a/agents/amphibious-verify.md +++ b/agents/amphibious-verify.md @@ -184,9 +184,11 @@ Search all `.py` files in the project for `# --- VERIFY_ONLY_BEGIN ---` and `# - ### 4.2 Final Syntax Check ```bash -find -name "*.py" -exec python -m py_compile {} + +find -name "*.py" -exec uv run --project "" python -m py_compile {} + ``` +`` is the parent uv workspace (the directory holding `pyproject.toml`); `` is the generator project directory under it. Using `uv run --project` ensures the syntax check runs against the same Python interpreter the project's uv env was set up with — bare `python` may pick up a different version and yield false positives. + Confirm all files still compile after marker removal. --- @@ -198,3 +200,21 @@ Report back to the calling command: - **Summary**: What was verified and how - **Issues found and fixed**: Any code fixes applied during verification - **Human interventions**: Any points where human action was required + +--- + +## OpenClaw addendum — human-intervention bridge + +Under OpenClaw the verify-fix loop runs inside the long-lived coding-agent worker, not in this agent's own context. The worker has no direct user-facing channel — only the host orchestrator does. When `monitor.sh` exits with code 2, the worker MUST follow the bridge protocol below: + +1. Read `/.bridgic/verify/human_request.json` to obtain the prompt text. +2. Write that prompt verbatim to `/.amphiloop/HUMAN_PROMPT.txt`. +3. Print exactly this line to stdout: `### AMPHI-HUMAN-REQUEST ###` +4. Poll `/.amphiloop/HUMAN_REPLY.txt` every 2 seconds. When it appears, read its contents. +5. Write `{"response": ""}` to `/.bridgic/verify/human_response.json`. +6. Delete both `.amphiloop/HUMAN_REPLY.txt` and `.amphiloop/HUMAN_PROMPT.txt`. +7. Re-invoke `monitor.sh` with the same arguments — it auto-resumes the still-running PID. 
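The worker-side steps above can be sketched in Python. This is a minimal sketch, not the worker's actual code: the function name `relay_human_request`, its parameters, and the `prompt` key read from `human_request.json` are assumptions for illustration. Only the file names, the sentinel line, the `response` key, and the 2-second poll interval come from the protocol text. Both directories are passed in by the caller, so the sketch makes no claim about where they live.

```python
import json
import time
from pathlib import Path

SENTINEL = "### AMPHI-HUMAN-REQUEST ###"


def relay_human_request(verify_dir, amphiloop_dir, poll_seconds=2.0, prompt_key="prompt"):
    """Relay one human request from the worker to the host and return the reply."""
    verify_dir, amphiloop_dir = Path(verify_dir), Path(amphiloop_dir)
    prompt_file = amphiloop_dir / "HUMAN_PROMPT.txt"
    reply_file = amphiloop_dir / "HUMAN_REPLY.txt"

    # Step 1: read the prompt text (the JSON key name is an assumption).
    request = json.loads((verify_dir / "human_request.json").read_text(encoding="utf-8"))
    prompt = request[prompt_key]

    # Step 2: hand the prompt to the host verbatim.
    amphiloop_dir.mkdir(parents=True, exist_ok=True)
    prompt_file.write_text(prompt, encoding="utf-8")

    # Step 3: print the sentinel so the host knows a request is pending.
    print(SENTINEL, flush=True)

    # Step 4: poll until the host drops the reply file.
    while not reply_file.exists():
        time.sleep(poll_seconds)
    reply = reply_file.read_text(encoding="utf-8").strip()

    # Step 5: forward the reply to the running program.
    (verify_dir / "human_response.json").write_text(
        json.dumps({"response": reply}), encoding="utf-8"
    )

    # Step 6: remove both flag files so the next cycle starts clean.
    reply_file.unlink()
    prompt_file.unlink()

    # Step 7 (re-invoking monitor.sh with the same arguments) is left to the caller.
    return reply
```

Stripping surrounding whitespace from the reply is a choice made for this sketch. A production worker would probably also guard against reading a reply file that the host has not finished writing.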
+ +Under Claude Code (when this agent runs as a subagent, not as worker code), the bridge is unnecessary — the agent uses its own tooling to ask the user and writes `human_response.json` directly. + +The bridge protocol is re-entrant: a single Phase 5 run may hit multiple HUMAN_REQUEST cycles (e.g. login then CAPTCHA). Same two filenames each time; the worker deletes them after consuming. diff --git a/commands/build.md b/commands/build.md index 8b8b85b..16980fe 100644 --- a/commands/build.md +++ b/commands/build.md @@ -39,11 +39,6 @@ Anything else in `$ARGUMENTS` (extra tokens, multiple flags) → stop and ask th > **build_context_path** — always `{PROJECT_ROOT}/.bridgic/build_context.md`. > **domain_context_path** — `{PLUGIN_ROOT}/domain-context//.md` when `SELECTED_DOMAIN` is resolved, otherwise the literal `none` (generic flow). `` is `explore.md` for Phase 3, `code.md` for Phase 4, `verify.md` for Phase 5. -After Phases 3 and 4, refresh `build_context.md` in two places: - -1. **Outputs** — replace the matching `(filled by Phase N)` placeholder with the phase's primary output path. -2. **env_ready** — read `{PROJECT_ROOT}/pyproject.toml` and update the dump under `--- pyproject.toml ---` inside the `env_ready:` block with its current contents. - --- ## Phase 1: Initialize Task From 585ed08c6137503dd9888f81ae30ea50c33268aa Mon Sep 17 00:00:00 2001 From: NiceCode666 Date: Fri, 1 May 2026 12:14:37 +0800 Subject: [PATCH 3/4] docs: enhance human interaction protocol and output layout clarity Updated the human interaction protocol across multiple agents to ensure consistent user prompts and responses during the build pipeline. Added detailed instructions for handling human intervention scenarios, emphasizing the importance of explicit user engagement at every phase transition. Revised output layout specifications in amphibious-code.md and related documents to clarify the directory structure and enforce mandatory placement of generated files within the project directory. 
Introduced anti-patterns to avoid, ensuring users adhere to best practices in project organization. These changes aim to streamline user experience and maintain clarity in project structure across both Claude Code and OpenClaw platforms. --- agents/amphibious-code.md | 22 +- agents/amphibious-config.md | 17 ++ agents/amphibious-explore.md | 12 +- agents/amphibious-verify.md | 28 ++- agents/human-interaction-protocol.md | 214 ++++++++++++++++++ commands/build.md | 15 ++ .../openclaw-skill/amphiloop-build/SKILL.md | 97 ++++++-- 7 files changed, 363 insertions(+), 42 deletions(-) create mode 100644 agents/human-interaction-protocol.md diff --git a/agents/amphibious-code.md b/agents/amphibious-code.md index d19eef0..a87bbef 100644 --- a/agents/amphibious-code.md +++ b/agents/amphibious-code.md @@ -37,7 +37,7 @@ Skill files (see Skill References below) and `## References` stay on-demand — ## Output Layout -The agent installs its runtime dependencies into PROJECT_ROOT's uv env (creating it if absent) and produces a code-only `/` subdirectory. The structure inside `/` may follow the pattern below: +The agent installs its runtime dependencies into PROJECT_ROOT's uv env (creating it if absent) and produces a code-only `/` subdirectory. 
The structure inside `/` **MUST match this layout exactly** — deviations break Phase 5 verify and downstream orchestration: ``` / @@ -45,14 +45,26 @@ The agent installs its runtime dependencies into PROJECT_ROOT's uv env (creating ├── uv.lock # resolution lockfile ├── .venv/ # uv-managed virtualenv ├── .env # only when llm_configured = yes -└── / # this agent's generator_project — code only +├── .bridgic/ # orchestrator workspace (build_context.md, explore/, verify/) — DO NOT write code here +└── / # this agent's generator_project — ALL generated code lives inside here ├── amphi.py # scaffold-created; this agent edits it ├── main.py # this agent creates: entry point (LLM init + agent.arun) ├── README.md # short, operational ├── log/ # runtime logs land here (configured in main.py) - └── result/ # task outputs land here + ├── result/ # task outputs land here + └── .py # any extra helpers/modules go here too — never at PROJECT_ROOT ``` +**`/` is the project's own directory, not a Python import package.** The entry points (`amphi.py`, `main.py`) and the support modules all live **inside** `/`. PROJECT_ROOT only carries uv metadata (`pyproject.toml`, `uv.lock`, `.venv/`, `.env`) and the orchestrator's `.bridgic/` workspace — never code. + +### Layout anti-patterns — never produce these + +- ❌ `amphi.py` / `main.py` placed at `/` (sibling of `pyproject.toml`) instead of inside `/`. The entry points must be reached as `//main.py`. +- ❌ Treating `/` as a Python import package (adding `__init__.py`, importing it from a sibling `main.py` at PROJECT_ROOT). `/` is the project root, not a package alongside other code. +- ❌ Writing project code (`*.py`, `log/`, `result/`) under `/.bridgic/`. That directory is the orchestrator's workspace and is exclusive to `build_context.md`, `explore/`, `verify/`. +- ❌ Creating `log/` or `result/` at PROJECT_ROOT instead of inside `/`. They must sit next to `main.py` so `Path(__file__).parent / "log"` resolves correctly. 
+- ❌ Splitting support modules (`config.py`, `tools.py`, helper files) out to PROJECT_ROOT. Any module imported by `amphi.py` or `main.py` must live inside `/`. + --- ## Phase 1: Initialize Project Skeleton @@ -73,11 +85,15 @@ bash "{PLUGIN_ROOT}/skills/bridgic-amphibious/scripts/install-deps.sh" \ ### 1.3 Scaffold `amphi.py` +`cd` into `/` **first**, then run the scaffolder. The cwd at the moment of `bridgic-amphibious create` is what determines where `amphi.py` lands — running it from `/` will drop the file at PROJECT_ROOT and violate the Output Layout. + ```bash cd "/" uv run bridgic-amphibious create --task "" ``` +After the command returns, verify with `ls "//amphi.py"` — if `amphi.py` is missing or sitting at `/amphi.py` instead, stop and move it before continuing. + ### 1.4 Create runtime directories ```bash diff --git a/agents/amphibious-config.md b/agents/amphibious-config.md index 27d2781..7021ad5 100644 --- a/agents/amphibious-config.md +++ b/agents/amphibious-config.md @@ -17,6 +17,8 @@ tools: ["AskUserQuestion", "Bash", "Read", "Write"] You are a build-pipeline configuration specialist. Your job is to interactively determine project-mode / LLM / domain-specific settings, run environment setup, and write the consolidated `build_context.md` that every later agent reads. +Every user-facing prompt in this document follows `{PLUGIN_ROOT}/agents/human-interaction-protocol.md`. Inside Claude Code you are running inline in `/build`'s thread (Tier 1 — use `AskUserQuestion`); inside OpenClaw the host follows this same methodology in Tier 2 (chat message + await textual reply). The question content below is identical across both; only the transport differs. + ## Input The calling command passes the inputs already established in Phase 1 of `/build`: @@ -85,6 +87,21 @@ If no `config.md` exists, skip this step and treat `domain_config` as empty. 
## Step 4: Environment Setup +### 4.0 Side-effect checkpoint (before running any setup script) + +Steps 1–3 only collected decisions; nothing on disk has been mutated yet beyond the `.env` skeleton (if Step 2 wrote one). Step 4.1 is the **first script that touches the user's toolchain** — it may install `uv` to PATH, create `pyproject.toml`, and otherwise alter PROJECT_ROOT in ways the user might want to see coming. + +Before invoking `setup-env.sh`, ask the user via the Human Interaction Protocol (Tier 1 in Claude Code, Tier 2 in OpenClaw — same content, different transport): + +> About to run environment setup against `{PROJECT_ROOT}`: +> - verify the `uv` toolchain is on PATH (auto-install if missing) +> - run `uv init --bare` if no `pyproject.toml` exists yet +> +> **1. Run setup now** — proceed with `setup-env.sh`. +> **2. Pause** — I want to inspect or change something first. + +On **1** continue to 4.1. On **2** wait for the user's follow-up, then re-prompt. + ### 4.1 uv toolchain + PROJECT_ROOT uv project ```bash diff --git a/agents/amphibious-explore.md b/agents/amphibious-explore.md index 4536879..2e98897 100644 --- a/agents/amphibious-explore.md +++ b/agents/amphibious-explore.md @@ -8,7 +8,7 @@ description: >- re-observed each time the plan is carried out). Produces a pseudocode operation sequence with inline stability annotations plus any key-artifact files capturing the observed states the plan references. -tools: ["Bash", "Read", "Grep", "Write", "Edit"] +tools: ["AskUserQuestion", "Bash", "Read", "Grep", "Write", "Edit"] --- # Amphibious Explore Agent @@ -94,10 +94,14 @@ To record loops and branches faithfully, you must **probe their boundaries and a Secondly, mark **human handoffs** — points where the task requires intervention that automation cannot resolve alone (authentication wall, CAPTCHA, destructive-confirm dialog, permissions you lack, ambiguous UI, unexpected error state). 
Record each as a `HUMAN:` step in the plan, describing what the human must do and the signal to resume. -When you encounter a handoff during exploration, you **MUST** request human: +When you encounter a handoff *during exploration itself* (e.g. the target site shows a login wall and you cannot probe further until the user logs in), you **MUST** ask the user following `{PLUGIN_ROOT}/agents/human-interaction-protocol.md`. Pick the highest tier your current runtime supports: -- **Request** specific human intervention. -- **Resume** exploration from the same point once the human confirms the obstacle is cleared. +- **Claude Code subagent (Tier 1).** This agent's `tools:` declares `AskUserQuestion`. Use it: ask one focused question that names exactly what the user must do (e.g. "Please log into in the open browser, then choose 1. `done` / 2. `cancel exploration`."). Wait for the structured reply before resuming. +- **OpenClaw host running this methodology directly (Tier 2).** No `AskUserQuestion` here, but you have the chat / message channel captured by the host (`notifyChannel` / `notifyTarget`). Send a clearly formatted chat message that begins with `[USER ACTION REQUIRED]`, states exactly what to do (open which URL, click what, paste which token), and tells the user how to reply (`reply "done" once login completes`, or `reply with the token`). Wait for the user's textual reply before resuming. +- **Subagent without `AskUserQuestion` and no chat channel (Tier 3 — fallback).** Stop work and return a structured "human input needed" status to the calling command — include the prompt text and the resume signal. The calling command runs in Tier 1 or Tier 2 and asks on your behalf, then re-dispatches you with the answer. +- **Forbidden anti-pattern (all tiers).** Do **not** fall back to `echo "please do X" + until [ -f /tmp/flag ]; do sleep 3; done`. The user sees only a silent "Running" indicator and has no idea what is being asked. 
This is the canonical violation called out in the protocol. + +Once the user confirms the obstacle is cleared, **resume** exploration from the same point. Each `HUMAN:` step in the Operation Sequence will map **one-to-one** to a `HumanCall` yield in the generated code (see `amphibious-code.md` Phase 2.3). The framework blocks at that yield until the user responds — you do not need to estimate how long the user will take, and you must **not** record a fallback `wait_for(time_seconds=...)` to "give the user time". diff --git a/agents/amphibious-verify.md b/agents/amphibious-verify.md index 0d24718..b64433c 100644 --- a/agents/amphibious-verify.md +++ b/agents/amphibious-verify.md @@ -13,6 +13,8 @@ tools: ["Bash", "Read", "Grep", "Glob", "Write", "Edit"] You are a verification specialist for bridgic-amphibious projects. Your job is to take an already-generated project, verify it runs correctly end-to-end, and return clean production code. +Verify is the **last construction-phase step** — the orchestrator's first instrumented test-run of the freshly frozen code, before it is handed off to the user for production runs. Every user-facing prompt in this document (relaying a runtime `HumanCall`, surfacing a fatal error, deciding whether to retry) follows `{PLUGIN_ROOT}/agents/human-interaction-protocol.md`. The runtime file-bridge in Phase 1.2 is a *transport* that delivers prompts from the running program up to the orchestrator; once the prompt arrives here, this agent (or its OpenClaw-side host counterpart) is responsible for the Tier 1 / Tier 2 ask the protocol mandates. 
+ ## Input The calling command passes exactly two absolute paths: @@ -151,7 +153,7 @@ bash {PLUGIN_ROOT}/scripts/run/monitor.sh {generator_project} [TIMEOUT] |------|---------|--------------| | **0** | Finished cleanly | Proceed to Phase 3 | | **1** | Finished with errors | Diagnose from stdout (last 50 log lines of `run.log`), fix code, re-run `monitor.sh` | -| **2** | Human intervention required | Read the prompt from stdout, ask the user, write the answer to the `human_response` path printed in stdout as `{"response": ""}`, re-run `monitor.sh` | +| **2** | Human intervention required | Read the prompt from stdout, **ask the user via the Human Interaction Protocol** (Tier 1 `AskUserQuestion` if available; otherwise Tier 2 / escalate per the protocol — never silent polling), write the answer to the `human_response` path printed in stdout as `{"response": ""}`, re-run `monitor.sh` | | **3** | Timeout | Report to user and investigate | The script calls `uv run python main.py`; the script returns only when an actionable event occurs. Re-invoke with the **same arguments** to resume — it auto-detects the existing PID after human intervention, or starts fresh after a terminal exit. The script owns every runtime artifact (`run.log`, `pid`, `human_request.json`, `human_response.json`) and prints the resolved absolute paths to stdout on every exit, so that the agent can interact with them to reason next steps or communicate with the user. @@ -203,18 +205,20 @@ Report back to the calling command: --- -## OpenClaw addendum — human-intervention bridge +## OpenClaw addendum — host-side human-intervention flow + +Under OpenClaw, **the host orchestrator is the one running `monitor.sh`** (per `extensions/openclaw-skill/amphiloop-build/SKILL.md` Step F.2) and is therefore the natural Tier 2 endpoint per the Human Interaction Protocol. The host has a chat / message channel to the user; the worker (coding-agent) does not. 
+ +When `monitor.sh` exits with code 2: -Under OpenClaw the verify-fix loop runs inside the long-lived coding-agent worker, not in this agent's own context. The worker has no direct user-facing channel — only the host orchestrator does. When `monitor.sh` exits with code 2, the worker MUST follow the bridge protocol below: +1. Host reads `/.bridgic/verify/human_request.json` to obtain the prompt text. +2. Host asks the user via **Tier 2** (free-text chat message): begin with a `[USER ACTION REQUIRED]` marker, paste the prompt text, state how to reply (`reply with the answer text, or "done" once you've completed the action`). +3. Host waits for the user's textual reply. +4. Host writes `{"response": ""}` to `/.bridgic/verify/human_response.json`. +5. Host re-invokes `monitor.sh` with the same arguments — it auto-resumes the still-running PID. -1. Read `/.bridgic/verify/human_request.json` to obtain the prompt text. -2. Write that prompt verbatim to `/.amphiloop/HUMAN_PROMPT.txt`. -3. Print exactly this line to stdout: `### AMPHI-HUMAN-REQUEST ###` -4. Poll `/.amphiloop/HUMAN_REPLY.txt` every 2 seconds. When it appears, read its contents. -5. Write `{"response": ""}` to `/.bridgic/verify/human_response.json`. -6. Delete both `.amphiloop/HUMAN_REPLY.txt` and `.amphiloop/HUMAN_PROMPT.txt`. -7. Re-invoke `monitor.sh` with the same arguments — it auto-resumes the still-running PID. +This is re-entrant: a single Phase 5 run may hit multiple cycles (login, then CAPTCHA, then a confirmation dialog). Repeat the loop each time exit code 2 reappears. -Under Claude Code (when this agent runs as a subagent, not as worker code), the bridge is unnecessary — the agent uses its own tooling to ask the user and writes `human_response.json` directly. +### Under Claude Code (no OpenClaw) -The bridge protocol is re-entrant: a single Phase 5 run may hit multiple HUMAN_REQUEST cycles (e.g. login then CAPTCHA). Same two filenames each time; the worker deletes them after consuming. 
+When this agent runs as a Claude Code subagent (not as worker code), the OpenClaw addendum does not apply — the agent uses `AskUserQuestion` (or escalates per the protocol's Tier 3) and writes `human_response.json` directly. diff --git a/agents/human-interaction-protocol.md b/agents/human-interaction-protocol.md new file mode 100644 index 0000000..379d0c4 --- /dev/null +++ b/agents/human-interaction-protocol.md @@ -0,0 +1,214 @@ +--- +name: human-interaction-protocol +description: >- + Shared methodology for any AmphiLoop orchestrator or agent that needs to + pause and ask the human user during the build pipeline (Phases 1–5 of + /build, equivalently Steps B–F of the OpenClaw amphiloop-build skill). + Defines a capability-tiered fallback (structured ask tool → free-text + channel → escalate to parent), the phase-checkpoint pattern that keeps the + user in control between major pipeline steps, and the anti-patterns + (notably silent Bash polling) that violate the contract. +--- + +# Human Interaction Protocol + +Every AmphiLoop pipeline shares one non-negotiable contract: + +> The user must always **see** what is being asked, must always **explicitly +> reply** before the run advances, and must always be able to **redirect** at +> phase boundaries. No silent waiting. No timeout-driven auto-continue. + +This document is the single source of truth. `commands/build.md`, every +`agents/amphibious-*.md`, and `extensions/openclaw-skill/amphiloop-build/SKILL.md` +all defer to the rules below. + +## Scope — what this protocol governs + +The protocol governs **orchestrator-driven, build-pipeline user interaction**: + +- Claude Code: the `/build` command and every agent it dispatches — + Phase 1 (Init) → Phase 2 (Config) → Phase 3 (Explore) → Phase 4 (Code) → + Phase 5 (Verify, including the first instrumented test-run of the freshly + generated code). +- OpenClaw: the `amphiloop-build` skill and its host orchestration — + Steps B → C → D → E0/E → F → G. 
+ +All five phases are construction-phase: the orchestrator is driving, the user +is a *co-designer*, and the user must be able to see/steer at every meaningful +moment. + +**Out of scope**: once the build finishes and the user takes the generated +project and runs `uv run python main.py` themselves (or wires it into their +own CI/cron/etc.), any human interaction the program performs at that point +is **runtime business logic** owned by the bridgic framework's default +`HumanCall` mechanism. The protocol does not govern that channel. + +### Note on Phase 5's runtime file-bridge + +Phase 5 (verify) injects a `human_input` override and runs the program under +`monitor.sh`. When the program yields a `HumanCall` mid-execution, a file +bridge transports the prompt from the running program back up to the +orchestrator. **That bridge is a transport mechanism, not a protocol tier.** +Once the prompt arrives at the orchestrator (the verify agent in Claude Code, +or the OpenClaw host), the orchestrator is now in Tier 1 or Tier 2 and asks +the user using the rules below — the bridge does not absolve the orchestrator +of applying the protocol. + +## Capability tiers — pick the first that applies in your runtime + +### Tier 1 — Structured ask tool available + +If the runtime exposes a structured-question tool (Claude Code's +`AskUserQuestion`, or any platform equivalent that pops a labeled-options UI), +**use it directly**. Phrase the question with explicit numbered options. + +Do **not** also emit the same question as plain chat text alongside the tool +call — the question is sent once, through the tool. + +**Option-construction rules** (apply to every Tier 1 ask): + +- **Header is hard-capped — keep it ≤12 characters.** Claude Code's + `AskUserQuestion` UI truncates long headers and the truncation tail can + surface as garbled output (e.g. `Phase 1→2876…`). Treat the header as a + tab label, not a sentence: `Phase 1→2`, `Continue?`, `LLM mode`, + `Domain?`. 
If you can't fit the meaning in 12 characters, the meaning + belongs in the question body, not the header. +- **Don't duplicate the always-available free-text channel as an explicit + option.** `AskUserQuestion` (and most equivalents) already render a + permanent free-text input row beneath the structured options ("Chat about + this" in Claude Code). Adding a structured option whose action is "type + something" / "free input" / "describe in chat" is pure noise — that + channel is open by default. Reserve structured options for the *distinct + branches the orchestrator must commit to*. +- **Use however many options the decision actually has — don't pad.** If + there are only two real branches, ship two options. If you find yourself + adding a third option just to reach a round number and its content is + vague ("Type something", "Other", "Anything else"), delete it. A + meaningful third option looks like an escape hatch (`Cancel build`, + `Skip this check`, `Abort and revisit later`), not a filler. +- **Each option's description must add information beyond its label** — + what concretely happens next, what the user should reply, what side + effect kicks in. Do not paraphrase the label in slightly different words; + if you can delete the description without losing meaning, you wrote it + wrong. +- **Option descriptions must NOT re-state the pre-question summary.** The + summary above the AskUserQuestion already established context (what just + finished, what's coming, side effects). The option description's job is + to say what is **unique** to choosing *this* branch — typically one short + clause: `→ runs setup-env.sh now`, `→ stays on the menu`, `→ aborts the + build`. If two branches' descriptions both rehearse the same upcoming + pipeline outline, you wrote the summary into the options. Delete the + duplication; trust the summary. 
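The option-construction rules above can be checked mechanically before an ask is sent. The following is a minimal sketch, not part of AmphiLoop or of Claude Code's `AskUserQuestion` API: `Option`, `lint_tier1_ask`, and the `FILLER_LABELS` set are hypothetical names introduced here for illustration, and the "description repeats the label" test is only a crude stand-in for the real rule.

```python
from dataclasses import dataclass

MAX_HEADER_CHARS = 12  # header cap stated in the rules above

# Illustrative labels that only point at the always-present free-text row.
FILLER_LABELS = {"type something", "other", "anything else", "free input"}


@dataclass
class Option:
    label: str
    description: str = ""  # an empty description is allowed by the rules


def lint_tier1_ask(header, options):
    """Return a list of rule violations; an empty list means the ask passes."""
    problems = []
    if len(header) > MAX_HEADER_CHARS:
        problems.append(f"header is {len(header)} chars, cap is {MAX_HEADER_CHARS}")
    for opt in options:
        label = opt.label.strip().lower()
        desc = opt.description.strip().lower()
        if label in FILLER_LABELS:
            problems.append(f"option {opt.label!r} duplicates the free-text channel")
        if desc and desc == label:
            # Crude check: the description only repeats the label.
            problems.append(f"option {opt.label!r} has a description that adds nothing")
    return problems
```

A checkpoint with the header `Phase 1→2` and two options whose descriptions name a concrete next action passes; a 24-character header with a single `Other` option fails on all three checks.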
+ +### Tier 2 — Free-text reply channel available + +The runtime can send the user a normal chat / message and receive a free-text +reply (e.g. an OpenClaw host conversation, a chat surface that holds a +`notifyChannel` / `notifyTarget` route). Send a clearly formatted message and +**wait for the user's explicit textual reply** before continuing. + +The message MUST: + +- Begin with a visible marker — e.g. `[USER ACTION REQUIRED]` or `[CHECKPOINT]`. +- State concretely what the user must do or decide. No paraphrase. +- State exactly how to reply — what word, what file, what click. Examples: + - "Reply `yes` to continue, or describe what you'd like changed." + - "Once you finish login in the open browser, reply `done`." +- Stay terse. One screenful max. + +### Tier 3 — No direct user channel → escalate to parent + +You are running where the user cannot be reached directly (typical case: a +Claude Code dispatchable subagent whose `tools:` list omits `AskUserQuestion`). +In this tier you **MUST** escalate; you **MUST NOT** poll a signal file in +silence and call that "asking the user". + +Stop work and return a structured "human input needed" status to the calling +command — include the prompt text, the resume signal the parent should hand +back, and any context the parent needs to ask coherently. The parent runs in a +higher tier (1 or 2) and asks on your behalf, then re-dispatches you with the +answer. + +If the agent's job genuinely requires interactive user input mid-task and +escalation is impractical, the cleaner fix is to add `AskUserQuestion` (or +the platform equivalent) to the agent's `tools:` list so it operates in Tier +1 — not to invent a polling workaround. + +## Phase checkpoint pattern + +At every checkpoint the orchestrator must: + +1. Send a short pre-question summary that combines (a) what just finished + (artifacts written, decisions made) and (b) what the next phase will do + plus any side effect worth a veto. 
**Total length cap: 3 visible lines + maximum across (a)+(b) combined** (not 3 lines per item). If you can't + fit it in 3 lines, you're describing too much — the user can read + `build_context.md` if they want detail. +2. The list of "things worth flagging as side effects" — files written, + env mutated, scripts run, processes spawned, real browsers opened, real + money / API quota spent, the generated program executed for the first + time — is a **selection menu, not an enumeration template**. Pick the + **one or two** items the user is most likely to want to veto on this + transition; ignore the rest. Do not produce a flowing prose paragraph + that lists every applicable category. +3. Ask "Continue to ?" via Tier 1 or Tier 2. The pre-question + summary lives **above** the question, not duplicated inside option + descriptions (see Tier 1 option-construction rules). +4. Wait for an explicit affirmative reply (`yes`, `y`, `go`, `continue`) + before advancing. Anything else (silence, `wait`, `let me look`, a + counter-question) means **do not advance** — answer the user's intervention + first and re-prompt the checkpoint when ready. + +Checkpoints are cheap when terse — one short summary plus a one-tap question. +Their value is the *option* to redirect, not the friction. A checkpoint that +overflows the visible area defeats its own purpose: the user cannot see the +question they are being asked. + +### Length self-check (run mentally before sending) + +- Pre-question summary: ≤3 lines? If no → trim. +- AskUserQuestion header: ≤12 chars? If no → trim. +- Each option description: 1 short clause that is **not** in the summary? + If no → rewrite or delete. +- Total surface (summary + question + 2–3 options): does it fit one screen + without scrolling? If no → cut the summary first, then the option + descriptions; the question text itself stays. 
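The checkpoint pattern and the length self-check can be sketched as three small helpers. This is an illustration under stated assumptions, not code from the repository: the function names are hypothetical, "visible lines" is approximated by counting non-empty newline-separated lines (wrapping is ignored), and a reply counts as affirmative only if it matches one of the four words listed in step 4 exactly.

```python
AFFIRMATIVE = {"yes", "y", "go", "continue"}  # replies named in step 4
MAX_SUMMARY_LINES = 3                          # cap across (a)+(b) combined


def may_advance(reply):
    """True only for an explicit affirmative reply; None models silence."""
    if reply is None:
        return False
    return reply.strip().lower() in AFFIRMATIVE


def summary_fits(summary):
    """Approximate the 3-line cap by counting non-empty lines."""
    lines = [ln for ln in summary.splitlines() if ln.strip()]
    return len(lines) <= MAX_SUMMARY_LINES


def checkpoint_message(summary, next_phase):
    """Build a Tier 2 style checkpoint: marker, summary, question, how to reply."""
    if not summary_fits(summary):
        raise ValueError("summary exceeds the 3-line cap; trim it first")
    return (
        "[CHECKPOINT]\n"
        f"{summary.strip()}\n"
        f"Continue to {next_phase}? "
        "Reply `yes` to continue, or describe what you'd like changed."
    )
```

Anything that is not an exact affirmative, including a counter-question or `yes, but ...`, is treated as an intervention, which matches the "do not advance" default of the pattern.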
+ +**Where to place checkpoints**: at the boundary between major pipeline phases, +and additionally before any single sub-step that has a meaningful side effect +the user might want to veto or adjust — running a setup script that mutates +the toolchain, writing `.env`, spawning a worker process, kicking off a real +run of the generated program, or starting a multi-attempt fix-and-retry loop. +Skip checkpoints inside tight inner loops or for purely informational reads. + +## Anti-patterns — never do these + +- ❌ `echo "Please do X" && until [ -f /tmp/flag ]; do sleep 3; done` — the + user only sees a quiet "Running" indicator. They have no idea what is being + asked or that they are the bottleneck. This is the canonical violation. +- ❌ Auto-continuing after a fixed silence. Silence means the user is busy or + hasn't seen the request — not consent. +- ❌ Burying the question in a long log dump. The user will scroll past it + and the run stalls. +- ❌ Asking via Tier 1/2 *and* echoing the same question to a Bash polling + loop. Pick one channel; the second is noise that masks the real one. +- ❌ Treating the Phase 5 file-bridge as if it were the user channel. The + bridge ends at the orchestrator; the orchestrator still owes the user a + Tier 1/2 ask. +- ❌ Padding a Tier 1 ask with a "Type something" / "Other / free input" + structured option. The free-text input row is always rendered; an + explicit option pointing to it is redundant and signals to the user that + the structured options are not load-bearing. +- ❌ Writing an option description that just rephrases its label. Either + add concrete information (what happens next, how to reply) or drop the + description. + +## Quick decision flowchart + +``` +Need to ask the user something? +├── Have AskUserQuestion (or equivalent)? → Tier 1: ask directly. +├── Have a chat / message channel? → Tier 2: send + await reply. +└── Neither (subagent boundary, no user channel) + → Tier 3: escalate to parent. 
+``` diff --git a/commands/build.md b/commands/build.md index 16980fe..a20b777 100644 --- a/commands/build.md +++ b/commands/build.md @@ -39,6 +39,21 @@ Anything else in `$ARGUMENTS` (extra tokens, multiple flags) → stop and ask th > **build_context_path** — always `{PROJECT_ROOT}/.bridgic/build_context.md`. > **domain_context_path** — `{PLUGIN_ROOT}/domain-context//.md` when `SELECTED_DOMAIN` is resolved, otherwise the literal `none` (generic flow). `` is `explore.md` for Phase 3, `code.md` for Phase 4, `verify.md` for Phase 5. +## Human Interaction & Phase Checkpoints + +Every prompt to the user — including the gates between phases below — MUST follow `{PLUGIN_ROOT}/agents/human-interaction-protocol.md`. Read it once before starting Phase 1. Inside Claude Code you are in **Tier 1**: use `AskUserQuestion` for every checkpoint and every decision the user must make. + +**Mandatory checkpoint between every phase transition** (Phase 1→2, 2→3, 3→4, 4→5): + +1. Send a 1–3 line summary of what just finished (artifacts written, decisions made). +2. Name what the next phase is about to do, and call out side effects the user might want to veto: scripts that mutate the env (`setup-env.sh`), files about to be written (`.env`, `build_context.md`), real-environment probes, code generation, real runs of the generated program. +3. Ask via `AskUserQuestion`: "Continue to Phase N?" with options like `1. Yes, continue` / `2. Pause — I want to adjust something`. +4. Only advance on an explicit affirmative reply. On `2`, take whatever follow-up the user requests, then re-prompt the checkpoint. + +Do **not** chain phases automatically. The user must always have the option to interrupt at every boundary. + +Individual sub-steps inside a phase that have their own meaningful side effect (notably Phase 2 → `setup-env.sh`, which mutates the project's uv toolchain and writes `pyproject.toml`) follow the same rule — `amphibious-config.md` enumerates those mid-phase gates. 
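For reference, the quick decision flowchart in `agents/human-interaction-protocol.md` reduces to a first-match lookup. The sketch below assumes the runtime can report two booleans about its own capabilities; `Tier` and `pick_tier` are hypothetical names, not identifiers from the plugin.

```python
from enum import Enum


class Tier(Enum):
    STRUCTURED_ASK = 1  # a structured-question tool such as AskUserQuestion exists
    FREE_TEXT = 2       # a chat / message channel exists
    ESCALATE = 3        # no user channel: hand the question to the parent


def pick_tier(has_structured_ask, has_chat_channel):
    """Pick the first tier that applies, in the order the protocol lists them."""
    if has_structured_ask:
        return Tier.STRUCTURED_ASK
    if has_chat_channel:
        return Tier.FREE_TEXT
    return Tier.ESCALATE
```

Under this reading, `/build` inside Claude Code always resolves to `Tier.STRUCTURED_ASK`, while a subagent whose `tools:` list omits `AskUserQuestion` resolves to `Tier.ESCALATE`.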
+ --- ## Phase 1: Initialize Task diff --git a/extensions/openclaw-skill/amphiloop-build/SKILL.md b/extensions/openclaw-skill/amphiloop-build/SKILL.md index 15187fa..04470a4 100644 --- a/extensions/openclaw-skill/amphiloop-build/SKILL.md +++ b/extensions/openclaw-skill/amphiloop-build/SKILL.md @@ -28,7 +28,7 @@ Turn a task description into a runnable bridgic-amphibious Python project. - **Inputs**: - ``: free-form natural-language task description from the user - ``: working directory the project will live under (ask the user; create it if it does not exist) -- **Output**: `//{amphi.py, main.py, log/, result/}` +- **Output**: `//{amphi.py, main.py, log/, result/, .py …}` — every generated `.py` file lives **inside** `/`. `/` itself only carries uv metadata, `.bridgic/`, `.amphiloop/`, `.env`, `TASK.md`. See "Output layout — MANDATORY" inside the AGENT_BRIEF template (Step E0) for the full rule and anti-patterns. ## Path resolution @@ -38,6 +38,27 @@ This skill ships **inside** the AmphiLoop repository at `/extensions/openc Execute the steps in order. Never skip; never re-order. Throughout, **do not write or edit code yourself** — every code-touching action goes through `` (opened in Step E). +### Human interaction & step checkpoints (★ read first ★) + +Every prompt to the user — including the gates between steps below and the in-step side-effect gates — MUST follow `{baseDir}/../../agents/human-interaction-protocol.md`. Inside OpenClaw the host is **Tier 2**: there is no `AskUserQuestion` here, only the chat / message channel captured in Step B (`notifyChannel` / `notifyTarget` / etc.), so every gate is a clearly formatted chat message that waits for the user's explicit textual reply. + +**Mandatory step-transition gates** (always send a 1–3 line summary + a "Continue?" 
question, wait for `yes` / `y` / `continue` before advancing): + +| Boundary | Why gate here | +|---|---| +| Step C → Step D | Config decisions are recorded; about to start probing the target environment (may open browsers, hit external sites, mutate user data). | +| Step D → Step E0 | Exploration finished; about to spawn the long-lived `` session and burn LLM tokens for code generation. | +| Step E → Step F | Code is frozen; about to run the generated program for the first time under `monitor.sh` (real side effects on the target environment, real API calls). | +| Each fix attempt inside Step F | Before appending FIX-N and re-engaging the worker, give the user a chance to inspect the failure or stop the retry. | + +**Mandatory in-step gates** (single side-effect actions inside a step): + +- Step C → before `setup-env.sh` runs — surfaced by `amphibious-config.md` Step 4.0; honor it via Tier 2 since OpenClaw has no `AskUserQuestion`. +- Step D → any `HUMAN:` handoff during exploration (login wall, CAPTCHA, etc.) — ask via Tier 2; never echo + poll. +- Step F → on `monitor.sh` exit code 2 — relay the runtime prompt to the user via Tier 2 per the verify methodology's OpenClaw addendum. + +The user must always have the option to interrupt and redirect at every gate. Silence is **not** consent. + ### Step A. Pick the coding worker (must ask the user) Send the user exactly: @@ -61,10 +82,12 @@ This step does not write code; do it directly. ### Step C. Phase 2 — Config (host runs this directly) 1. `read` the file `{baseDir}/../../agents/amphibious-config.md` to load the Phase 2 methodology. -2. Following that methodology, read `/TASK.md`, decide pipeline mode and any domain context, and `write` the result to `/.bridgic/build_context.md`. +2. Following that methodology, read `/TASK.md`, decide pipeline mode and any domain context, and `write` the result to `/.bridgic/build_context.md`. 
The methodology's questions (project mode, LLM, domain config) and its mid-step gate before `setup-env.sh` (Step 4.0) all use **Tier 2 chat messages** here — translate every "ask via `AskUserQuestion`" instruction into a clearly formatted chat question and wait for the user's explicit textual reply. This produces a markdown decision record, not code. Do it directly. +**Step C → D gate** (before continuing): summarize the recorded decisions (mode, llm_configured, domain) in 1–3 lines and ask the user via chat: "Proceed to Phase 3 (Explore)? This phase will probe the target environment described in TASK.md — depending on the task it may open browsers, hit external sites, or read local files. Reply `yes` to continue, or describe what you want changed first." + ### Step D. Phase 3 — Explore (host runs this directly) 1. `read` the file `{baseDir}/../../agents/amphibious-explore.md` to load the Phase 3 methodology. @@ -72,8 +95,12 @@ This produces a markdown decision record, not code. Do it directly. Writing notes is not coding — do it directly. +**HUMAN handoff during exploration** (login wall, CAPTCHA, manual confirmation, providing a token, etc.): the methodology in `amphibious-explore.md` already enumerates the three tiers and the anti-pattern; in this OpenClaw context you are running the methodology as the host, so the **Tier 2 case** applies — use the chat channel captured in Step B (`notifyChannel` / `notifyTarget`). + **Exception**: if the exploration genuinely needs a probe script (Python / JS / shell) to be authored to make further observations possible, treat that probe-script authorship as the **first** code-writing action of the run and jump to Step E0/E now (use the probe-script as the first TODO). +**Step D → E0 gate** (before continuing): summarize what exploration found — operation sequence sketch, any HUMAN steps in the plan, artifacts captured. Ask the user via chat: "Exploration complete (`/.bridgic/exploration/exploration_report.md`). 
Proceed to Phase 4 (Code Generation)? This will spawn a long-lived `` session and burn LLM tokens to write the project. Reply `yes` to continue, or describe what you want adjusted in the plan first." + ### Step E0. Prepare the work template (★ v8 ★ host writes the brief and TODO list before any coding) Before opening any worker session, host must write two communication files into `/.amphiloop/`. These are the entire interface between host and worker for this run. @@ -115,13 +142,24 @@ Before opening any worker session, host must write two communication files into The orchestrator may append new `[ ]` items to TODOS.md later (e.g. fixes after verification). When you receive a "continue" instruction, re-open TODOS.md, find the new open items, and resume from STEP 3. - ## Output layout + ## Output layout — MANDATORY + + Final deliverable lives **inside** `//`. amphi.py / main.py / log/ / result/ and every support module MUST be inside `/`. Dropping them at `/` directly is a hard error — the orchestrator will reject the run. + + `//` layout: + amphi.py ← entry, scaffold-created here + main.py ← entry, you write here + log/ ← runtime logs + result/ ← task outputs + .py ← any extra helpers go here too, never at + + `/` only carries uv metadata (`pyproject.toml`, `uv.lock`, `.venv/`, `.env`), the AmphiLoop workspace (`.bridgic/`, `.amphiloop/`), and `TASK.md`. Never write code into `/.bridgic/` — that is the orchestrator's workspace. - Final deliverable goes under `//`: - amphi.py - main.py - log/ - result/ + Anti-patterns to avoid: + - ❌ `amphi.py` / `main.py` at `/` (sibling of `pyproject.toml`) + - ❌ Treating `/` as a Python import package (adding `__init__.py`, importing it from a sibling main.py at `/`) + - ❌ `log/` or `result/` at `/` instead of inside `/` + - ❌ Any `.py` file written under `/.bridgic/` ``` When writing this file, substitute real absolute paths for `{baseDir}/../..` and `` and `` (drop lines whose source files don't exist for the current run). 
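The anti-patterns in the "Output layout" block of the brief lend themselves to a mechanical check after the worker finishes. The sketch below is an assumption-laden illustration, not part of the skill: `layout_violations` is a hypothetical helper, and `work_dir` and `project_name` stand for the working directory and the project subdirectory that the brief refers to.

```python
from pathlib import Path


def layout_violations(work_dir, project_name):
    """List paths (relative to work_dir) that break the output-layout rule."""
    root = Path(work_dir)
    bad = list(root.glob("*.py"))  # code sitting next to pyproject.toml
    # log/ and result/ belong inside the project directory, not at the root.
    bad += [root / d for d in ("log", "result") if (root / d).is_dir()]
    bridgic = root / ".bridgic"
    if bridgic.is_dir():
        bad += list(bridgic.rglob("*.py"))  # code in the orchestrator workspace
    init_file = root / project_name / "__init__.py"
    if init_file.exists():
        bad.append(init_file)  # project directory treated as an import package
    return sorted(p.relative_to(root).as_posix() for p in bad)
```

An empty list means none of the listed anti-patterns were found; it does not prove the project itself is complete.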
@@ -131,14 +169,16 @@ Before opening any worker session, host must write two communication files into ```markdown # AmphiLoop build TODOs - - [ ] T1: Scaffold `/` via the bridgic-amphibious CLI (per skills/bridgic-amphibious/SKILL.md). Create empty log/ and result/ dirs. - - [ ] T2: In amphi.py, define the CognitiveContext for this task following build_context.md. - - [ ] T3: In amphi.py, implement on_workflow yielding ActionCalls that mirror the Operation Sequence in exploration_report.md. - - [ ] T4: In amphi.py, implement on_agent think_units for AMPHIFLOW fallback per the methodology. - - [ ] T5: Register task tools (FunctionToolSpec) for any domain-specific operations the workflow needs. - - [ ] T6: Implement helper functions for parsing VOLATILE refs from ctx.observation. - - [ ] T7: Write main.py with LLM init (per skills/bridgic-llms/SKILL.md), tools assembly, and the agent.arun(...) call. - - [ ] T8: Run `uv run main.py` once dry to confirm it boots without import or syntax errors. + - [ ] T1: Scaffold inside `/`. Run `mkdir -p / && cd / && uv run bridgic-amphibious create --task ""`. The `cd` is REQUIRED — running the CLI from `` drops `amphi.py` at the wrong level. After it returns, verify `//amphi.py` exists; if it landed at `/amphi.py` instead, move it inside `/` and fix. + - [ ] T2: Create empty `//log/` and `//result/` dirs (NOT at `/log/` or `/result/`). + - [ ] T3: In `/amphi.py`, define the CognitiveContext for this task following build_context.md. + - [ ] T4: In `/amphi.py`, implement on_workflow yielding ActionCalls that mirror the Operation Sequence in exploration_report.md. + - [ ] T5: In `/amphi.py`, implement on_agent think_units for AMPHIFLOW fallback per the methodology. + - [ ] T6: Register task tools (FunctionToolSpec) for any domain-specific operations the workflow needs. Inline in `/amphi.py` (or split into `/tools.py` per the methodology — never at `/`). 
+ - [ ] T7: Implement helper functions for parsing VOLATILE refs from ctx.observation. Same placement rule — inside `/`. + - [ ] T8: Write `/main.py` with LLM init (per skills/bridgic-llms/SKILL.md), tools assembly, and the agent.arun(...) call. + - [ ] T9: Run `cd && uv run python /main.py` once dry to confirm it boots without import or syntax errors. + - [ ] T10: Final layout check — `ls ` should show `pyproject.toml`, `uv.lock`, `.venv/`, `.env`, `.bridgic/`, `.amphiloop/`, `TASK.md`, `/` and NOTHING ELSE. Any `.py` file at `/` is a violation. ``` 3. Send a short progress note to the user: "Worker brief and TODO list written to `/.amphiloop/`. Opening coding-agent session next." @@ -165,27 +205,38 @@ This is the first code-writing action of the run (unless Step D opened the sessi 4. Monitor with `process action:log sessionId:` until the sentinel `### AMPHI-TASK-DONE ###` appears. Optionally `read` `/.amphiloop/TODOS.md` periodically to watch `[x]` count rise. -5. **Do NOT kill the session.** Continue to Step F. +5. **Do NOT kill the session.** + +6. **Step E → F gate** (before continuing): summarize the worker's output — list the files now under `//` (`amphi.py`, `main.py`, etc.), confirm `[x]` count on TODOS.md. Ask the user via chat: "Code generation complete. Proceed to Phase 5 (Verify)? This will run the generated program for the first time under `monitor.sh` — it will execute against the real target environment, may make real API calls, and may surface runtime `HumanCall` prompts you'll need to answer. Reply `yes` to continue, or `pause` to inspect the generated code first." Wait for the explicit affirmative reply, then continue to Step F. ### Step F. Phase 5 — Verify (host runs verify; bugs flow back via TODOS.md ★) 1. `read` the file `{baseDir}/../../agents/amphibious-verify.md` to load the Phase 5 methodology. 2. 
Run `{baseDir}/../../scripts/run/monitor.sh` against the generated project via `bash` (or follow whatever execution recipe the methodology prescribes for this run). Collect the output. -3. Decide: + + **If `monitor.sh` exits with code 2** (the running program hit a `HumanCall`), follow the verify methodology's **OpenClaw addendum** — host reads `/.bridgic/verify/human_request.json`, relays the prompt to the user via Tier 2 chat, writes the user's reply into `human_response.json`, and re-invokes `monitor.sh`. **Never** invent a polling loop here; the protocol forbids it. + +3. Decide based on the exit: - **Pass** — proceed to Step G. - - **Fail, root cause is in the generated code** (logic error, missing import, wrong API call, etc.): - - Send a short progress note to the user ("Phase 5 verify failed; appending FIX TODOs (attempt N/3) and asking worker to continue"). - - **Append** one or more FIX entries to `/.amphiloop/TODOS.md` (use `read` then `write` the full new content; the worker is sentinel-waiting and not touching the file right now, so there is no write conflict). Format each entry as: + - **Fail, root cause is in the generated code** (logic error, missing import, wrong API call, etc.) — apply a **fix-attempt gate** before re-engaging the worker: + - Send the user via chat: `[CHECKPOINT]` Phase 5 verify failed (attempt N/3). One-line root cause: ``. Proposed fix: ``. Reply `yes` to append FIX-N to TODOS.md and ask the worker to retry; reply `stop` to abort and inspect manually; or reply with edits to the proposed fix wording. + - Wait for the user's explicit reply. + - On `yes` (or an edited fix description): **append** one or more FIX entries to `/.amphiloop/TODOS.md` (use `read` then `write` the full new content; the worker is sentinel-waiting and not touching the file right now, so there is no write conflict). Format each entry as: ```markdown - [ ] FIX-N: : ``` Use a stable monotonic N across attempts (FIX-1, FIX-2, ...). 
- - Submit a one-line continue prompt to the **same** `` via `process action:submit sessionId: data:`. The prompt is literally: + - Submit a one-line continue prompt to the **same** `` via `process action:submit sessionId: data:`: > New FIX entries appended to `.amphiloop/TODOS.md`. Re-read TODOS.md and resume from the first open `[ ]` item. Same rules as before: tick each item to `[x]` as you finish, then print `### AMPHI-TASK-DONE ###` and wait. DO NOT exit. - Monitor with `process action:log` until the sentinel reappears. - Re-run verification (return to Step F.2). - - **Fail, root cause is NOT code** (missing env var, missing credential, network issue, missing input data) — fix it yourself with `bash` / `write`. Do not append a FIX TODO and do not submit to the worker. Re-run verification. + - On `stop` → proceed to Step G with `fail` status and the user-aborted reason. + - **Fail, root cause is NOT code** (missing env var, missing credential, network issue, missing input data): + - Send the user via chat: `[USER ACTION REQUIRED]` Phase 5 failed for a non-code reason: ``. Reply with the missing value (e.g. an env var assignment), or `cancel` to stop the run. + - Apply the user's instructions yourself with `bash` / `write` (do not append a FIX TODO and do not submit to the worker). + - Re-run verification. + 4. Cap fix attempts at 3. After 3 consecutive code-fix attempts that still fail, stop the loop and proceed to Step G with a `fail` status. ### Step G. Cleanup and report From 18c31f164e9d7858aed8a407b1ecf27fd95b733b Mon Sep 17 00:00:00 2001 From: NiceCode666 Date: Fri, 1 May 2026 19:19:43 +0800 Subject: [PATCH 4/4] docs: update SKILL.md for improved argument parsing and pipeline clarity Enhanced the SKILL.md documentation to clarify argument parsing and the overall pipeline structure for the AmphiLoop build process. 
Added detailed sections on input handling, domain selection, and mandatory step transitions to ensure users understand the flow and requirements during project setup. This update aims to streamline user interaction and reinforce best practices in project organization. --- .../openclaw-skill/amphiloop-build/SKILL.md | 195 +++++++++++++----- 1 file changed, 148 insertions(+), 47 deletions(-) diff --git a/extensions/openclaw-skill/amphiloop-build/SKILL.md b/extensions/openclaw-skill/amphiloop-build/SKILL.md index 04470a4..2d128af 100644 --- a/extensions/openclaw-skill/amphiloop-build/SKILL.md +++ b/extensions/openclaw-skill/amphiloop-build/SKILL.md @@ -14,52 +14,81 @@ metadata: Turn a task description into a runnable bridgic-amphibious Python project. -**Division of labor**: +## Architecture - **Host (you)** — the brain: reasoning, planning, verifying. You prepare a working directory with `.amphiloop/AGENT_BRIEF.md` (what the worker must read) + `.amphiloop/TODOS.md` (what the worker must do). When verify finds bugs, you append them as new TODO entries. - **`coding-agent` skill** — the hands: a worker (`claude`/`codex`/`opencode`/`pi`) reads the brief, reads the bridgic-* SKILL.md files it points to so the API is correct, then works through TODOS.md ticking items off as it goes. -**Communication channel** is the working directory, not the prompt. The prompt to the worker stays short (~200 chars: "read AGENT_BRIEF.md, read TODOS.md, work through them"). Methodology, API references, and bug reports all flow through files. This avoids context overflow, lets the host monitor progress by re-reading TODOS.md, and forces the worker to actually read the bridgic skill SKILL.md files instead of hoping its prompt mentioned them. +**Communication channel** is the working directory, not the prompt. The kickoff prompt stays short (~200 chars: "read AGENT_BRIEF.md, read TODOS.md, work through them"). Methodology, API references, and bug reports all flow through files. 
This avoids context overflow, lets the host monitor progress by re-reading TODOS.md, and forces the worker to actually read the bridgic skill SKILL.md files. **Single long-lived worker session** for the whole run (strictly sequential) so the worker carries context from initial generation into any follow-up fix. -## Inputs / Outputs +## Argument parsing -- **Inputs**: - - ``: free-form natural-language task description from the user - - ``: working directory the project will live under (ask the user; create it if it does not exist) -- **Output**: `//{amphi.py, main.py, log/, result/, .py …}` — every generated `.py` file lives **inside** `/`. `/` itself only carries uv metadata, `.bridgic/`, `.amphiloop/`, `.env`, `TASK.md`. See "Output layout — MANDATORY" inside the AGENT_BRIEF template (Step E0) for the full rule and anti-patterns. +`/amphiloop_build [--]` -## Path resolution +- **`--` present** (e.g. `--browser`) → set `SELECTED_DOMAIN = ` and skip Phase 1 auto-detection. Validate that `{baseDir}/../../domain-context//` exists. If it does not, list available domains and ask the user to pick one or rerun without a flag. +- **No flag** → leave `SELECTED_DOMAIN` unresolved; resolve it during Phase 1's auto-detection step. +- **Anything else** (extra tokens, multiple flags, free-form text) → stop and ask the user to clarify. Do not silently treat free-form text as TASK.md content — Phase 1 owns TASK.md construction. -This skill ships **inside** the AmphiLoop repository at `/extensions/openclaw-skill/amphiloop-build/`. The OpenClaw `{baseDir}` macro at runtime resolves to the directory containing this SKILL.md, so the AmphiLoop repo root is always `{baseDir}/../..`. The skill reads the methodology files under `{baseDir}/../../agents/` and helper scripts under `{baseDir}/../../scripts/run/` directly via that path — **do not ask the user for an AmphiLoop path**. +## Pipeline overview -## Mandatory flow +``` +A0. Parse arguments ── argument handling (see above) +A. 
Pick coding worker ── user picks claude / codex / opencode / pi +B. Prepare working directory ── confirm , capture notification route +B'. Initialize Task (Phase 1) ── seed TASK.md template, user fills in, validate, domain auto-detect +C. Configure & Setup (Phase 2) ── project mode, LLM config, env setup → build_context.md +D. Explore (Phase 3) ── probe target environment → .bridgic/explore/exploration_report.md +E0. Prepare work template ── write AGENT_BRIEF.md + TODOS.md +E. Generate Code (Phase 4) ── open coding-agent session, worker completes TODOs +F. Verify (Phase 5) ── run monitor.sh, fix-attempt loop via TODOS.md +G. Cleanup & report ── kill session, send summary +``` -Execute the steps in order. Never skip; never re-order. Throughout, **do not write or edit code yourself** — every code-touching action goes through `` (opened in Step E). +> **Path variables** — used throughout this document: +> +> | Variable | Resolves to | +> |---|---| +> | `{baseDir}` | Directory containing this SKILL.md | +> | `{baseDir}/../..` | AmphiLoop repository root (agents, skills, templates, scripts, domain-context) | +> | `` | User-confirmed working directory for the generated project (set in Step B) | +> | `build_context_path` | `/.bridgic/build_context.md` | +> | `domain_context_path` | `{baseDir}/../../domain-context//.md` when resolved; `none` otherwise | +> +> `build_context.md` is the single source of truth for Phase 3→5 — every later step reads it for context, and Phase 3 / Phase 4 each fill their `## Outputs` placeholder after completing. -### Human interaction & step checkpoints (★ read first ★) +This skill reads methodology files from `{baseDir}/../../agents/`, the task template from `{baseDir}/../../templates/build-task-template.md`, domain-context from `{baseDir}/../../domain-context//`, and helper scripts from `{baseDir}/../../scripts/run/` — **do not ask the user for an AmphiLoop path**. 
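The argument-parsing rules in this revision of SKILL.md can be sketched as a single function. This is an illustration only: `parse_build_args`, its status strings, and the `domain_root` parameter (standing for the `domain-context` directory under the repository root) are hypothetical, and the host is assumed to have already split the text after `/amphiloop_build` into tokens.

```python
from pathlib import Path


def parse_build_args(tokens, domain_root):
    """Return (status, value) for the tokens that follow the command name.

    status is one of "selected", "unresolved", "unknown_domain", "clarify".
    """
    if not tokens:
        return ("unresolved", None)  # Phase 1 auto-detection decides later
    flag = tokens[0]
    if len(tokens) != 1 or not flag.startswith("--") or len(flag) == 2:
        return ("clarify", None)  # extra tokens, several flags, or free text
    domain = flag[2:]
    if (Path(domain_root) / domain).is_dir():
        return ("selected", domain)
    # Unknown domain: return what is available so the user can pick one.
    available = sorted(p.name for p in Path(domain_root).iterdir() if p.is_dir())
    return ("unknown_domain", available)
```

On `"clarify"` and `"unknown_domain"` the host would stop and ask the user through the Tier 2 chat channel rather than guess.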
-Every prompt to the user — including the gates between steps below and the in-step side-effect gates — MUST follow `{baseDir}/../../agents/human-interaction-protocol.md`. Inside OpenClaw the host is **Tier 2**: there is no `AskUserQuestion` here, only the chat / message channel captured in Step B (`notifyChannel` / `notifyTarget` / etc.), so every gate is a clearly formatted chat message that waits for the user's explicit textual reply. +## Human interaction & step checkpoints -**Mandatory step-transition gates** (always send a 1–3 line summary + a "Continue?" question, wait for `yes` / `y` / `continue` before advancing): +Every prompt to the user — including the gates between steps and in-step side-effect gates — MUST follow `{baseDir}/../../agents/human-interaction-protocol.md`. Read it once before starting Step A0. The host operates at **Tier 2**: every gate is a clearly formatted chat message that waits for the user's explicit textual reply. + +**Mandatory step-transition gates** (send a 1–3 line summary + "Continue?" question; wait for `yes` / `y` / `continue` before advancing): | Boundary | Why gate here | |---|---| -| Step C → Step D | Config decisions are recorded; about to start probing the target environment (may open browsers, hit external sites, mutate user data). | -| Step D → Step E0 | Exploration finished; about to spawn the long-lived `` session and burn LLM tokens for code generation. | -| Step E → Step F | Code is frozen; about to run the generated program for the first time under `monitor.sh` (real side effects on the target environment, real API calls). | -| Each fix attempt inside Step F | Before appending FIX-N and re-engaging the worker, give the user a chance to inspect the failure or stop the retry. | +| Step B' → Step C | Phase 1 finished — TASK.md validated, `SELECTED_DOMAIN` resolved. About to enter Phase 2 which collects pipeline mode, LLM credentials, and runs `setup-env.sh`. | +| Step C → Step D | Config decisions recorded. 
About to probe the target environment (may open browsers, hit external sites, mutate user data). | +| Step D → Step E0 | Exploration finished. About to spawn the long-lived worker session and burn LLM tokens for code generation. | +| Step E → Step F | Code frozen. About to run the generated program for the first time under `monitor.sh` (real side effects, real API calls). | +| Each fix attempt in Step F | Before appending FIX-N and re-engaging the worker, give the user a chance to inspect the failure or stop. | -**Mandatory in-step gates** (single side-effect actions inside a step): +**Mandatory in-step gates**: -- Step C → before `setup-env.sh` runs — surfaced by `amphibious-config.md` Step 4.0; honor it via Tier 2 since OpenClaw has no `AskUserQuestion`. -- Step D → any `HUMAN:` handoff during exploration (login wall, CAPTCHA, etc.) — ask via Tier 2; never echo + poll. -- Step F → on `monitor.sh` exit code 2 — relay the runtime prompt to the user via Tier 2 per the verify methodology's OpenClaw addendum. +- Step C → before `setup-env.sh` runs (surfaced by `amphibious-config.md` Step 4.0). +- Step D → any `HUMAN:` handoff during exploration (login wall, CAPTCHA, etc.) — ask via chat; never echo + poll. +- Step F → on `monitor.sh` exit code 2 — relay the runtime prompt to the user per the verify methodology's OpenClaw addendum. The user must always have the option to interrupt and redirect at every gate. Silence is **not** consent. -### Step A. Pick the coding worker (must ask the user) +--- + +### Step A0. Parse arguments + +See [Argument parsing](#argument-parsing) above. After this step, `SELECTED_DOMAIN` is either a valid domain name or unresolved. + +### Step A. Pick the coding worker Send the user exactly: @@ -74,34 +103,96 @@ If the user replies with anything other than the four valid options, ask again r ### Step B. Prepare the working directory 1. Confirm `` with the user; offer a sensible default (e.g., a fresh `mktemp -d`) if they have not specified one. 
The AmphiLoop repo path does **not** need to be asked — it is `{baseDir}/../..` by construction. -2. Use the `write` tool to write `` verbatim into `/TASK.md`. -3. Capture the OpenClaw notification route of the current conversation: `notifyChannel`, `notifyTarget`, `notifyAccount`, `notifyReplyTo`, `notifyThreadId`. You will need them later for Step G. +2. Capture the OpenClaw notification route of the current conversation: `notifyChannel`, `notifyTarget`, `notifyAccount`, `notifyReplyTo`, `notifyThreadId`. You will need them later for Step G. + +This step does not write code and does not write `TASK.md`. Do it directly. + +--- + +### Step B'. Phase 1 — Initialize Task + +1. **Seed the template.** `read` `{baseDir}/../../templates/build-task-template.md`, then `write` its contents **verbatim** to `/TASK.md`. Do not modify, summarize, or pre-fill any section. + +2. **Tell the user to fill it in.** Send a chat message: + + > A task template has been created at `/TASK.md`. Please open it, fill in the four sections (`Task Description` / `Expected Output` / `Domain References` / `Notes`), save, and reply `done` to continue (or `cancel` to abort). + +3. **Wait for an explicit `done` reply.** Silence is **not** consent. Do not poll a flag file; do not auto-advance after a fixed wait. Any other reply (counter-question, "wait", silence) is handled before re-prompting. + +4. **Read TASK.md back** and parse the four sections: + - **Task Description** — goal of the project. + - **Expected Output** — what indicates success. + - **Domain References** — list of paths to domain reference files (may be empty). Each entry may be a SKILL.md, CLI help dump, SDK doc, style guide, or any other material that teaches the agents *how to act* or *what rules to follow*. + - **Notes** — optional additional constraints. + +5. **Validate.** + - `Task Description` must be non-empty. + - `Expected Output` must be non-empty. 
+ - For every `Domain References` entry that is not a comment / example / blank line: resolve relative paths against ``, use absolute paths as-is, and confirm the file exists on disk. **Any missing path is a hard validation error.** + - On any failure: send a chat message naming the specific field / path that failed, ask the user to fix `TASK.md` and reply `done` again. Loop until validation passes (or the user replies `cancel`). + +6. **Domain auto-detection** — execute **only** if `SELECTED_DOMAIN` is still unresolved after Step A0: + 1. List subdirectories under `{baseDir}/../../domain-context/`. Each subdirectory is a candidate domain. + 2. For each candidate, `read` its `intent.md` (the matching criteria for that domain). + 3. Compare `Task Description + Expected Output + Notes` against each candidate's `intent.md`. Pick the **single best match**, or `none` if no candidate has strong signals. + 4. **If a candidate matches**, present the decision: + + > Detected domain: **``**. Use the pre-distilled `` context for exploration, code generation, and verification? + > + > Reply `1` / `yes` — use `` context. + > Reply `2` / `no` — proceed with the generic (domain-agnostic) flow. + > Reply `3 ` — specify a different domain explicitly. + + On `1` set `SELECTED_DOMAIN = `. On `2` leave unresolved (generic flow). On `3 ` validate that `{baseDir}/../../domain-context//` exists and set `SELECTED_DOMAIN = `; otherwise re-prompt. + 5. **If no candidate matches**, do not ask — silently proceed with the generic flow (`SELECTED_DOMAIN` stays unresolved). + + After this step `SELECTED_DOMAIN` is either a valid domain name or unresolved (generic). + +7. **Step B' → Step C gate**: summarize in 1–3 lines what just landed (`TASK.md` validated, `SELECTED_DOMAIN = `, count of resolved Domain References) and ask: + + > Proceed to Phase 2 (Configure & Setup)? 
This phase will collect pipeline mode and LLM configuration, then run `setup-env.sh` (which modifies the uv toolchain and writes `pyproject.toml`). Reply `yes` to continue, or describe what you want adjusted first. + + Wait for the explicit affirmative before continuing. This step does not write code; do it directly. -### Step C. Phase 2 — Config (host runs this directly) +--- + +### Step C. Phase 2 — Configure & Setup + +Inputs from Step B' (already established): parsed `TASK.md` fields (`Task Description`, `Expected Output`, `Domain References` with resolved absolute paths, `Notes`) and `SELECTED_DOMAIN` (a valid domain name or unresolved/generic). **Step C does not re-decide the domain** — that is Phase 1's responsibility. 1. `read` the file `{baseDir}/../../agents/amphibious-config.md` to load the Phase 2 methodology. -2. Following that methodology, read `/TASK.md`, decide pipeline mode and any domain context, and `write` the result to `/.bridgic/build_context.md`. The methodology's questions (project mode, LLM, domain config) and its mid-step gate before `setup-env.sh` (Step 4.0) all use **Tier 2 chat messages** here — translate every "ask via `AskUserQuestion`" instruction into a clearly formatted chat question and wait for the user's explicit textual reply. +2. Following that methodology — and feeding it the pre-resolved inputs above — drive Project Mode selection (Workflow / Amphiflow), LLM Configuration (`check-dotenv.sh`), Domain-specific Configuration (only when `SELECTED_DOMAIN` is resolved and `{baseDir}/../../domain-context//config.md` exists), and Environment Setup (`setup-env.sh`); then `write` the consolidated decision record to `/.bridgic/build_context.md`. Present every question from the methodology as a clearly formatted chat message and wait for the user's explicit textual reply. This produces a markdown decision record, not code. Do it directly. 
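
Step C relies on the TASK.md fields having been parsed and validated in Step B'. A minimal sketch of those validation rules follows, assuming the four sections have already been extracted into plain strings and that comment and example lines were stripped by the caller (the template's exact syntax is not shown here). The function name, parameter names, and return shape are illustrative assumptions.

```python
from pathlib import Path


def validate_task_fields(
    task_description: str,
    expected_output: str,
    domain_references: list[str],
    workdir: Path,
) -> tuple[list[Path], list[str]]:
    """Apply the Step B' validation rules to already-parsed TASK.md fields.

    Returns (resolved_reference_paths, errors). An empty error list means
    validation passed. Hypothetical helper, not part of the skill.
    """
    errors: list[str] = []
    if not task_description.strip():
        errors.append("Task Description is empty")
    if not expected_output.strip():
        errors.append("Expected Output is empty")

    resolved: list[Path] = []
    for entry in domain_references:
        entry = entry.strip()
        if not entry:
            continue  # blank lines are not references
        path = Path(entry)
        # Relative paths resolve against the working directory;
        # absolute paths are used as-is.
        if not path.is_absolute():
            path = workdir / path
        if path.is_file():
            resolved.append(path)
        else:
            # Any missing path is a hard validation error.
            errors.append(f"Domain Reference not found: {entry}")
    return resolved, errors
```

Returning every error at once, rather than stopping at the first, lets the chat message name each failing field or path in a single round trip before the user replies `done` again.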
-**Step C → D gate** (before continuing): summarize the recorded decisions (mode, llm_configured, domain) in 1–3 lines and ask the user via chat: "Proceed to Phase 3 (Explore)? This phase will probe the target environment described in TASK.md — depending on the task it may open browsers, hit external sites, or read local files. Reply `yes` to continue, or describe what you want changed first." +If `setup-env.sh` exits non-zero, the methodology doc says to **stop the entire pipeline** — respect that and do not enter Step D. + +On successful completion, `/.bridgic/build_context.md` exists and is the only artifact later steps need to read for context. -### Step D. Phase 3 — Explore (host runs this directly) +**Step C → D gate**: summarize the recorded decisions (mode, llm_configured, domain) in 1–3 lines and ask: "Proceed to Phase 3 (Explore)? This phase will probe the target environment described in TASK.md — depending on the task it may open browsers, hit external sites, or read local files. Reply `yes` to continue, or describe what you want changed first." + +--- + +### Step D. Phase 3 — Explore 1. `read` the file `{baseDir}/../../agents/amphibious-explore.md` to load the Phase 3 methodology. -2. Following that methodology, use `bash` to observe the target environment (running existing tools, taking notes, capturing samples). `write` the consolidated observations to `/.bridgic/exploration/exploration_report.md`. +2. Following that methodology, use `bash` to observe the target environment (running existing tools, taking notes, capturing samples). `write` the consolidated observations to `/.bridgic/explore/exploration_report.md`. + +Do not start Phase 4 until exploration is complete — the report and artifact files under `/.bridgic/explore/` are the sole bridge between Phase 3 and Phase 4. After exploration finishes, fill `## Outputs → exploration_report` in `build_context.md`. Writing notes is not coding — do it directly. 
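
The gates in this skill advance only on an explicit affirmative reply. A minimal sketch of that rule is shown here, assuming replies arrive as plain strings and that `None` stands for "no reply yet". The helper name `is_explicit_yes` is hypothetical, and case-insensitive matching is an assumption; the accepted set (`yes` / `y` / `continue`) is taken from the step-transition gate description.

```python
from typing import Optional

# Accepted affirmatives, per the step-transition gate description.
AFFIRMATIVE_REPLIES = {"yes", "y", "continue"}


def is_explicit_yes(reply: Optional[str]) -> bool:
    """Return True only for an explicit affirmative reply.

    None models silence (no reply yet), which is never treated as consent.
    Anything else, including a counter-question or a partial "yes, but ...",
    must be handled by the host before re-prompting.
    """
    if reply is None:
        return False
    return reply.strip().lower() in AFFIRMATIVE_REPLIES
```

Matching the whole reply instead of searching for a substring keeps a message such as "yes, but change the mode first" from being read as approval.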
-**HUMAN handoff during exploration** (login wall, CAPTCHA, manual confirmation, providing a token, etc.): the methodology in `amphibious-explore.md` already enumerates the three tiers and the anti-pattern; in this OpenClaw context you are running the methodology as the host, so the **Tier 2 case** applies — use the chat channel captured in Step B (`notifyChannel` / `notifyTarget`). +**HUMAN handoff during exploration** (login wall, CAPTCHA, manual confirmation, providing a token, etc.): the methodology already enumerates the tiers and anti-patterns; in this context the **Tier 2 case** applies — use the chat channel captured in Step B. + +**Exception**: if the exploration genuinely needs a probe script to be authored, treat that as the **first** code-writing action of the run and jump to Step E0/E (use the probe-script as the first TODO). -**Exception**: if the exploration genuinely needs a probe script (Python / JS / shell) to be authored to make further observations possible, treat that probe-script authorship as the **first** code-writing action of the run and jump to Step E0/E now (use the probe-script as the first TODO). +**Step D → E0 gate**: summarize what exploration found — operation sequence sketch, any HUMAN steps in the plan, artifacts captured. Ask: "Exploration complete (`/.bridgic/explore/exploration_report.md`). Proceed to Phase 4 (Code Generation)? This will spawn a long-lived `` session and burn LLM tokens to write the project. Reply `yes` to continue, or describe what you want adjusted in the plan first." -**Step D → E0 gate** (before continuing): summarize what exploration found — operation sequence sketch, any HUMAN steps in the plan, artifacts captured. Ask the user via chat: "Exploration complete (`/.bridgic/exploration/exploration_report.md`). Proceed to Phase 4 (Code Generation)? This will spawn a long-lived `` session and burn LLM tokens to write the project. Reply `yes` to continue, or describe what you want adjusted in the plan first." 
+--- -### Step E0. Prepare the work template (★ v8 ★ host writes the brief and TODO list before any coding) +### Step E0. Prepare the work template Before opening any worker session, host must write two communication files into `/.amphiloop/`. These are the entire interface between host and worker for this run. @@ -129,7 +220,7 @@ Before opening any worker session, host must write two communication files into - /TASK.md - /.bridgic/build_context.md - - /.bridgic/exploration/exploration_report.md + - /.bridgic/explore/exploration_report.md ## STEP 3 — work through TODOS.md @@ -162,9 +253,9 @@ Before opening any worker session, host must write two communication files into - ❌ Any `.py` file written under `/.bridgic/` ``` - When writing this file, substitute real absolute paths for `{baseDir}/../..` and `` and `` (drop lines whose source files don't exist for the current run). + When writing this file, substitute real absolute paths for `{baseDir}/../..` and ``. For the `domain-context//code.md` line: if `SELECTED_DOMAIN` is unresolved (the generic flow), **delete that line entirely**; otherwise replace `` with the resolved domain name (and confirm the file exists — drop the line if it does not). Same drop-if-missing rule applies to the optional `bridgic-browser` line. -2. **Write `/.amphiloop/TODOS.md`** — the initial Phase 4 task list. Use the `write` tool. Derive 5–8 items by mapping the Phase 1–4 sections of `{baseDir}/../../agents/amphibious-code.md` into checkboxes. Tailor wording to the current task. A typical seed: +2. **Write `/.amphiloop/TODOS.md`** — the initial Phase 4 task list. Use the `write` tool. Derive 5–8 items by mapping the sections of `{baseDir}/../../agents/amphibious-code.md` into checkboxes. Tailor wording to the current task. A typical seed: ```markdown # AmphiLoop build TODOs @@ -183,7 +274,9 @@ Before opening any worker session, host must write two communication files into 3. 
Send a short progress note to the user: "Worker brief and TODO list written to `/.amphiloop/`. Opening coding-agent session next." -### Step E. Phase 4 — Code (★ open the long-lived coding-agent session with a SHORT prompt ★) +--- + +### Step E. Phase 4 — Generate Code This is the first code-writing action of the run (unless Step D opened the session for a probe). The goal: open one worker session, capture ``, and submit a tiny pointer prompt that hands the worker over to AGENT_BRIEF.md + TODOS.md. @@ -192,7 +285,7 @@ This is the first code-writing action of the run (unless Step D opened the sessi - `Workdir: ` (so the worker starts in the right place; `cd` into it via the spawn config) - `Mode: INTERACTIVE` — launch the worker in REPL/interactive mode, **not** a one-shot. Concretely: `claude` must be launched **without** `--print`; `codex` **without** `exec`; `pi` and `opencode` in their REPL form. PTY rules and exact spawn flags are coding-agent's responsibility — do not hand-roll bash here. - `Background: yes` (coding-agent's hard rule). - - `This is a long-lived orchestrated session.` Tell coding-agent: do **not** require the worker to self-notify the user via `openclaw message send` per task. The orchestrator (this skill) will summarize at Step G. This deviation from the standard Mandatory Pattern is sanctioned by coding-agent's own contract ("if you do not have a trustworthy notification route, say so and do not claim that completion will notify the user automatically"). + - `This is a long-lived orchestrated session.` Tell coding-agent: do **not** require the worker to self-notify the user via `openclaw message send` per task. The orchestrator (this skill) will summarize at Step G. - `Capture the OpenClaw process sessionId returned by bash background:true and report it back so the orchestrator can remember it as .` 2. Once you have ``, submit the **kickoff prompt** via `process action:submit sessionId: data:`. 
The prompt is short and is the SAME shape every time: @@ -207,21 +300,25 @@ This is the first code-writing action of the run (unless Step D opened the sessi 5. **Do NOT kill the session.** -6. **Step E → F gate** (before continuing): summarize the worker's output — list the files now under `//` (`amphi.py`, `main.py`, etc.), confirm `[x]` count on TODOS.md. Ask the user via chat: "Code generation complete. Proceed to Phase 5 (Verify)? This will run the generated program for the first time under `monitor.sh` — it will execute against the real target environment, may make real API calls, and may surface runtime `HumanCall` prompts you'll need to answer. Reply `yes` to continue, or `pause` to inspect the generated code first." Wait for the explicit affirmative reply, then continue to Step F. +6. After the worker completes (sentinel appears and all TODOS are `[x]`), fill `## Outputs → generator_project` in `build_context.md` with the path to `//`. + +7. **Step E → F gate**: summarize the worker's output — list the files now under `//` (`amphi.py`, `main.py`, etc.), confirm `[x]` count on TODOS.md. Ask: "Code generation complete. Proceed to Phase 5 (Verify)? This will run the generated program for the first time under `monitor.sh` — it will execute against the real target environment, may make real API calls, and may surface runtime `HumanCall` prompts you'll need to answer. Reply `yes` to continue, or `pause` to inspect the generated code first." Wait for the explicit affirmative reply. + +--- -### Step F. Phase 5 — Verify (host runs verify; bugs flow back via TODOS.md ★) +### Step F. Phase 5 — Verify 1. `read` the file `{baseDir}/../../agents/amphibious-verify.md` to load the Phase 5 methodology. 2. Run `{baseDir}/../../scripts/run/monitor.sh` against the generated project via `bash` (or follow whatever execution recipe the methodology prescribes for this run). Collect the output. 
- **If `monitor.sh` exits with code 2** (the running program hit a `HumanCall`), follow the verify methodology's **OpenClaw addendum** — host reads `/.bridgic/verify/human_request.json`, relays the prompt to the user via Tier 2 chat, writes the user's reply into `human_response.json`, and re-invokes `monitor.sh`. **Never** invent a polling loop here; the protocol forbids it. + **If `monitor.sh` exits with code 2** (the running program hit a `HumanCall`), follow the verify methodology's **OpenClaw addendum** — host reads `/.bridgic/verify/human_request.json`, relays the prompt to the user via chat, writes the user's reply into `human_response.json`, and re-invokes `monitor.sh`. **Never** invent a polling loop here; the protocol forbids it. 3. Decide based on the exit: - **Pass** — proceed to Step G. - **Fail, root cause is in the generated code** (logic error, missing import, wrong API call, etc.) — apply a **fix-attempt gate** before re-engaging the worker: - - Send the user via chat: `[CHECKPOINT]` Phase 5 verify failed (attempt N/3). One-line root cause: ``. Proposed fix: ``. Reply `yes` to append FIX-N to TODOS.md and ask the worker to retry; reply `stop` to abort and inspect manually; or reply with edits to the proposed fix wording. + - Send the user: `[CHECKPOINT]` Phase 5 verify failed (attempt N/3). One-line root cause: ``. Proposed fix: ``. Reply `yes` to append FIX-N to TODOS.md and ask the worker to retry; reply `stop` to abort and inspect manually; or reply with edits to the proposed fix wording. - Wait for the user's explicit reply. - - On `yes` (or an edited fix description): **append** one or more FIX entries to `/.amphiloop/TODOS.md` (use `read` then `write` the full new content; the worker is sentinel-waiting and not touching the file right now, so there is no write conflict). 
Format each entry as: + - On `yes` (or an edited fix description): **append** one or more FIX entries to `/.amphiloop/TODOS.md` (use `read` then `write` the full new content; the worker is sentinel-waiting and not touching the file right now). Format each entry as: ```markdown - [ ] FIX-N: : @@ -233,12 +330,14 @@ This is the first code-writing action of the run (unless Step D opened the sessi - Re-run verification (return to Step F.2). - On `stop` → proceed to Step G with `fail` status and the user-aborted reason. - **Fail, root cause is NOT code** (missing env var, missing credential, network issue, missing input data): - - Send the user via chat: `[USER ACTION REQUIRED]` Phase 5 failed for a non-code reason: ``. Reply with the missing value (e.g. an env var assignment), or `cancel` to stop the run. + - Send the user: `[USER ACTION REQUIRED]` Phase 5 failed for a non-code reason: ``. Reply with the missing value (e.g. an env var assignment), or `cancel` to stop the run. - Apply the user's instructions yourself with `bash` / `write` (do not append a FIX TODO and do not submit to the worker). - Re-run verification. 4. Cap fix attempts at 3. After 3 consecutive code-fix attempts that still fail, stop the loop and proceed to Step G with a `fail` status. +--- + ### Step G. Cleanup and report 1. Kill the long-lived worker session: `process action:kill sessionId:`. @@ -248,14 +347,16 @@ This is the first code-writing action of the run (unless Step D opened the sessi - Number of coding-agent turns used (1 for the Phase 4 prompt, plus N for fix attempts) - If `fail`: the last failure summary so the user knows what to investigate +--- + ## Common constraints - **Never write code yourself.** All code-writing — Phase 4 generation, Phase 5 fixes, Phase 3 probe scripts, anything else — must go through `process:submit` to `` and the TODO list. Do not edit `.py` / `.ts` / `.sh` files with the host's `write` or `edit` tools. 
(`/.amphiloop/AGENT_BRIEF.md` and `TODOS.md` are written by the host — those are protocol files, not code.) - **All worker direction flows through TODOS.md.** Methodology, API references, and bug reports go into `/.amphiloop/AGENT_BRIEF.md` and `/.amphiloop/TODOS.md`, not into the prompt. The kickoff prompt and continue prompt are deliberately tiny pointers to those files. - **One worker, one sessionId, for the whole run.** `` is chosen once in Step A; `` is opened once in Step E (or earlier in a Step D probe) and reused throughout. -- **Strictly sequential, no concurrent file writes.** The worker handles one prompt at a time. The host writes to TODOS.md only while the worker is sentinel-waiting; the worker writes to TODOS.md only while it is actively working. This is enforced by the sequential prompt/sentinel cycle, so there is no concurrent edit conflict on the file. +- **Strictly sequential, no concurrent file writes.** The worker handles one prompt at a time. The host writes to TODOS.md only while the worker is sentinel-waiting; the worker writes to TODOS.md only while it is actively working. This is enforced by the sequential prompt/sentinel cycle. - **Sentinel discipline.** Every prompt you submit ends with the requirement to print `### AMPHI-TASK-DONE ###` so you have a deterministic completion signal. If after a generous wait the sentinel has not appeared but the expected files exist and the worker output has been quiet, treat that as completion (sentinel missed) and proceed. -- **Verify the worker actually read the brief.** After the kickoff prompt, scan `process:log` for evidence the worker called its file-read tool on the bridgic-* SKILL.md files listed in AGENT_BRIEF.md. If it skipped them (jumped straight to coding), inject one corrective `process:submit`: "You skipped the brief. STOP and read `.amphiloop/AGENT_BRIEF.md` STEP 1 files now before any further code." This guards against the v6 failure mode of the worker writing wrong APIs. 
+- **Verify the worker actually read the brief.** After the kickoff prompt, scan `process:log` for evidence the worker called its file-read tool on the bridgic-* SKILL.md files listed in AGENT_BRIEF.md. If it skipped them (jumped straight to coding), inject one corrective `process:submit`: "You skipped the brief. STOP and read `.amphiloop/AGENT_BRIEF.md` STEP 1 files now before any further code." - **Do not re-implement coding-agent.** Do not write `claude --print '...'` / `codex exec '...'` style bash here — coding-agent's SKILL.md owns spawn details (PTY, background, flags). This skill only tells coding-agent **what** to launch and **how** to drive it via `process:submit`. - **Progress visibility.** Send a one-line progress note before each `process:submit` so the user can follow the run. - **Notification deviation.** Tell coding-agent up front this is a long-lived orchestrated session and the orchestrator will summarize at Step G. Do not have the worker self-notify per task.
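
The sentinel discipline above can be sketched as a small completion check. This is an illustration under stated assumptions: the function names are hypothetical, the host is assumed to already have the worker log as text, and the 300-second quiet threshold is an arbitrary stand-in for the "generous wait", which this document does not quantify.

```python
SENTINEL = "### AMPHI-TASK-DONE ###"


def sentinel_seen(log_text: str) -> bool:
    """True if some log line consists of exactly the sentinel.

    Matching a whole line (an assumption of this sketch) avoids a false
    positive when the submitted prompt, which mentions the sentinel, is
    echoed back into the log.
    """
    return any(line.strip() == SENTINEL for line in log_text.splitlines())


def worker_finished(
    log_text: str,
    expected_files_exist: bool,
    quiet_seconds: float,
    quiet_threshold: float = 300.0,  # assumed value for the "generous wait"
) -> bool:
    """Decide whether the worker's current turn is complete."""
    if sentinel_seen(log_text):
        return True
    # Fallback for a missed sentinel: the expected files exist and the
    # worker output has been quiet for long enough.
    return expected_files_exist and quiet_seconds >= quiet_threshold
```

The fallback branch requires both conditions, so a quiet worker that has produced no files is still treated as unfinished.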