Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
622 changes: 622 additions & 0 deletions docs/superpowers/plans/2026-06-11-ag-ui-demo-toolbar.md

Large diffs are not rendered by default.

117 changes: 117 additions & 0 deletions docs/superpowers/specs/2026-06-11-ag-ui-demo-toolbar-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
# examples/ag-ui — Canonical Demo Toolbar Parity

**Date:** 2026-06-11
**Status:** Design / spec
**Scope:** `examples/ag-ui/angular` (frontend port), `libs/ag-ui` (one small adapter feature). No python/graph changes expected (verify step included). Approach A from brainstorming: port the canonical demo-shell pattern, trimmed; examples stay standalone (duplicate, don't share).

---

## Goal

Replace the AG-UI example's static header ("AG-UI Chat / The Threadplane chat UI over the AG-UI transport") with the **canonical demo's toolbar**, at full functional parity where the transport allows:

- **Mode segmented control** — Embed / Popup / Sidebar (each a route rendering a different chat composition).
- **Model / Effort / Gen-UI / Theme selects** (`chat-select`), identical options to the canonical demo.
- **Dark/light toggle** — relocated into the toolbar's right side (canonical hosts it in the threads-sidenav footer, which this example doesn't have).
- **URL knobs + localStorage persistence** for all surviving controls.
- **Welcome-suggestion chips** in each mode's empty state (the canonical prompt list — every prompt works against this example's graph, which is a copy of chat's).

## Why this is now small

The canonical demo does NOT use LangGraph-specific config for the Model/Effort/Gen-UI knobs. It wraps `agent.submit` and merges `{model, reasoning_effort, gen_ui_mode}` into the **neutral contract's `input.state`** (`AgentSubmitInput.state`), and the graph reads those keys from **state** (`state.get("reasoning_effort")`, etc.). The AG-UI example's python graph is a copy of the same graph — it already reads these keys. The only missing plumbing is that the AG-UI adapter's `submit()` currently ignores `input.state`.

---

## Part 1 — Adapter: forward `input.state` (`libs/ag-ui`)

`libs/ag-ui/src/lib/to-agent.ts` `submit()`:

- When `input.state` is present, merge it into the AG-UI source agent's client state so it is carried on the run input (AG-UI `RunAgentInput.state` — the same channel the shared-state/json-render examples use server→client, used client→server here).
- Apply on **both** paths: the normal submit path and the resume path (`input.resume` + `input.state` may be combined per the contract).
- Also optimistically merge into the local `store.state` signal so `agent.state()` reflects what was sent (the server's next `STATE_SNAPSHOT` remains authoritative).

Unit tests beside the existing adapter specs: state forwarded on submit; state forwarded on resume; no state → unchanged behavior; local `state()` reflects the merge.

**Verify-first spike (before frontend work):** against the local uvicorn backend, send `input.state = { reasoning_effort: 'high' }` and confirm the graph observes it (e.g. via the run's behavior or a debug log). Expected to work because `ag-ui-langgraph` merges `RunAgentInput.state` into graph input. **Fallback if it doesn't:** keep the adapter API the same but carry the patch via `forwardedProps.state` and map it into graph input in `examples/ag-ui/python/src/server.py` — contained in the same PR.

## Part 2 — Frontend: `ag-ui-shell` (examples/ag-ui/angular)

Copy-and-trim the canonical `examples/chat/angular/src/app/shell/demo-shell.component.*` into `examples/ag-ui/angular/src/app/shell/ag-ui-shell.component.*` (house rule: examples are standalone; no cross-example imports):

**Toolbar (replaces the current header):**
- Segmented Mode control (Embed/Popup/Sidebar) — same markup/classes as canonical.
- `chat-select` fields: Model, Effort, Gen UI, Theme — same option lists as canonical (`modelOptions`, `effortOptions`, `genUiOptions`, `themeOptions`).
- Dark/light scheme toggle button (sun/moon SVG, same as canonical's) placed at the toolbar's right edge.

**Routing (new — the app currently has none):**
- Add the Angular router. Routes: `''` → redirect `/embed`; `/embed`, `/popup`, `/sidebar` each render a ported mode component (copied from `examples/chat/angular/src/app/modes/`), bound to `[agent]` and `[views]`.
- Port `welcome-suggestions.ts` + `WelcomeSuggestionsComponent` unchanged.
- `/` redirecting to `/embed` keeps the existing e2e helper (`openDemo(page, '/')`) and all 10 current specs working without changes.

**Agent wiring:**
- Keep `injectAgent()` + `a2uiBasicCatalog()`. Wrap `submit` exactly like the canonical shell: merge `{ model, reasoning_effort, gen_ui_mode }` from the toolbar signals into `input.state` on every send.
- The existing interrupt-panel region (`@if (agent.interrupt && agent.interrupt())` → `<chat-interrupt-panel>`) moves into the shell's main region, unchanged.

**Persistence:**
- Port the canonical localStorage persistence service and URL-knob sync, trimmed to the surviving keys: `mode` (via route), `model`, `effort`, `genui`, `theme`, color scheme. Same precedence as canonical (URL > stored > default; defaults omitted from the URL).
- Theme reflection: set `data-theme` / `data-threadplane-chat-theme` on `<html>` exactly as canonical does (including the default-light/dark auto-sync behavior).

**Trimmed out (explicitly NOT ported):** threads sidenav + scrim + hamburger, history search palette, projects, new-chat, archived threads, subagents region (the AG-UI adapter exposes no `subagents` signal), chat-debug, thread-id URL/routing.

## Part 3 — Out of scope

- No thread CRUD / multi-conversation anything (transport doesn't offer it; single conversation remains a deliberate property of this example).
- No python/graph changes (unless the Part-1 fallback triggers, which adds only a small `server.py` mapping).
- No website/homepage changes; the deployed demo updates via the existing `ag-ui demo → Vercel` CI path on merge.

## Part 4 — Forcing function: AG-UI capability verification matrix

This effort doubles as an **audit of the `@threadplane/ag-ui` adapter**: once the toolbar exposes the canonical demo's full surface, every capability is smoke-tested end-to-end over AG-UI via **Chrome MCP against the local servers** (real backend + real OpenAI key, like the approval verification on 2026-06-11). Each row gets a verdict; each gap becomes its own follow-up spec/plan.

| # | Capability | How to smoke it | Expected over AG-UI |
|---|------------|-----------------|---------------------|
| 1 | Streaming + markdown | "Tell me about coral reefs" | works (e2e-covered) |
| 2 | Citations | signals search+cite prompt | works (verified in recut) |
| 3 | Interrupt — Accept | approval prompt → Accept | works (verified 2026-06-11) |
| 4 | Interrupt — Edit / Respond / Ignore | same prompt, other three actions | **unverified** — exercise each resume payload |
| 5 | Model select | pick gpt-5-nano → send | run uses chosen model (`state.model`) |
| 6 | Effort select + reasoning display | Effort=high + puzzle prompt | reasoning honored; does `chat-reasoning` render over AG-UI (THINKING events)? **unverified** |
| 7 | Gen-UI: a2ui | feedback form / product card | works (verified) |
| 8 | Gen-UI: json-render | switch select → form prompt | **unverified** — tool-result rendering should be transport-neutral |
| 9 | A2UI theme presets + dark/light | theme select + toggle | frontend; confirm surface theming applies |
| 10 | Research-subagent prompt | suggestions chip | run completes; **no subagent card** (adapter exposes no `subagents`) — known gap, document UX |
| 11 | Stop mid-stream | send long prompt → stop | `abortRun` cleanly idles |
| 12 | Regenerate | regenerate icon on an answer | replace semantics work |
| 13 | Error recovery | kill backend mid-run | alert + next send recovers (e2e-covered) |

**Gap protocol — gaps are IN SCOPE.** This is a forcing function: the toolbar exposes the full canonical surface precisely so the AG-UI library is forced to support it. When the audit finds a gap:

- (a) **Small adapter/example bug** → fix immediately with a test, same PR.
- (b) **Capability gap** (e.g. reasoning/THINKING events not reduced, json-render bridge broken, subagent runs invisible) → root-cause it, record it in the working **findings report** (`docs/superpowers/specs/2026-06-11-ag-ui-capability-findings.md`), then **design and implement the fix as a phase of this campaign** — its own mini spec/plan + PR (library change in `libs/ag-ui`, chat-lib wiring, and/or backend mapping as needed), landed before the campaign is called done. The audit re-runs the matrix row to confirm closure.
- (c) The only deferral allowed: a gap whose fix requires changes **outside this repo** (the AG-UI protocol itself or the upstream `ag-ui-langgraph` package). Those get the findings-report treatment plus an explicit decision with the user on whether to work around (e.g. via CUSTOM events) — and a workaround, if chosen, is implemented in scope.

Controls in the demo stay visible and honest at every intermediate state.

**Deliverables added by this part:** the findings report (as a running log, ending with every row green or explicitly category-c), plus the gap-closure PRs themselves.

## Risks / verify items

1. **State→graph plumbing** (Part 1 spike). Fallback defined above.
2. **`gen_ui_mode=json-render` over AG-UI:** matrix row 8; if broken → findings report + follow-up plan (select stays visible; gap noted in the example README).
3. **E2E stability:** the `/`→`/embed` redirect must keep all 10 existing specs green unchanged; mode components and suggestions render against aimock fixtures exactly as in examples/chat.

## Testing

- **Adapter:** unit tests listed in Part 1 (`nx test ag-ui`).
- **Example e2e:** existing 10 specs green unchanged; new specs: (a) mode switch — `/popup` and `/sidebar` render their compositions (toolbar click updates route + composition mounts); (b) toolbar state — pick `Effort = high`, send a fixture prompt, assert the run proceeds (and, if cheaply assertable via aimock request capture, that the LLM request reflects the knob); (c) URL knob — load `/embed?effort=high` and assert the select reflects it.
- **Local manual verification** (Chrome, like today): toolbar renders; modes switch; selects persist across reload; dark/light toggles; approval + a2ui still work.
- **Lint/build:** `nx lint ag-ui`, `nx test ag-ui`, `nx lint examples-ag-ui-angular`, `nx build examples-ag-ui-angular`.

## Decomposition (for the plan)

The campaign is phased; it is done only when the matrix is green (or explicitly category-c with a user decision):

1. **Phase 1 — toolbar parity PR:** spike + adapter `input.state` forwarding (+tests); shell scaffold (routing, modes, suggestions); toolbar selects + submit wrapper + persistence + theme/scheme; e2e (10 green + 3 new). PR, merge on green.
2. **Phase 2 — capability audit:** Chrome MCP smoke of the full Part-4 matrix against local servers; findings report committed with verdict + root cause per row.
3. **Phase 3..N — gap closure:** one phase per gap, in priority order agreed with the user — mini spec/plan + implementation PR each (adapter / chat wiring / backend mapping), then re-run the matrix row to confirm. Category-(a) bugs are fixed wherever found without ceremony.
4. **Wrap:** verify the deployed demo; final matrix re-run noted in the findings report.
1 change: 1 addition & 0 deletions examples/ag-ui/angular/e2e/playwright.config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ export default defineConfig({
fullyParallel: false,
workers: 1,
retries: process.env.CI ? 2 : 0,
timeout: 60_000,
reporter: process.env.CI ? [['list'], ['html', { open: 'never' }]] : 'list',
use: {
baseURL: 'http://localhost:4201',
Expand Down
33 changes: 33 additions & 0 deletions examples/ag-ui/angular/e2e/toolbar.spec.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
// SPDX-License-Identifier: MIT
import { test, expect } from '@playwright/test';
import { openDemo } from './test-helpers';

test('modes: segmented control switches embed → popup → sidebar compositions', async ({ page }) => {
await openDemo(page);
await expect(page).toHaveURL(/\/embed/);

await page.getByRole('button', { name: 'Popup' }).click();
await expect(page).toHaveURL(/\/popup/);

await page.getByRole('button', { name: 'Sidebar' }).click();
await expect(page).toHaveURL(/\/sidebar/);

await page.getByRole('button', { name: 'Embed' }).click();
await expect(page).toHaveURL(/\/embed/);
await expect(page.getByRole('textbox', { name: /message|prompt/i })).toBeVisible();
});

test('knobs: effort select reflects ?effort=high and persists a change', async ({ page }) => {
await openDemo(page, '/embed?effort=high');
const effortField = page.locator('[data-field="effort"]');
await expect(effortField).toContainText(/high/i);
});

test('toolbar submit still streams (state merge does not break runs)', async ({ page }) => {
await openDemo(page);
const input = page.getByRole('textbox', { name: /message|prompt/i });
await input.fill('say hi briefly');
await page.getByRole('button', { name: /send/i }).click();
const assistant = page.locator('chat-message').filter({ hasText: /./ }).last();
await expect(assistant).toBeVisible({ timeout: 30_000 });
});
3 changes: 3 additions & 0 deletions examples/ag-ui/angular/src/app/app.config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -4,16 +4,19 @@ import {
provideBrowserGlobalErrorListeners,
provideZonelessChangeDetection,
} from '@angular/core';
import { provideRouter } from '@angular/router';
import { provideThreadplaneTelemetry } from '@threadplane/telemetry/browser';
import { provideChat } from '@threadplane/chat';
import { provideAgent } from '@threadplane/ag-ui';
import { environment } from '../environments/environment';
import { routes } from './app.routes';
import { ItineraryStore } from './itinerary-store';

export const appConfig: ApplicationConfig = {
providers: [
provideBrowserGlobalErrorListeners(),
provideZonelessChangeDetection(),
provideRouter(routes),
provideThreadplaneTelemetry(environment.telemetry),
provideAgent({ url: environment.agentUrl }),
provideChat({ license: environment.license }),
Expand Down
30 changes: 1 addition & 29 deletions examples/ag-ui/angular/src/app/app.html
Original file line number Diff line number Diff line change
@@ -1,29 +1 @@
<main class="ag-ui-demo">
<header class="ag-ui-demo__header">
<h1>AG-UI Chat</h1>
<p>The Threadplane chat UI over the AG-UI transport.</p>
</header>
<div class="ag-ui-demo__body">
<app-itinerary-panel class="ag-ui-demo__panel" />
<div class="ag-ui-demo__main">
@if (agent.interrupt && agent.interrupt()) {
<div class="ag-ui-demo__interrupt" role="region" aria-label="Approval required">
<chat-interrupt-panel [agent]="agent" (action)="onInterruptAction($event)" />
</div>
}
<chat
main
[agent]="agent"
[views]="catalog"
[clientTools]="clientTools"
class="ag-ui-demo__chat"
>
<div chatWelcomeSuggestions>
@for (s of suggestions; track s.value) {
<chat-welcome-suggestion [label]="s.label" [value]="s.value" (selected)="send($event)" />
}
</div>
</chat>
</div>
</div>
</main>
<ag-ui-shell />
13 changes: 13 additions & 0 deletions examples/ag-ui/angular/src/app/app.routes.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
// SPDX-License-Identifier: MIT
import { Routes } from '@angular/router';
import { EmbedMode } from './modes/embed-mode.component';
import { PopupMode } from './modes/popup-mode.component';
import { SidebarMode } from './modes/sidebar-mode.component';

export const routes: Routes = [
{ path: 'embed', component: EmbedMode },
{ path: 'popup', component: PopupMode },
{ path: 'sidebar', component: SidebarMode },
{ path: '', pathMatch: 'full', redirectTo: 'embed' },
{ path: '**', redirectTo: 'embed' },
];
87 changes: 3 additions & 84 deletions examples/ag-ui/angular/src/app/app.ts
Original file line number Diff line number Diff line change
@@ -1,93 +1,12 @@
// SPDX-License-Identifier: MIT
import { ChangeDetectionStrategy, Component } from '@angular/core';
import { injectAgent } from '@threadplane/ag-ui';
import {
ChatComponent,
ChatInterruptPanelComponent,
ChatWelcomeSuggestionComponent,
a2uiBasicCatalog,
type InterruptAction,
} from '@threadplane/chat';
import { ItineraryPanelComponent } from './itinerary-panel.component';
import { itineraryClientTools } from './client-tools';
import { AgUiShell } from './shell/ag-ui-shell.component';

@Component({
selector: 'app-root',
standalone: true,
changeDetection: ChangeDetectionStrategy.OnPush,
imports: [
ChatComponent,
ChatInterruptPanelComponent,
ChatWelcomeSuggestionComponent,
ItineraryPanelComponent,
],
imports: [AgUiShell],
templateUrl: './app.html',
})
export class App {
protected readonly agent = injectAgent();
// The a2ui-surface render block in <chat> is gated on a truthy `views`
// catalog — without it, a2ui surfaces parse but never mount and the
// render_a2ui_surface tool call shows only as a tool chip (issue #616).
protected readonly catalog = a2uiBasicCatalog();

// Built in an injection context (field initializer) so itineraryClientTools()
// can inject the shared ItineraryStore. These declare what the agent can do
// to the page; the browser executes each call against the same store the
// panel renders.
protected readonly clientTools = itineraryClientTools();

// Welcome chips spanning the demo's full capability surface — docs/citations,
// generative UI, human approval, the five itinerary client tools, and the
// research subagent. Selecting one submits its prompt verbatim.
protected readonly suggestions = [
{ label: 'Docs & citations', value: 'What do the docs say about streaming?' },
{ label: 'Generative UI', value: 'Build me a revenue dashboard' },
{ label: 'Human approval', value: 'Issue me a $50 refund' },
{ label: 'Read my itinerary', value: "What's on my itinerary?" },
{ label: 'Agent edits the page', value: 'Add the Louvre to day 2 of my trip' },
{ label: 'Consent-gated clear', value: 'Clear my day 2 plans' },
{ label: 'Research subagent', value: 'Research AG-UI and give me the highlights' },
];

protected send(value: string): void {
void this.agent.submit({ message: value });
}

/**
* Resolve a human-in-the-loop interrupt (request_approval). The
* chat-interrupt-panel emits a four-action vocabulary; map each to a resume
* payload and replay the run via AG-UI's resume path — `submit({ resume })`,
* which the adapter forwards as `forwardedProps.command.resume`. `edit` /
* `respond` use window.prompt as a demo affordance; a production app would
* inline a textarea editor.
*/
protected async onInterruptAction(action: InterruptAction): Promise<void> {
const interrupt = this.agent.interrupt?.();
if (!interrupt) return;

let resume: unknown;
switch (action) {
case 'accept':
resume = 'approved';
break;
case 'edit': {
const reason = (interrupt.value as { reason?: string })?.reason ?? '';
const edited = window.prompt(`Edit your response (current proposal: "${reason}"):`, 'approved');
if (edited == null) return;
resume = edited;
break;
}
case 'respond': {
const text = window.prompt('Respond to the agent:', '');
if (text == null) return;
resume = text;
break;
}
case 'ignore':
resume = 'denied';
break;
}

await this.agent.submit({ resume });
}
}
export class App {}
Loading
Loading