295 changes: 295 additions & 0 deletions docs/superpowers/plans/2026-06-11-ag-ui-demo-itinerary-client-tools.md

@@ -0,0 +1,85 @@
# AG-UI Demo — Itinerary Client Tools + Capability Suggestions — Design

**Date:** 2026-06-11
**Status:** Draft for review
**Scope:** Give the public AG-UI demo (`examples/ag-ui`, deployed by the "ag-ui demo → Vercel" job) two things it lacks: (1) **client tools** demonstrated against *frontend-owned application state* — a visible trip-itinerary panel the user and the agent both read and write — and (2) **welcome suggestion chips** covering every capability the demo now showcases. Today the demo's welcome screen has no chips, and the demo has no client-tools wiring.

## Goal

Make the demo show the full current capability surface on one screen, with the client-tools story told in its strongest form: the agent reaches into **live application state the user can see and touch**. The user edits the itinerary panel directly; the agent edits the *same* state through client tools; the panel updates either way. This is the deliberate inverse of the json-render example (backend-owned state → frontend): here the state is frontend-owned and the agent reaches in.

## Non-goals

- No changes to the chat lib, render lib, or adapters — the demo consumes published APIs (`tools`/`action`/`view`/`ask`, `[clientTools]`).
- No browser-context tools (`get_local_context` etc.) — considered and dropped; one coherent state story beats two competing ones. They remain a cockpit-example concern.
- No system-prompt coaching for the client tools. The catalog descriptions must carry the behavior — the demo dogfoods the feature's core promise.

## 1. Frontend-owned state: `ItineraryStore`

New `examples/ag-ui/angular/src/app/itinerary-store.ts` — an Angular-signals store:

```ts
interface ItineraryStop { id: string; day: number; place: string; note?: string; }
```

- `stops = signal<ItineraryStop[]>(seed)` with computed day-grouping.
- Mutations: `add(day, place, note?)`, `move(place, toDay)` (case-insensitive place match), `remove(id)`, `clearDay(day)`, `reset()`.
- **Persistence:** `localStorage` key `ag-ui-demo:itinerary`; hydrate on init, write-through on mutation.
- **Seed** (so the demo reads instantly): Paris trip — Day 1: Louvre ("book tickets"), Eiffel Tower; Day 2: Musée d'Orsay.
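
The store's behavior can be sketched without Angular. This is a minimal, framework-free illustration and not the demo's implementation: a plain array stands in for the signal, `KeyValueStorage` and `memoryStorage` are hypothetical stand-ins for `localStorage`, and the seed ids and the count returned by `clearDay` are assumptions.

```typescript
interface ItineraryStop { id: string; day: number; place: string; note?: string; }

// Hypothetical subset of the Web Storage API so a fake can replace localStorage.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = 'ag-ui-demo:itinerary';

// Seed from the design; the ids are made up for this sketch.
const SEED: ItineraryStop[] = [
  { id: 's1', day: 1, place: 'Louvre', note: 'book tickets' },
  { id: 's2', day: 1, place: 'Eiffel Tower' },
  { id: 's3', day: 2, place: "Musée d'Orsay" },
];

class ItineraryStoreSketch {
  private storage: KeyValueStorage;
  private stops: ItineraryStop[];
  private counter = 0;

  constructor(storage: KeyValueStorage) {
    this.storage = storage;
    // Hydrate on init; fall back to a copy of the seed when nothing is stored.
    const raw = storage.getItem(STORAGE_KEY);
    this.stops = raw === null ? SEED.map((s) => ({ ...s })) : (JSON.parse(raw) as ItineraryStop[]);
  }

  all(): ItineraryStop[] {
    return this.stops.map((s) => ({ ...s }));
  }

  add(day: number, place: string, note?: string): ItineraryStop {
    const stop: ItineraryStop = { id: this.newId(), day, place };
    if (note !== undefined) stop.note = note;
    this.commit([...this.stops, stop]);
    return { ...stop };
  }

  // Case-insensitive place match; returns undefined when nothing matches.
  move(place: string, toDay: number): ItineraryStop | undefined {
    const wanted = place.trim().toLowerCase();
    const hit = this.stops.find((s) => s.place.toLowerCase() === wanted);
    if (hit === undefined) return undefined;
    const moved: ItineraryStop = { ...hit, day: toDay };
    this.commit(this.stops.map((s) => (s.id === hit.id ? moved : s)));
    return { ...moved };
  }

  remove(id: string): void {
    this.commit(this.stops.filter((s) => s.id !== id));
  }

  // Returns how many stops were removed (an assumption about the return value).
  clearDay(day: number): number {
    const kept = this.stops.filter((s) => s.day !== day);
    const removed = this.stops.length - kept.length;
    this.commit(kept);
    return removed;
  }

  reset(): void {
    this.commit(SEED.map((s) => ({ ...s })));
  }

  // Generates an id that does not collide with any hydrated stop.
  private newId(): string {
    let id = '';
    do {
      this.counter += 1;
      id = `stop-${this.counter}`;
    } while (this.stops.some((s) => s.id === id));
    return id;
  }

  // Write-through: every mutation replaces the array and persists it.
  private commit(next: ItineraryStop[]): void {
    this.stops = next;
    this.storage.setItem(STORAGE_KEY, JSON.stringify(next));
  }
}

// In-memory fake for tests and non-browser runs.
function memoryStorage(): KeyValueStorage {
  const data = new Map<string, string>();
  return {
    getItem: (key) => (data.has(key) ? (data.get(key) as string) : null),
    setItem: (key, value) => { data.set(key, value); },
  };
}
```

Passing the storage in, rather than reading `localStorage` directly, keeps the persistence round-trip testable outside a browser; in the demo itself the signal-based store would presumably talk to `localStorage` directly.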

## 2. Itinerary panel UI

New `itinerary-panel.component.ts`: a compact panel beside the chat (`.ag-ui-demo` becomes a two-column layout on desktop, stacked on mobile). Day-grouped list; per-stop remove button; small "add stop" input (day select + place text); a "Reset demo data" affordance. Pure consumer of `ItineraryStore` — the agent's writes appear live because both write the same signals.

## 3. Client tools (new `client-tools.ts`, passed via `<chat … [clientTools]>`)

| Tool | Kind | Schema (zod/v4) | Behavior |
|---|---|---|---|
| `get_itinerary` | action | `{}` | returns `{ days: [{ day, stops: [{ id, place, note? }] }] }` |
| `add_stop` | action | `{ day: number, place: string, note?: string }` | `store.add(...)`; returns the added stop |
| `move_stop` | action | `{ place: string, toDay: number }` | `store.move(...)`; returns moved stop or a not-found error result |
| `clear_day` | **ask** | `{ day: number }` | `ClearDayConfirmComponent`: "Clear all N stops on day {day}?" Confirm → `store.clearDay(day)` + emit `{ cleared: true, day }`; Cancel → `{ cleared: false }` |
| `day_card` | view | `{ day: number, places: string[] }` | `DayCardComponent` recap card in the transcript; model-filled, auto-acked |

Notes:
- `move_stop` matches by place name so casual prompts work; `get_itinerary` exposes ids so the model can disambiguate duplicates.
- `clear_day` is the human-gated destructive write — Cancel must leave state untouched and return a result the model can react to.
- `day_card` has no dedicated chip; the model reaches for it after edits at its own discretion (descriptions guide it).
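
As a rough illustration of the result shapes in the table, the sketch below builds the `get_itinerary` payload and a structured `move_stop` outcome as pure functions. The names `toItineraryResult`, `moveStop`, and the `MoveStopResult` union (including its `ok` and `error` fields) are hypothetical; the design only states that `move_stop` returns the moved stop or a not-found error result.

```typescript
interface ItineraryStop { id: string; day: number; place: string; note?: string; }

interface DayGroup {
  day: number;
  stops: Array<{ id: string; place: string; note?: string }>;
}

// Groups stops by day, ascending, exposing ids so duplicates can be told apart.
function toItineraryResult(stops: ItineraryStop[]): { days: DayGroup[] } {
  const byDay = new Map<number, DayGroup>();
  for (const stop of stops) {
    let group = byDay.get(stop.day);
    if (group === undefined) {
      group = { day: stop.day, stops: [] };
      byDay.set(stop.day, group);
    }
    const entry: { id: string; place: string; note?: string } = { id: stop.id, place: stop.place };
    if (stop.note !== undefined) entry.note = stop.note;
    group.stops.push(entry);
  }
  return { days: Array.from(byDay.values()).sort((a, b) => a.day - b.day) };
}

// Hypothetical result union: a miss is data the model can react to, not a throw.
type MoveStopResult =
  | { ok: true; stop: ItineraryStop }
  | { ok: false; error: 'not_found'; place: string };

// Pure variant: returns the result plus the next stop list instead of mutating.
function moveStop(
  stops: ItineraryStop[],
  place: string,
  toDay: number,
): { result: MoveStopResult; stops: ItineraryStop[] } {
  const wanted = place.trim().toLowerCase();
  const hit = stops.find((s) => s.place.toLowerCase() === wanted);
  if (hit === undefined) {
    return { result: { ok: false, error: 'not_found', place }, stops };
  }
  const moved: ItineraryStop = { ...hit, day: toDay };
  return {
    result: { ok: true, stop: moved },
    stops: stops.map((s) => (s.id === hit.id ? moved : s)),
  };
}
```

Returning the miss as a value is what lets the model recover by calling `get_itinerary` and retrying with an exact place name.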

## 4. Welcome suggestion chips

Project `chat-welcome-suggestion` chips (label/value, same pattern as the cockpit examples) into the `[chatWelcomeSuggestions]` slot in `app.html`, with a `send(value)` handler in `app.ts` (`agent.submit({ message })`). Seven chips, two rows:

**Row 1 — backend capabilities**
1. "What do the docs say about streaming?" → search_documents + citations
2. "Build me a revenue dashboard" → gen-UI surface (a2ui / json-render per mode)
3. "Issue me a $50 refund" → request_approval → interrupt panel

**Row 2 — client tools + subagent**
4. "What's on my itinerary?" → `get_itinerary`
5. "Add the Louvre to day 2 of my trip" → `add_stop` (panel visibly changes)
6. "Clear my day 2 plans" → `clear_day` ask-confirm
7. "Research AG-UI and give me the highlights" → research subagent

## 5. Backend changes (`examples/ag-ui/python`)

- `State` gains a `tools` channel (list) — `ag-ui-langgraph` merges `RunAgentInput.tools` into `state["tools"]`; the channel must exist for it to be retained.
- `generate` binds the client catalog: `bind_client_tools(llm, [search_documents, request_approval, research, gen_ui_tool], state)` (from the published `threadplane-client-tools>=0.0.1`).
- Routing: when the model's calls are **pure client-tool calls**, the turn must END (the browser executes and re-runs with the ToolMessage). Server tool calls keep routing to `ToolNode` exactly as today. Use the middleware's `has_server_tool_call` / `client_tool_names` helpers inside the demo's existing `should_continue`; a client-tool-only turn routes to the turn-ending path (skipping `attach_citations` is acceptable on those turns).
- `pyproject.toml` + regenerated `uv.lock`/`requirements.txt` add `threadplane-client-tools>=0.0.1`.
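
The routing rule above can be sketched as a small pure function. The real backend is Python on LangGraph; this TypeScript version only illustrates the decision and does not mirror the middleware's API. The route labels and `routeAfterModel` are hypothetical, and sending a mixed turn (server and client calls together) to the tool node is an assumption the design does not spell out.

```typescript
interface ToolCall { name: string; }

// Hypothetical labels: 'tools' = existing ToolNode path, 'end_turn' = stop so the
// browser can execute the client tools, 'finish' = existing no-tool-call path.
type Route = 'tools' | 'end_turn' | 'finish';

function routeAfterModel(calls: ToolCall[], clientToolNames: Set<string>): Route {
  if (calls.length === 0) return 'finish';
  // Any call that is not in the client catalog is treated as a server tool call.
  const hasServerToolCall = calls.some((call) => !clientToolNames.has(call.name));
  return hasServerToolCall ? 'tools' : 'end_turn';
}
```

Keeping the decision a pure function of the tool-call names makes it easy to unit test separately from the graph wiring.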

## 6. Verification

- **Live-LLM local smoke (standing gate):** serve the demo locally with a real key; drive all seven chips in a real browser; confirm the panel updates on agent writes, the clear-day confirm gates the write (both Confirm and Cancel paths), and continuations stream after each client-tool round-trip.
- **aimock e2e:** extend `examples/ag-ui/angular/e2e` with a client-tools spec — fixtures for the read ("What's on my itinerary?" → `get_itinerary` call + continuation) and the ask chain ("Clear my day 2 plans" → `clear_day` call; click Confirm; assert the panel's day-2 group empties and the continuation renders). Existing demo specs stay green.
- **Unit:** `ItineraryStore` (mutations, persistence round-trip, case-insensitive move, clearDay).

## 7. Deploy

Normal main-merge path: the demo backend redeploys via its existing job; the frontend via "ag-ui demo → Vercel". No new infra. The backend dep resolves from PyPI (already published).

## Risks / notes

- The `move_stop` name-match can miss on typos; the tool returns a structured not-found result so the model can recover by calling `get_itinerary` — acceptable for a demo.
- `localStorage` seeds can drift from new deploys; the panel's "Reset demo data" affordance is the escape hatch.
- Chip prompts are live-LLM prompts (not fixture-bound); wording was chosen to route reliably to the intended tool, validated during the live smoke.
22 changes: 22 additions & 0 deletions examples/ag-ui/angular/e2e/fixtures/itinerary.json
@@ -0,0 +1,22 @@
{
"fixtures": [
{
"match": { "userMessage": "What's on my itinerary?", "hasToolResult": true },
"response": {
"content": "You have 3 stops planned: the Louvre and the Eiffel Tower on day 1, and the Musée d'Orsay on day 2."
}
},
{
"match": { "userMessage": "What's on my itinerary?" },
"response": { "toolCalls": [ { "name": "get_itinerary", "arguments": {} } ] }
},
{
"match": { "userMessage": "Clear my day 2 plans", "hasToolResult": true },
"response": { "content": "Done — day 2 is cleared." }
},
{
"match": { "userMessage": "Clear my day 2 plans" },
"response": { "toolCalls": [ { "name": "clear_day", "arguments": { "day": 2 } } ] }
}
]
}
7 changes: 7 additions & 0 deletions examples/ag-ui/angular/e2e/global-setup.ts
@@ -48,6 +48,13 @@ export default async function globalSetup(): Promise<void> {
...process.env,
OPENAI_BASE_URL: aimock.baseUrl,
OPENAI_API_KEY: 'test-not-used',
// Run the backend in clone-and-run (unauthenticated) mode so the suite
// is hermetic. The dev proxy forwards /agent without an x-internal-token
// header, so if the developer's root .env defines AG_UI_INTERNAL_TOKEN
// (used for the Railway deploys) nx leaks it into process.env and the
// require_internal_token middleware 401s every run. Blanking it here
// matches what the proxy + transport expect.
AG_UI_INTERNAL_TOKEN: '',
},
stdio: 'pipe',
},
70 changes: 70 additions & 0 deletions examples/ag-ui/angular/e2e/itinerary-client-tools.spec.ts
@@ -0,0 +1,70 @@
// SPDX-License-Identifier: MIT
import { test, expect } from '@playwright/test';
import { attachBrowserHygiene, messageInput, openDemo, sendButton } from './test-helpers';

// Exercises the frontend-declared / frontend-executed client tools over the
// AG-UI transport against THIS app. The catalog ships to the backend, the model
// emits a tool call, the browser executes it against the shared ItineraryStore,
// the ToolMessage re-runs the graph, and the continuation streams back. Each
// test starts from the store seed (Day 1: Louvre + Eiffel Tower, Day 2: Musée
// d'Orsay) by clearing the persisted key before the page hydrates.
const STORAGE_KEY = 'ag-ui-demo:itinerary';

test.beforeEach(async ({ page }) => {
// Runs on every navigation, before app bootstrap — so ItineraryStore
// hydrates from SEED rather than a stale localStorage payload. openDemo's
// own localStorage.clear() runs after the first paint; this guarantees the
// key is already absent the moment the store reads it.
await page.addInitScript((key) => localStorage.removeItem(key), STORAGE_KEY);
});

test('panel renders the seeded itinerary', async ({ page }) => {
await openDemo(page);

const panel = page.getByRole('region', { name: 'Trip itinerary' });
await expect(panel).toBeVisible();
await expect(panel).toContainText('Louvre');
await expect(panel).toContainText('Eiffel Tower');
await expect(panel).toContainText("Musée d'Orsay");
});

test('read round-trip: get_itinerary executes in the browser and the run continues', async ({
page,
}) => {
await openDemo(page);
const hygiene = attachBrowserHygiene(page);

await messageInput(page).fill("What's on my itinerary?");
await sendButton(page).click();

// The first run returns only a tool call; the browser executes get_itinerary
// against the store, the ToolMessage re-runs the graph, and the continuation
// streams the recap. Wait on that final content rather than the first settle.
await expect(page.getByText('You have 3 stops planned')).toBeVisible({ timeout: 30_000 });

expect(hygiene.consoleErrors).toEqual([]);
});

test('ask chain: clear_day confirm mutates the panel and resumes the run', async ({ page }) => {
await openDemo(page);

await messageInput(page).fill('Clear my day 2 plans');
await sendButton(page).click();

// The clear_day ask renders its confirm component; the run stays paused,
// waiting on the tool result, until the user decides.
const confirm = page.locator('app-clear-day-confirm');
await expect(confirm).toBeVisible({ timeout: 30_000 });
await expect(confirm).toContainText('day 2');

// Nothing has mutated yet — the panel still shows day 2's stop.
const panel = page.getByRole('region', { name: 'Trip itinerary' });
await expect(panel).toContainText("Musée d'Orsay");

// Confirming writes the store (panel updates live) and emits the tool result
// that resumes the run, whose continuation streams the closing line.
await confirm.getByRole('button', { name: 'Clear' }).click();

await expect(panel).not.toContainText("Musée d'Orsay");
await expect(page.getByText('Done — day 2 is cleared.')).toBeVisible({ timeout: 30_000 });
});
5 changes: 5 additions & 0 deletions examples/ag-ui/angular/src/app/app.config.ts
@@ -8,6 +8,7 @@ import { provideThreadplaneTelemetry } from '@threadplane/telemetry/browser';
import { provideChat } from '@threadplane/chat';
import { provideAgent } from '@threadplane/ag-ui';
import { environment } from '../environments/environment';
import { ItineraryStore } from './itinerary-store';

export const appConfig: ApplicationConfig = {
providers: [
@@ -16,5 +17,9 @@ export const appConfig: ApplicationConfig = {
provideThreadplaneTelemetry(environment.telemetry),
provideAgent({ url: environment.agentUrl }),
provideChat({ license: environment.license }),
// The frontend-owned itinerary is a single shared instance: the panel,
// the App component, and the client-tool ask component all inject it, so
// user edits and agent writes hit the same signals and render live.
ItineraryStore,
],
};
27 changes: 22 additions & 5 deletions examples/ag-ui/angular/src/app/app.html
@@ -3,10 +3,27 @@
 <h1>AG-UI Chat</h1>
 <p>The Threadplane chat UI over the AG-UI transport.</p>
 </header>
-@if (agent.interrupt && agent.interrupt()) {
-<div class="ag-ui-demo__interrupt" role="region" aria-label="Approval required">
-<chat-interrupt-panel [agent]="agent" (action)="onInterruptAction($event)" />
+<div class="ag-ui-demo__body">
+<app-itinerary-panel class="ag-ui-demo__panel" />
+<div class="ag-ui-demo__main">
+@if (agent.interrupt && agent.interrupt()) {
+<div class="ag-ui-demo__interrupt" role="region" aria-label="Approval required">
+<chat-interrupt-panel [agent]="agent" (action)="onInterruptAction($event)" />
+</div>
+}
+<chat
+main
+[agent]="agent"
+[views]="catalog"
+[clientTools]="clientTools"
+class="ag-ui-demo__chat"
+>
+<div chatWelcomeSuggestions>
+@for (s of suggestions; track s.value) {
+<chat-welcome-suggestion [label]="s.label" [value]="s.value" (selected)="send($event)" />
+}
+</div>
+</chat>
 </div>
-}
-<chat main [agent]="agent" [views]="catalog" class="ag-ui-demo__chat" />
+</div>
 </main>
33 changes: 32 additions & 1 deletion examples/ag-ui/angular/src/app/app.ts
@@ -4,15 +4,23 @@ import { injectAgent } from '@threadplane/ag-ui';
 import {
 ChatComponent,
 ChatInterruptPanelComponent,
+ChatWelcomeSuggestionComponent,
 a2uiBasicCatalog,
 type InterruptAction,
 } from '@threadplane/chat';
+import { ItineraryPanelComponent } from './itinerary-panel.component';
+import { itineraryClientTools } from './client-tools';

 @Component({
 selector: 'app-root',
 standalone: true,
 changeDetection: ChangeDetectionStrategy.OnPush,
-imports: [ChatComponent, ChatInterruptPanelComponent],
+imports: [
+ChatComponent,
+ChatInterruptPanelComponent,
+ChatWelcomeSuggestionComponent,
+ItineraryPanelComponent,
+],
 templateUrl: './app.html',
 })
 export class App {
@@ -22,6 +30,29 @@ export class App {
// render_a2ui_surface tool call shows only as a tool chip (issue #616).
protected readonly catalog = a2uiBasicCatalog();

// Built in an injection context (field initializer) so itineraryClientTools()
// can inject the shared ItineraryStore. These declare what the agent can do
// to the page; the browser executes each call against the same store the
// panel renders.
protected readonly clientTools = itineraryClientTools();

// Welcome chips spanning the demo's capability surface: docs/citations,
// generative UI, human approval, the itinerary client tools (read, add, and
// consent-gated clear), and the research subagent. Selecting one submits its
// prompt verbatim.
protected readonly suggestions = [
{ label: 'Docs & citations', value: 'What do the docs say about streaming?' },
{ label: 'Generative UI', value: 'Build me a revenue dashboard' },
{ label: 'Human approval', value: 'Issue me a $50 refund' },
{ label: 'Read my itinerary', value: "What's on my itinerary?" },
{ label: 'Agent edits the page', value: 'Add the Louvre to day 2 of my trip' },
{ label: 'Consent-gated clear', value: 'Clear my day 2 plans' },
{ label: 'Research subagent', value: 'Research AG-UI and give me the highlights' },
];

protected send(value: string): void {
void this.agent.submit({ message: value });
}

/**
* Resolve a human-in-the-loop interrupt (request_approval). The
* chat-interrupt-panel emits a four-action vocabulary; map each to a resume
76 changes: 76 additions & 0 deletions examples/ag-ui/angular/src/app/clear-day-confirm.component.ts
@@ -0,0 +1,76 @@
// SPDX-License-Identifier: MIT
import { ChangeDetectionStrategy, Component, computed, inject, input } from '@angular/core';
import { injectRenderHost } from '@threadplane/render';
import { ItineraryStore } from './itinerary-store';

/**
* The interactive component for the `clear_day` client tool (an `ask`). The
* model fills `day`; the user confirms or cancels. Because an ask emits the
* tool result and the handler layer cannot intercept it, the mutation happens
* HERE: Clear writes the shared `ItineraryStore` (so the panel updates live)
* and then announces the outcome via `injectRenderHost().result(...)`, which
* becomes the tool result that resumes the run. Cancel never touches the store.
*/
@Component({
selector: 'app-clear-day-confirm',
standalone: true,
changeDetection: ChangeDetectionStrategy.OnPush,
template: `
<div class="cdc">
<p class="cdc__summary">Clear all {{ count() }} stops on day {{ day() }}?</p>
<div class="cdc__actions">
<button type="button" class="cdc__btn cdc__btn--primary" (click)="clear()">Clear</button>
<button type="button" class="cdc__btn" (click)="cancel()">Cancel</button>
</div>
</div>
`,
styles: [
`
.cdc {
border: 1px solid var(--ngaf-chat-separator, #e5e7eb);
border-radius: 12px;
padding: 16px;
max-width: 360px;
}
.cdc__summary {
margin: 0 0 12px;
}
.cdc__actions {
display: flex;
gap: 8px;
}
.cdc__btn {
padding: 6px 14px;
border-radius: 8px;
border: 1px solid var(--ngaf-chat-separator, #e5e7eb);
background: transparent;
color: inherit;
cursor: pointer;
}
.cdc__btn--primary {
background: var(--ngaf-chat-accent, #2563eb);
color: #fff;
border-color: transparent;
}
`,
],
})
export class ClearDayConfirmComponent {
readonly day = input.required<number>();
private readonly store = inject(ItineraryStore);
private readonly host = injectRenderHost();

protected readonly count = computed(
() => this.store.stops().filter((s) => s.day === this.day()).length,
);

protected clear(): void {
const day = this.day();
const removed = this.store.clearDay(day);
this.host.result({ cleared: true, day, removed });
}

protected cancel(): void {
this.host.result({ cleared: false, day: this.day() });
}
}