Feat: 🛠️ Add client-executed tools & human-in-the-loop tool approval flow by vinitkadam03 · Pull Request #932 · prism-php/prism

vinitkadam03 · 2026-03-01T13:14:38Z

Summary

This PR introduces support for client-executed tools that are intended to be executed by the client/caller rather than by Prism, and a tool approval flow for tools that require explicit user consent before server-side execution.

1. Client Executed Tools

Motivation

Client-executed tools enable scenarios where tool execution must happen on the client side, such as:

Interactive user input - Rendering forms, confirmations, or option selectors based on tool call params passed by the LLM, then continuing the conversation with the user's selection (similar to how AI coding assistants ask clarifying questions during agentic workflows)
Browser automation - Controlling UI elements, clicking buttons, or navigating pages
Frontend-only operations - Accessing browser APIs, local storage, or device capabilities
Any tool where the server should not (or cannot) execute the logic

Behavior:

Client-executed tools are skipped during tool execution
Server-executed tools in the same request are still executed normally
When client-executed tools are detected, execution stops and control is returned to the caller
The LLM is not called for the next turn, allowing the client to execute the tool and continue the conversation
Response/stream ends with FinishReason::ToolCalls
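
The behavior listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Prism's implementation: `SketchTool`, `runServerTools()`, and the array shapes are hypothetical names chosen for the example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a tool definition; not Prism's Tool class.
 */
final class SketchTool
{
    public function __construct(
        public readonly string $name,
        public readonly ?Closure $handler = null,
        public readonly bool $clientExecuted = false,
    ) {}
}

/**
 * Runs server-side tools and reports which calls are left for the client.
 *
 * @param  array<string, SketchTool>  $tools  keyed by tool name
 * @param  list<array{id: string, name: string, arguments: array<string, mixed>}>  $toolCalls
 * @return array{results: array<string, string>, pending: list<string>}
 */
function runServerTools(array $tools, array $toolCalls): array
{
    $results = [];
    $pending = [];

    foreach ($toolCalls as $call) {
        $tool = $tools[$call['name']];

        if ($tool->clientExecuted) {
            // Skipped on the server; the caller is expected to run it.
            $pending[] = $call['id'];
            continue;
        }

        // Server-executed tools in the same request still run normally.
        // String keys are unpacked as named arguments (PHP 8.1+).
        $results[$call['id']] = ($tool->handler)(...$call['arguments']);
    }

    return ['results' => $results, 'pending' => $pending];
}

$tools = [
    'lookup' => new SketchTool('lookup', fn (string $q): string => "result for {$q}"),
    'browser_action' => new SketchTool('browser_action', clientExecuted: true),
];

$outcome = runServerTools($tools, [
    ['id' => 'call_1', 'name' => 'lookup', 'arguments' => ['q' => 'prism']],
    ['id' => 'call_2', 'name' => 'browser_action', 'arguments' => ['action' => 'click']],
]);

// A non-empty pending list means: do not call the LLM again, hand control back.
$shouldStop = $outcome['pending'] !== [];
```

In this sketch the server result for `call_1` is kept while `call_2` is reported as pending, which mirrors the "stop and return control to the caller" step.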

Usage Example

use Prism\Prism\Facades\Tool;

// Explicit declaration (recommended)
$clientTool = Tool::as('browser_action')
    ->for('Perform an action in the user\'s browser')
    ->withStringParameter('action', 'The action to perform')
    ->clientExecuted();

// Implicit declaration (also works - omit using())
$clientTool = Tool::as('browser_action')
    ->for('Perform an action in the user\'s browser')
    ->withStringParameter('action', 'The action to perform');

2. Tool Approval Flow

Motivation

Tool approval enables scenarios where the server can execute a tool but should only do so after explicit user consent:

Destructive operations — File deletion, database mutations, account changes
Sensitive actions — Payments, sending emails, publishing content
Compliance requirements — Audit trails requiring explicit user authorization
Any tool where the server has the handler but needs human-in-the-loop approval before execution

The flow is stateless and operates in two phases.

Phase 1: Approval Request (stream stops)

When the LLM calls an approval-required tool, Prism emits a ToolApprovalRequestEvent and stops — returning control to the client.

Event chain (streaming):

StreamStartEvent → StepStartEvent → ToolCallEvent → ToolApprovalRequestEvent → StepFinishEvent → StreamEndEvent(ToolCalls)

Phase 2: Approval Resolution (tool executes, LLM continues)

The client sends a new request with messages containing ToolApprovalResponses. Before making the HTTP call to the LLM, resolveToolApprovalsAndYieldEvents() executes approved tools and creates denial results for denied ones. The LLM then continues the conversation with the tool results.

Event chain (streaming):

StreamStartEvent → ToolResultEvent → StepStartEvent → TextStartEvent → TextDeltaEvent → TextCompleteEvent → StepFinishEvent → StreamEndEvent(Stop)

Key behaviors:

StreamStartEvent is emitted from approval resolution (before the HTTP call), so the client knows the stream is live before tool results arrive
No duplicate StreamStartEvent — once emitted, the StreamState suppresses it from the subsequent HTTP stream
tool-output-available arrives without a prior tool-input-available in Phase 2 (that was sent in Phase 1)
Tool calls without any approval response default to denial
Denied tools yield a failed ToolResultEvent with the denial reason
A merged ToolResultMessage (existing results + resolved results) is placed on the request before calling the LLM

┌─────────────────────────────────────────────────────────────────┐
│                     REQUEST 1                                   │
│                                                                 │
│  Frontend ──► Prism ──► LLM                                     │
│                          │                                      │
│                          ▼                                      │
│                   LLM responds with                             │
│                   tool_calls: [                                 │
│                     { name: "delete_file", args: {path:...} }   │
│                   ]                                             │
│                          │                                      │
│                          ▼                                      │
│                    callTools()                                  │
│                          │                                      │
│               Tool has requiresApproval()?                      │
│                    │              │                             │
│                   NO             YES                            │
│                    │              │                             │
│                    ▼              ▼                             │
│             Execute tool    Emit tool-approval-request          │
│              normally       (instead of executing)              │
│                    │         Set hasPendingToolCalls = true     │
│                    │              │                             │
│                    └──────────────┤                             │
│                                   ▼                             │
│         Add AssistantMessage (with toolCalls)                   │
│         Add ToolResultMessage (server results only)             │
│                                   │                             │
│                                   ▼                             │
│         hasPendingToolCalls? ──YES──► STOP. Return response.    │
│                                       Process ENDS.             │
└─────────────────────────────────────────────────────────────────┘

          ⏳ Frontend shows approval UI to user...
          ⏳ User approves or denies...

┌─────────────────────────────────────────────────────────────────┐
│                  REQUEST 2                                      │
│                                                                 │
│  Frontend sends: full message history + tool-approval-response  │
│                          │                                      │
│                          ▼                                      │
│               Prism inspects last AssistantMessage              │
│               Finds pending tool_calls needing approval         │
│                    │              │                             │
│                APPROVED        DENIED (or skipped)              │
│                    │              │                             │
│                    ▼              ▼                             │
│             Execute tool    Add denial as                       │
│                    │        ToolResultMessage                   │
│                    ▼              │                             │
│         Add ToolResultMessage     │                             │
│         with actual result        │                             │
│                    │              │                             │
│                    └──────┬───────┘                             │
│                           ▼                                     │
│                  Send to LLM for further processing             │
└─────────────────────────────────────────────────────────────────┘
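
As a rough illustration of the Request 2 branch in the diagram, the sketch below resolves approvals into tool results before the next LLM call. It is a simplified model under stated assumptions: `resolveApprovals()` is a hypothetical helper, plain arrays stand in for Prism's value objects, and approvals are keyed directly by tool call ID for brevity.

```php
<?php

declare(strict_types=1);

/**
 * Resolves pending approvals into tool results before the next LLM call.
 *
 * @param  array<string, Closure>  $handlers  tool name => handler
 * @param  list<array{id: string, name: string, arguments: array<string, mixed>}>  $pendingCalls
 * @param  array<string, bool>  $approvals  tool call id => approved?
 * @param  array<string, string>  $existingResults  results already produced in Phase 1
 * @return array<string, string>  merged results, keyed by tool call id
 */
function resolveApprovals(
    array $handlers,
    array $pendingCalls,
    array $approvals,
    array $existingResults = [],
): array {
    $resolved = [];

    foreach ($pendingCalls as $call) {
        // A call with no approval response is treated as denied.
        $approved = $approvals[$call['id']] ?? false;

        $resolved[$call['id']] = $approved
            ? ($handlers[$call['name']])(...$call['arguments'])
            : 'Tool execution was denied by the user.';
    }

    // Existing results plus resolved ones form a single merged result set.
    return $existingResults + $resolved;
}

$handlers = ['delete_file' => fn (string $path): string => "Deleted: {$path}"];

$pending = [
    ['id' => 'call_1', 'name' => 'delete_file', 'arguments' => ['path' => '/tmp/a.txt']],
    ['id' => 'call_2', 'name' => 'delete_file', 'arguments' => ['path' => '/tmp/b.txt']],
];

// Only call_1 has an approval response; call_2 falls back to denial.
$merged = resolveApprovals($handlers, $pending, ['call_1' => true], ['call_0' => 'ok']);
```

Only `call_1` is executed here; `call_2` has no response and so receives a denial result, matching the "default to denial" behavior described above.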

Usage Example

use Prism\Prism\Facades\Tool;

// Static approval (always requires approval)
$tool = Tool::as('delete_file')
    ->for('Delete a file from the filesystem')
    ->withStringParameter('path', 'File path to delete')
    ->using(fn (string $path): string => "Deleted: {$path}")
    ->requiresApproval();

// Dynamic approval (closure receives tool arguments)
$tool = Tool::as('transfer')
    ->for('Transfer money')
    ->withNumberParameter('amount', 'Amount to transfer')
    ->using(fn (float $amount): string => "Transferred {$amount}")
    ->requiresApproval(fn (array $args): bool => $args['amount'] > 1000);

Phase 2 continuation (client sends approval responses):

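use Prism\Prism\Facades\Prism;
use Prism\Prism\ValueObjects\Messages\AssistantMessage;
use Prism\Prism\ValueObjects\Messages\ToolResultMessage;
use Prism\Prism\ValueObjects\Messages\UserMessage;
use Prism\Prism\ValueObjects\ToolApprovalRequest; // namespace assumed to mirror ToolApprovalResponse
use Prism\Prism\ValueObjects\ToolApprovalResponse;
use Prism\Prism\ValueObjects\ToolCall;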

$response = Prism::text()
    ->using('openai', 'gpt-4o')
    ->withTools([$tool])
    ->withMaxSteps(3)
    ->withMessages([
        new UserMessage('Delete /tmp/test.txt'),
        new AssistantMessage(
            content: '',
            toolCalls: [
                new ToolCall(id: 'call_123', name: 'delete_file', arguments: ['path' => '/tmp/test.txt']),
            ],
            toolApprovalRequests: [
                new ToolApprovalRequest(approvalId: 'call_123', toolCallId: 'call_123'),
            ],
        ),
        new ToolResultMessage([], [
            new ToolApprovalResponse(approvalId: 'call_123', approved: true),
        ]),
    ])
    ->asStream();

Breaking Changes

None. This is a backward-compatible addition. Existing tools with handlers continue to work exactly as before. Tools without requiresApproval() or clientExecuted() are unaffected.

vinitkadam03 · 2026-03-01T13:22:28Z

fixes: #921

emiliopedrollo · 2026-03-04T18:57:05Z

Can this work with a flow that uses previous_response_id and prompt instead of the full message history?

vinitkadam03 · 2026-03-08T20:13:23Z

Can this work with a flow that uses previous_response_id and prompt instead of the full message history?

@emiliopedrollo I haven't tested with previous_response_id specifically, but it should work. Instead of the full message history, you'd pass the ID of the last response that did not end in tool calls (that is, one whose flow ran to completion), along with only the AssistantMessage and ToolResultMessage:

$response = Prism::text()
    ->using('openai', 'gpt-4o')
    ->withTools([$tool])
    ->withMessages([
        new AssistantMessage('', $previousResponse->toolCalls, [], $previousResponse->toolApprovalRequests),
        new ToolResultMessage([], [new ToolApprovalResponse(approvalId: 'call_123', approved: true)]),
    ])
    // ID of the last response that finished normally (e.g. Stop), not one that ended in tool calls.
    ->withProviderOptions(['previous_response_id' => $lastCompleteResponse->meta->id])
    ->asText();

The AssistantMessage is still needed even with previous_response_id, because the resolveToolApprovals logic uses it internally to match tool calls against approval responses and to execute approved tools before sending the request. OpenAI's Responses API handles deduplication on its end.

sixlive · 2026-03-09T20:32:55Z

I had Claude do a first pass at this. Can you take a look and let me know what you think? I'm hyped on this feature.

Bug

Anthropic Structured handler missing resolveToolApprovals

Anthropic\Handlers\Structured::handle() does not call $this->resolveToolApprovals($this->request) at the start, unlike every other Text handler. If a structured request with tools requires approval, Phase 2 will silently fail to resolve approvals.

Architecture

approvalId === toolCallId everywhere

ToolApprovalRequest has two fields: approvalId and toolCallId. But every single call site sets them to the same value:

new ToolApprovalRequest(approvalId: $tc->id, toolCallId: $tc->id)

If they're always identical, why have two fields? The PR description mentions "Vercel AI SDK format" but Prism isn't the Vercel AI SDK. Is there a real use case where they diverge? I'd love to hear it. Otherwise we should collapse them into a single identifier.

ToolResultMessage is being overloaded as an approval transport

The constructor now accepts $toolApprovalResponses:

public function __construct(
    public readonly array $toolResults = [],
    public readonly array $toolApprovalResponses = []
) {}

This conflates two responsibilities. A ToolResultMessage should contain tool results. Using it to also carry approval responses back makes the message semantics confusing. The resolveToolApprovalsAndYieldEvents method then has to do pretty complex surgery on messages (removing and re-adding ToolResultMessage instances). A dedicated message type would make the protocol clearer and the approval resolution code simpler.

bool &$hasPendingToolCalls passed by reference through multiple layers

A mutable boolean threaded through callTools -> callToolsAndYieldEvents -> filterServerExecutedToolCalls is hard to trace. Consider returning a result object instead:

readonly class ToolExecutionResult {
    public function __construct(
        public array $toolResults,
        public array $approvalRequests,
        public bool $hasPendingToolCalls,
    ) {}
}

clientExecuted() sets $this->fn = null which is the same as the default state

isClientExecuted() checks $this->fn === null, which is also the initial state of a new Tool() before using() is called. So a tool where someone forgot to call using() is indistinguishable from an intentionally client-executed tool. The "implicit declaration" path documented in the PR is more likely to mask bugs than help.

I'd suggest using a dedicated bool $clientExecuted flag. Then a tool with no handler and no clientExecuted() call can throw at validation time rather than silently behaving as client-executed.
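
A minimal sketch of what that suggestion could look like, assuming a fluent builder similar in spirit to Prism's. `FlaggedTool` and `assertConfigured()` are hypothetical names for illustration only, not the library's API:

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical tool builder illustrating the suggestion; not Prism's Tool class.
 */
final class FlaggedTool
{
    private ?Closure $fn = null;

    private bool $clientExecuted = false;

    public function __construct(private readonly string $name) {}

    public function using(Closure $fn): self
    {
        $this->fn = $fn;

        return $this;
    }

    public function clientExecuted(): self
    {
        // Explicit opt-in, independent of whether a handler was set.
        $this->clientExecuted = true;

        return $this;
    }

    public function isClientExecuted(): bool
    {
        return $this->clientExecuted;
    }

    /**
     * Fails fast when a tool has neither a handler nor the explicit flag.
     */
    public function assertConfigured(): void
    {
        if ($this->fn === null && ! $this->clientExecuted) {
            throw new LogicException(
                "Tool [{$this->name}] has no handler and is not marked client-executed."
            );
        }
    }
}

// A forgotten using() call now surfaces as an error instead of a silent mode switch.
$forgotten = new FlaggedTool('delete_file');
$explicit = (new FlaggedTool('browser_action'))->clientExecuted();
```

With this shape, `$forgotten->assertConfigured()` throws while `$explicit->assertConfigured()` passes, so the two cases are no longer indistinguishable.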

Repeated boilerplate across all providers

Every provider Text handler now has this identical block:

$hasPendingToolCalls = false;
$approvalRequests = [];
$toolResults = $this->callTools($request->tools(), $toolCalls, $hasPendingToolCalls, $approvalRequests);

$toolApprovalRequests = array_map(
    fn (ToolCall $tc): ToolApprovalRequest => new ToolApprovalRequest(
        approvalId: $tc->id, toolCallId: $tc->id
    ),
    $approvalRequests,
);

This ToolCall[] -> ToolApprovalRequest[] mapping should live in callTools itself (or a helper on the trait). The providers shouldn't need to know about this transformation.

Security

Dynamic approval closures receive raw LLM arguments

->requiresApproval(fn (array $args): bool => $args['amount'] > 1000)

The $args come directly from LLM output (parsed JSON). The documentation should explicitly warn that these arguments are untrusted LLM output and should be validated before use in any sensitive context.
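
For illustration, a more defensive version of that closure might fail closed when the argument is missing or malformed. This is a sketch under the assumption that requiresApproval() accepts any closure taking the argument array, as in the PR's usage example; the variable name and thresholds are made up for the example.

```php
<?php

declare(strict_types=1);

// Hypothetical approval predicate: require approval whenever the input looks wrong.
$needsApproval = static function (array $args): bool {
    $amount = $args['amount'] ?? null;

    // Missing or non-numeric input (for example a string) is not trusted.
    if (! is_int($amount) && ! is_float($amount)) {
        return true;
    }

    // NAN, INF, and negative amounts are treated as suspicious as well.
    if (! is_finite((float) $amount) || $amount < 0) {
        return true;
    }

    return $amount > 1000;
};
```

The design choice here is that any input the closure cannot interpret results in an approval request rather than silent execution.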

Style

ToolResultMessage default changed from required to optional

Making $toolResults default to [] enables new ToolResultMessage with no args (used in the approval flow). But it also means code that used to require tool results now silently accepts empty messages. This is a subtle contract change worth being intentional about.

Vercel SDK references in internal docblocks

/**
 * Vercel AI SDK compatible format: { approvalId, type: 'tool-approval-response', approved }
 */

Prism's internal value objects shouldn't couple their documentation to third-party protocols. If Vercel compatibility is a goal, note it in the docs rather than the class docblock.

Testing

Tests cover the happy paths well for client-executed tools, mixed tools, and streaming. I'd like to see coverage for:

Phase 2 approval resolution with mixed approved + denied tools in the same request
A tool with requiresApproval(fn ($args) => ...) where the closure returns false (tool should execute normally)
Approval resolution when the message history is malformed (no AssistantMessage, or ToolResultMessage before AssistantMessage)
The Structured handler path for approval tools
Concurrent tool execution where some tools require approval and some are concurrent-capable

Summary

Priority	Issue
Bug	Anthropic Structured handler missing `resolveToolApprovals`
High	Replace `$fn === null` check with explicit `$clientExecuted` flag
High	Collapse `approvalId`/`toolCallId` or justify the distinction
High	Extract approval boilerplate from provider handlers into the trait
Medium	Consider a dedicated approval message type instead of overloading `ToolResultMessage`
Medium	Replace `bool &$hasPendingToolCalls` with a result object
Medium	Add edge-case tests listed above
Low	Remove Vercel SDK references from internal docblocks
Low	Document that approval closure args are untrusted LLM output


vinitkadam03 · 2026-03-10T20:49:29Z

Bug: Anthropic Structured handler missing resolveToolApprovals
Good catch! This was actually missing from all four Structured handlers (Anthropic, OpenAI, Gemini, OpenRouter), not just Anthropic. Added resolveToolApprovals at the top of handle() in each, along with Phase 2 tests to cover the flow.

High: Replace $fn === null check with explicit $clientExecuted flag
Addressed -- isClientExecuted() now checks a dedicated bool $clientExecuted flag. A tool where using() was forgotten is no longer silently treated as client-executed. It's also caught early via Tool::ensureRunnable() in toRequest(), so a misconfigured tool throws immediately at request build time with a clear message. Removed the "Implicit Declaration" section from the docs and updated all tests to use explicit ->clientExecuted().

High: Collapse approvalId/toolCallId or justify the distinction
Good call -- Initially I kept the approvalId the same as the toolCallId since it's already unique. But separating them gives us a clearer distinction -- the toolCallId identifies the LLM's tool call, while the approvalId identifies the approval request/response flow. They have different lifecycles and could diverge. approvalId is now generated independently via EventID::generate('apr'), producing a Prism-owned ID (e.g. apr_01J5X...) distinct from the LLM-provided toolCallId. This also future-proofs re-approval scenarios where the same toolCallId could have multiple approval requests with different approvalIds.
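
To make the distinction concrete, here is a small sketch of matching an approval response back to its tool call through a separately generated ID. `generateApprovalId()` is a hypothetical helper; the format produced by EventID::generate('apr') may well differ.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical ID helper; Prism's EventID::generate() may produce a different format.
 */
function generateApprovalId(): string
{
    return 'apr_' . bin2hex(random_bytes(8));
}

// Approval requests reference the LLM's tool call ID but carry their own ID.
$toolCallIds = ['call_123', 'call_456'];
$approvalRequests = [];

foreach ($toolCallIds as $toolCallId) {
    $approvalRequests[] = [
        'approvalId' => generateApprovalId(),
        'toolCallId' => $toolCallId,
    ];
}

// On the next request, responses are matched by approvalId and mapped back to tool calls.
$toolCallByApproval = array_column($approvalRequests, 'toolCallId', 'approvalId');

$response = ['approvalId' => $approvalRequests[0]['approvalId'], 'approved' => true];
$resolvedToolCallId = $toolCallByApproval[$response['approvalId']] ?? null;
```

Because the lookup goes through approvalId, the same toolCallId could in principle appear under several approval requests without ambiguity.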

High: Extract approval boilerplate from provider handlers into the trait
Addressed as part of the approvalId separation changes -- callTools() now returns ToolApprovalRequest[] directly, so the array_map boilerplate is gone from all providers.

Medium: Consider a dedicated approval message type instead of overloading ToolResultMessage
Kept unified for now -- both approval responses and tool results relate to the same assistant turn's tool calls and occupy the same conversational slot. Approval responses are transient (they become tool results after resolution), and the resolution complexity comes from the workflow itself rather than the message type. Was thinking of renaming ToolResultMessage to ToolMessage but that will be a breaking change. Definitely open to revisiting if you think otherwise.

Medium: Replace bool &$hasPendingToolCalls with a result object
PHP generators do support return values, so technically feasible. That said, all three methods live in the CallsTools trait and $hasPendingToolCalls is only ever set in one place (filterServerExecutedToolCalls), so the scope stays contained. The refactor would touch 18+ handler files for a modest readability win. Keeping the pass-by-reference approach for now, but worth revisiting if the trait grows more complex. Open to revisiting this if you think otherwise.

Medium: Add edge-case tests
Two of the five were already covered (mixed approved+denied in Phase 2, and dynamic closure returning false). Added the missing three: malformed message history edge cases (no AssistantMessage, ToolResultMessage before AssistantMessage), and concurrent tools mixed with approval-required tools. The structured handler path is covered by provider-level integration tests across Anthropic, OpenAI, Gemini, and OpenRouter.

Low: Remove Vercel SDK references from internal docblocks
Removed!

Low: Document that approval closure args are untrusted LLM output
Fair thought, though this applies to all tool arguments in general, not just the approval closure. Feels like it might be out of scope for this PR, but happy to hear thoughts!

vinitkadam03 marked this pull request as draft March 1, 2026 13:14

vinitkadam03 changed the title ~~Feat/client executed tools and tool approval~~ Feat: 🛠️ Add client-executed tools & human-in-the-loop tool approval flow Mar 1, 2026

vinitkadam03 force-pushed the feat/client-executed-tools-and-tool-approval branch 2 times, most recently from 7483218 to c1840a7 Compare March 1, 2026 13:20

vinitkadam03 mentioned this pull request Mar 1, 2026

Feat: client executed tools #880

Closed

vinitkadam03 marked this pull request as ready for review March 1, 2026 13:23

vinitkadam03 mentioned this pull request Mar 1, 2026

Human-in-the-loop / Tool confirmation #921

Open

vinitkadam03 marked this pull request as draft March 1, 2026 16:19

vinitkadam03 force-pushed the feat/client-executed-tools-and-tool-approval branch 3 times, most recently from 0dc55ee to a04633b Compare March 1, 2026 16:53

vinitkadam03 marked this pull request as ready for review March 1, 2026 17:03

vinitkadam03 added 2 commits March 2, 2026 12:30

feat: client executed tools

4f08a34

feat: tool approval flow

ac64d0a

vinitkadam03 force-pushed the feat/client-executed-tools-and-tool-approval branch from a04633b to ac64d0a Compare March 2, 2026 07:02

Update and fix new openrouter structured handler and add tests

450cf5d

vinitkadam03 added 4 commits March 10, 2026 23:49

fix: Add missing resolveToolApprovals to structured handlers

12d8a17

Generate unique approvalId separate from toolCallId for tool approval requests

461ebcd

refactor: use dedicated flag for client-executed tools instead of null handler check

6b0ef1a

feat: validate tool configuration at request build time

c364664

test: add coverage for malformed message history, concurrent tools with approval, and request payload validation fix

aa3c746

vinitkadam03 force-pushed the feat/client-executed-tools-and-tool-approval branch from ebf2088 to aa3c746 Compare March 10, 2026 20:50

vinitkadam03 wants to merge 8 commits into prism-php:main from vinitkadam03:feat/client-executed-tools-and-tool-approval