Description
Summary
When building interactive AI agent applications (chat UIs, coding assistants), users often need to send follow-up messages while the agent is executing tool calls. For example, a user asks a
question, the agent starts calling tools to research the answer, and the user realizes they want to add context: "focus on the compositor code" — while tools are still running.
Currently, ChatMiddleware only runs once per outer get_response() call. The tool loop inside FunctionInvocationLayer.get_response() calls super().get_response(), which bypasses the
ChatMiddlewareLayer, so ChatMiddleware.process() never fires again for subsequent LLM calls within the same tool loop.
FunctionMiddleware runs for every tool invocation, but FunctionInvocationContext has no access to prepped_messages — the accumulated message list that gets sent to the LLM on the next iteration.
This means there is no middleware-level mechanism to inject user messages into the conversation during the tool loop.
The Problem
```
agent.run("Analyze this codebase")
│
├─ ChatMiddlewareLayer.get_response()
│  ├─ ChatMiddleware pipeline runs (ONCE)  ← injection works here
│  └─ FunctionInvocationLayer.get_response()
│     ├─ LLM call #1 → returns tool calls
│     ├─ FunctionMiddleware runs → tool executes  ← no message access
│     ├─ LLM call #2 (super().get_response) → bypasses ChatMiddleware!
│     ├─ FunctionMiddleware runs → tool executes  ← no message access
│     ├─ LLM call #3 → bypasses ChatMiddleware!
│     └─ Final response
```
A user message queued after LLM call #1 will never be delivered — ChatMiddleware won't run again, and FunctionMiddleware can't inject into the message list.
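The gap can be reproduced with a standalone sketch (toy functions, not the framework's real classes): the outer chat middleware fires once, the inner loop issues several LLM calls against the same accumulating list, and text queued after the first call never reaches any later call because no hook can touch that list.

```python
# Toy simulation of the layering described above (assumed structure,
# not the framework's actual classes).

queued_after_first_call = []  # user text queued while tools run


def chat_middleware(messages):
    # Fires once per outer get_response() call.
    messages.append("injected-at-start")


def tool_loop(messages, n_llm_calls=3):
    snapshots = []
    for i in range(n_llm_calls):
        snapshots.append(list(messages))  # what the LLM sees on call i
        messages.append(f"tool-result-{i}")  # tool output accumulates
        if i == 0:
            queued_after_first_call.append("focus on the compositor code")
        # NOTE: the queued text is never merged in -- there is no hook here.
    return snapshots


messages = ["Analyze this codebase"]
chat_middleware(messages)  # runs once
seen = tool_loop(messages)

# The queued user message never appears in any LLM call's input:
assert all("focus on the compositor code" not in s for s in seen)
```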
Proposed Solution
Expose prepped_messages (the mutable message list that accumulates through the tool loop) to FunctionMiddleware via the existing kwargs mechanism in FunctionInvocationContext. This requires
minimal changes:
In _tools.py, pass prepped_messages into the execution context:

```python
# In FunctionInvocationLayer.get_response(), when creating execute_function_calls:
execute_function_calls = partial(
    _execute_function_calls,
    custom_args=additional_function_arguments,
    config=self.function_invocation_configuration,
    middleware_pipeline=function_middleware_pipeline,
    messages=prepped_messages,  # ← NEW: pass reference
)
```

In _execute_function_calls, thread it into context kwargs:
```python
# When creating FunctionInvocationContext:
context = FunctionInvocationContext(
    function=tool,
    arguments=validated_args,
    kwargs={**runtime_kwargs, "_messages": messages},  # ← NEW
)
```

This is fully backward-compatible:
- No class definition changes (FunctionInvocationContext already has kwargs: dict)
- No interface changes
- Existing FunctionMiddleware implementations are unaffected (they don't read _messages)
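The compatibility claim can be checked with a trivial stand-in (a simplified dataclass, not the real FunctionInvocationContext): a middleware that only reads keys it knows about behaves identically whether or not the new key is present.

```python
# Sketch of the backward-compatibility argument using a simplified
# stand-in for FunctionInvocationContext (assumed shape).
from dataclasses import dataclass, field


@dataclass
class FunctionInvocationContext:
    function: str
    arguments: dict
    kwargs: dict = field(default_factory=dict)


def legacy_middleware(context):
    # Only looks at keys it already knows; extra keys are invisible to it.
    return context.kwargs.get("trace_id", "no-trace")


old_ctx = FunctionInvocationContext("tool", {}, {"trace_id": "abc"})
new_ctx = FunctionInvocationContext("tool", {}, {"trace_id": "abc", "_messages": []})

# Threading "_messages" through kwargs changes nothing for existing code:
assert legacy_middleware(old_ctx) == legacy_middleware(new_ctx)
```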
Usage Example
With this change, a FunctionMiddleware can inject user messages during the tool loop:
```python
from agent_framework import FunctionMiddleware, FunctionInvocationContext, Message


class UserInjectionMiddleware(FunctionMiddleware):
    """Injects queued user messages into the conversation during tool execution."""

    def __init__(self):
        self._pending: list[str] = []

    def queue(self, text: str) -> None:
        self._pending.append(text)

    async def process(
        self,
        context: FunctionInvocationContext,
        call_next,
    ) -> None:
        await call_next()  # Let the tool execute normally
        # After tool execution, inject any pending user messages
        messages = context.kwargs.get("_messages")
        if messages is not None and self._pending:
            for text in self._pending:
                messages.append(Message("user", [text]))
            self._pending.clear()
```

Now user messages injected during tool execution appear in prepped_messages before the next LLM call, and the model sees them naturally as part of the conversation.
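The mechanism can be exercised end-to-end without the framework, using toy stand-ins (FakeContext and tuple messages are assumptions for illustration): because the middleware mutates the same list object the tool loop re-reads, the queued message is in place before the next iteration.

```python
import asyncio

# Toy stand-ins for the framework types (assumptions, not the real API),
# demonstrating that mutating the shared list via kwargs is sufficient.


class FakeContext:
    def __init__(self, messages):
        self.kwargs = {"_messages": messages}


class UserInjectionMiddleware:
    def __init__(self):
        self._pending: list[str] = []

    def queue(self, text: str) -> None:
        self._pending.append(text)

    async def process(self, context, call_next) -> None:
        await call_next()  # tool runs first
        messages = context.kwargs.get("_messages")
        if messages is not None and self._pending:
            messages.extend(("user", t) for t in self._pending)
            self._pending.clear()


prepped_messages = [("user", "Analyze this codebase")]


async def fake_tool():
    prepped_messages.append(("tool", "result-1"))


async def main():
    mw = UserInjectionMiddleware()
    mw.queue("focus on the compositor code")  # queued mid-run
    await mw.process(FakeContext(prepped_messages), fake_tool)


asyncio.run(main())
# The queued message now sits in the list before the next LLM call:
assert prepped_messages[-1] == ("user", "focus on the compositor code")
```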
Alternatives Considered
- Hack context.result: Append user text to tool results. Works but conflates user messages with tool output, confusing the model.
- Custom tool loop: Re-implement FunctionInvocationLayer's loop in application code. Too invasive and fragile across framework updates.
- Make tool loop call through ChatMiddlewareLayer: Would fix injection but significantly changes the execution model and may have performance implications.
Environment
- agent-framework-core version: 1.0.0rc4
- Python: 3.13
Language/SDK
python