Python: Add chat history reducer support to GroupChatOrchestration #13933
@@ -0,0 +1,81 @@
# Filtering Samples
This directory contains samples demonstrating the Python filter system in Semantic Kernel.

Filters allow you to intercept and modify pipeline execution at specific points. The Python SDK supports three filter types.
## Filter Types

| Filter | Decorator | Purpose |
|--------|-----------|---------|
| **Prompt Rendering** | `@kernel.filter(FilterTypes.PROMPT_RENDERING)` | Intercept before/after prompt is rendered |
| **Function Invocation** | `@kernel.filter(FilterTypes.FUNCTION_INVOCATION)` | Intercept before/after any function call |
| **Auto Function Invoke** | `@kernel.filter(FilterTypes.AUTO_FUNCTION_INVOCATION)` | Control automatic tool calls |
## Samples

### [prompt_filters.py](./prompt_filters.py)

Basic prompt rendering filter. Demonstrates how to inspect and modify prompts before they're sent to the model.

### [function_invocation_filters.py](./function_invocation_filters.py)

Function invocation filter with logging and exception handling. Shows both `kernel.add_filter()` and `@kernel.filter()` decorator registration.

### [function_invocation_filters_stream.py](./function_invocation_filters_stream.py)

Same as above, but for **streaming** responses. Use this when working with streaming chat completions.

### [auto_function_invoke_filters.py](./auto_function_invoke_filters.py)

Controls which automatic tool calls are allowed. Demonstrates `context.terminate = True` to terminate the auto function calling loop, and `FunctionResultContent` handling.

### [retry_with_filters.py](./retry_with_filters.py)

Implements automatic retry logic using function invocation filters, retrying the call on failure.

### [retry_with_different_model.py](./retry_with_different_model.py)

A similar retry pattern that specifically switches to a fallback model on failure.
## Registration

There are two ways to register a filter:

```python
# Method 1: Decorator registration
@kernel.filter(filter_type=FilterTypes.FUNCTION_INVOCATION)
async def my_filter(context, next):
    await next(context)


# Method 2: Explicit registration by filter type name
kernel.add_filter("function_invocation", my_filter)
```

Both are equivalent. Use whichever fits your code style.
## Filter Signature

All filters follow the same signature:

```python
async def filter_name(context: <ContextType>, next: Callable) -> None:
    # Code that runs before the next filter (or the function itself)
    await next(context)
    # Code that runs after the next filter/function returns
```

The `next` callable passes control to the next filter in the chain, then to the actual function. You can skip execution by not calling `await next(context)`.
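The before/after semantics of the chain can be seen in a minimal standalone sketch. The `Context`, `the_function`, and `logging_filter` names here are illustrative stand-ins, not the real Semantic Kernel types:

```python
import asyncio
from collections.abc import Awaitable
from typing import Callable


class Context:
    """Minimal stand-in for a Semantic Kernel filter context."""

    def __init__(self) -> None:
        self.trace: list[str] = []
        self.result: str | None = None


async def the_function(context: Context) -> None:
    # Stands in for the actual kernel function at the end of the chain.
    context.trace.append("function")
    context.result = "ok"


async def logging_filter(context: Context, next: Callable[[Context], Awaitable[None]]) -> None:
    context.trace.append("before")  # runs before the next filter/function
    await next(context)
    context.trace.append("after")   # runs after it returns


ctx = Context()
asyncio.run(logging_filter(ctx, the_function))
print(ctx.trace)  # ['before', 'function', 'after']
```

If `logging_filter` returned without calling `await next(context)`, the trace would contain only `"before"` and the function would never run.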
## Terminating Auto Function Calls

To prevent an auto-invoked function from executing:

```python
@kernel.filter(FilterTypes.AUTO_FUNCTION_INVOCATION)
async def selective_filter(context: AutoFunctionInvocationContext, next):
    if context.function.name == "dangerous_function":
        context.terminate = True  # Terminate the auto function calling loop
        return
    await next(context)
```
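The effect of `terminate` on the surrounding loop can be sketched with toy types. `AutoContext`, `selective_filter`, and `auto_invoke_loop` below are illustrative stand-ins for the real `AutoFunctionInvocationContext` and the SDK's internal loop, written only to show the short-circuit behavior:

```python
import asyncio


class AutoContext:
    """Toy stand-in for an auto function invocation context."""

    def __init__(self, function_name: str) -> None:
        self.function_name = function_name
        self.terminate = False
        self.result: str | None = None


async def selective_filter(context: AutoContext, next) -> None:
    if context.function_name == "dangerous_function":
        context.terminate = True  # stop the loop; do not invoke
        return
    await next(context)


async def invoke(context: AutoContext) -> None:
    context.result = f"ran {context.function_name}"


async def auto_invoke_loop(calls: list[str]) -> list[str]:
    executed: list[str] = []
    for name in calls:
        ctx = AutoContext(name)
        await selective_filter(ctx, invoke)
        if ctx.result is not None:
            executed.append(ctx.result)
        if ctx.terminate:
            break  # remaining tool calls are skipped
    return executed


executed = asyncio.run(auto_invoke_loop(["safe_a", "dangerous_function", "safe_b"]))
print(executed)  # ['ran safe_a']
```

Note that terminating stops the whole loop, so `safe_b` is never invoked either.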
@@ -18,6 +18,7 @@
 from semantic_kernel.agents.runtime.core.topic import TopicId
 from semantic_kernel.agents.runtime.in_process.type_subscription import TypeSubscription
 from semantic_kernel.contents.chat_history import ChatHistory
+from semantic_kernel.contents.history_reducer.chat_history_reducer import ChatHistoryReducer
 from semantic_kernel.contents.chat_message_content import ChatMessageContent
 from semantic_kernel.contents.streaming_chat_message_content import StreamingChatMessageContent
 from semantic_kernel.contents.utils.author_role import AuthorRole
@@ -259,6 +260,7 @@ def __init__(
         participant_descriptions: dict[str, str],
         exception_callback: Callable[[BaseException], None],
         result_callback: Callable[[DefaultTypeAlias], Awaitable[None]] | None = None,
+        chat_history_reducer: ChatHistoryReducer | None = None,
     ):
         """Initialize the group chat manager actor.
@@ -271,7 +273,10 @@ def __init__(
         """
         self._manager = manager
         self._internal_topic_type = internal_topic_type
-        self._chat_history = ChatHistory()
+        self._chat_history: ChatHistory = ChatHistory()
+        self._chat_history_reducer = chat_history_reducer
+        if chat_history_reducer is not None and chat_history_reducer.messages:
+            self._chat_history.messages = list(chat_history_reducer.messages)
         self._participant_descriptions = participant_descriptions
         self._result_callback = result_callback
@@ -301,9 +306,27 @@ async def _handle_response_message(self, message: GroupChatResponseMessage, ctx:
             )
         )
         self._chat_history.add_message(message.body)
+        await self._maybe_reduce_chat_history()

         await self._determine_state_and_take_action(ctx.cancellation_token)

+    async def _maybe_reduce_chat_history(self) -> None:
+        """Reduce the chat history if a reducer is configured and the threshold is exceeded.
+
+        The reducer operates on the manager's internal chat history, which is used
+        for agent selection and termination decisions. Note that this does not reduce
+        the history passed to individual agent LLM calls — that would require reducer
+        support at the AgentThread level, which is a separate enhancement.
+        """
+        if self._chat_history_reducer is not None:
+            self._chat_history_reducer.messages = list(self._chat_history.messages)
+            result = await self._chat_history_reducer.reduce()
+            if result is not None:
+                self._chat_history.messages = result.messages
+                logger.debug(
+                    f"Chat history reduced to {len(self._chat_history.messages)} messages."
+                )
+
     @ActorBase.exception_handler
     async def _determine_state_and_take_action(self, cancellation_token: CancellationToken) -> None:
         """Determine the state of the group chat and take action accordingly."""
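The reduce contract used here (copy messages into the reducer, call `reduce()`, treat `None` as "no reduction needed") can be exercised with a self-contained toy. `ToyTruncationReducer` and `maybe_reduce` are illustrative only; Semantic Kernel ships real reducers such as `ChatHistoryTruncationReducer`, and messages are `ChatMessageContent` objects rather than strings:

```python
import asyncio


class ToyTruncationReducer:
    """Keeps only the last `target_count` messages; reduce() returns None when no change is needed."""

    def __init__(self, target_count: int) -> None:
        self.target_count = target_count
        self.messages: list[str] = []

    async def reduce(self) -> "ToyTruncationReducer | None":
        if len(self.messages) <= self.target_count:
            return None  # threshold not exceeded
        self.messages = self.messages[-self.target_count:]
        return self


async def maybe_reduce(history: list[str], reducer: ToyTruncationReducer) -> list[str]:
    # Mirrors _maybe_reduce_chat_history: copy in, reduce, copy reduced messages out.
    reducer.messages = list(history)
    result = await reducer.reduce()
    return list(result.messages) if result is not None else history


history = [f"msg{i}" for i in range(5)]
reduced = asyncio.run(maybe_reduce(history, ToyTruncationReducer(target_count=3)))
print(reduced)  # ['msg2', 'msg3', 'msg4']
```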
@@ -377,6 +400,7 @@ def __init__(
         agent_response_callback: Callable[[DefaultTypeAlias], Awaitable[None] | None] | None = None,
         streaming_agent_response_callback: Callable[[StreamingChatMessageContent, bool], Awaitable[None] | None]
         | None = None,
+        chat_history_reducer: ChatHistoryReducer | None = None,
     ) -> None:
         """Initialize the group chat orchestration.
@@ -392,8 +416,12 @@ def __init__(
                 by the agents.
             streaming_agent_response_callback (Callable | None): A function that is called when a streaming response
                 is produced by the agents.
+            chat_history_reducer (ChatHistoryReducer | None): An optional reducer to summarize or truncate
+                the chat history during the group chat. When provided, the reducer is called after each
+                agent response to keep the history within bounds.
         """
         self._manager = manager
+        self._chat_history_reducer = chat_history_reducer

         for member in members:
             if member.description is None:
@@ -496,6 +524,7 @@ async def _register_manager(
                 participant_descriptions={agent.name: agent.description for agent in self._members},  # type: ignore[misc]
                 exception_callback=exception_callback,
                 result_callback=result_callback,
+                chat_history_reducer=self._chat_history_reducer,
             ),
         )
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**Review comment:** The markdown table in “Filter Types” uses double pipes (`||`) at the start/end of rows, which renders incorrectly in many Markdown parsers (extra empty columns). Use standard single-pipe table syntax (e.g., `| Filter | Decorator | Purpose |`) for consistent rendering on GitHub.