81 changes: 81 additions & 0 deletions python/samples/concepts/filtering/README.md
@@ -0,0 +1,81 @@
# Filtering Samples

This directory contains samples demonstrating the Python filter system in Semantic Kernel.

Filters allow you to intercept and modify pipeline execution at specific points. The Python SDK supports three filter types.

## Filter Types

| Filter | Decorator | Purpose |
|--------|-----------|---------|
| **Prompt Rendering** | `@kernel.filter(FilterTypes.PROMPT_RENDERING)` | Intercept before/after prompt is rendered |
| **Function Invocation** | `@kernel.filter(FilterTypes.FUNCTION_INVOCATION)` | Intercept before/after any function call |
| **Auto Function Invoke** | `@kernel.filter(FilterTypes.AUTO_FUNCTION_INVOCATION)` | Control automatic tool calls |

> **Copilot AI (Apr 29, 2026), on lines +9 to +14:** The markdown table in "Filter Types" uses double pipes (`||`) at the start/end of rows, which renders incorrectly in many Markdown parsers (extra empty columns). Use standard single-pipe table syntax (e.g., `| Filter | Decorator | Purpose |`) for consistent rendering on GitHub.

## Samples

### [prompt_filters.py](./prompt_filters.py)

Basic prompt rendering filter. Demonstrates how to inspect and modify prompts before they're sent to the model.
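The inspect-then-modify pattern this sample uses can be sketched without the Semantic Kernel SDK. `PromptContext`, `render`, and `prompt_filter` below are illustrative stand-ins for SK's prompt-render types, not real SK APIs:

```python
import asyncio


class PromptContext:
    """Illustrative stand-in for SK's prompt render context."""

    def __init__(self) -> None:
        self.rendered_prompt: str | None = None


async def render(context: PromptContext) -> None:
    """Stand-in for the template rendering step the filter wraps."""
    context.rendered_prompt = "Tell me a joke."


async def prompt_filter(context: PromptContext, next) -> None:
    await next(context)  # let the prompt render first
    # Inspect and amend the rendered prompt before it reaches the model.
    context.rendered_prompt += " Keep it to one sentence."


async def main() -> str:
    context = PromptContext()
    await prompt_filter(context, render)
    return context.rendered_prompt


print(asyncio.run(main()))  # Tell me a joke. Keep it to one sentence.
```

The filter runs the render step via `next`, then edits `rendered_prompt` in place; the real sample does the same against SK's context object.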

### [function_invocation_filters.py](./function_invocation_filters.py)

Function invocation filter with logging and exception handling. Shows both `kernel.add_filter()` and `@kernel.filter()` decorator registration.
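The logging-plus-exception-handling shape of that sample can be sketched in plain Python. `InvocationContext`, `exception_handling_filter`, and `flaky_function` are illustrative names, not SK APIs:

```python
import asyncio
import logging
from typing import Awaitable, Callable

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class InvocationContext:
    """Illustrative stand-in for SK's FunctionInvocationContext."""

    def __init__(self, function_name: str) -> None:
        self.function_name = function_name
        self.result: str | None = None
        self.exception: Exception | None = None


async def exception_handling_filter(
    context: InvocationContext, next: Callable[[InvocationContext], Awaitable[None]]
) -> None:
    """Log the call and turn a downstream failure into a recorded result."""
    logger.info("Invoking %s", context.function_name)
    try:
        await next(context)
    except Exception as exc:
        context.exception = exc
        context.result = f"Function {context.function_name} failed: {exc}"


async def flaky_function(context: InvocationContext) -> None:
    raise RuntimeError("boom")


async def main() -> str:
    context = InvocationContext("flaky_function")
    await exception_handling_filter(context, flaky_function)
    return context.result


print(asyncio.run(main()))  # Function flaky_function failed: boom
```

Because the filter wraps `await next(context)` in `try`/`except`, the exception never propagates to the caller; it is converted into a result the pipeline can keep processing.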

### [function_invocation_filters_stream.py](./function_invocation_filters_stream.py)

Same as above but for **streaming** responses. Use this when working with streaming chat completions.
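The key difference in the streaming case is that `context.result` is a stream, so the filter wraps it in a new async generator instead of reading a finished value. A minimal sketch, with `StreamContext`, `model_stream`, and `streaming_filter` as illustrative stand-ins rather than SK types:

```python
import asyncio
from typing import AsyncIterator


class StreamContext:
    """Illustrative stand-in for a streaming invocation context."""

    def __init__(self) -> None:
        self.result: AsyncIterator[str] | None = None


async def model_stream() -> AsyncIterator[str]:
    """Stand-in for a streaming chat completion."""
    for chunk in ["Hel", "lo, ", "world"]:
        yield chunk


async def produce(context: StreamContext) -> None:
    context.result = model_stream()


async def streaming_filter(context: StreamContext, next) -> None:
    await next(context)  # downstream sets context.result to a stream
    inner = context.result

    async def wrapped() -> AsyncIterator[str]:
        async for chunk in inner:
            # Each chunk can be inspected or transformed as it flows through.
            yield chunk.upper()

    context.result = wrapped()


async def main() -> str:
    context = StreamContext()
    await streaming_filter(context, produce)
    return "".join([chunk async for chunk in context.result])


print(asyncio.run(main()))  # HELLO, WORLD
```

Nothing is buffered: chunks still reach the consumer one at a time, which is why streaming filters need this wrapping style rather than the before/after pattern above.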

### [auto_function_invoke_filters.py](./auto_function_invoke_filters.py)

Controls which automatic tool calls are allowed. Demonstrates `context.terminate = True` to terminate the auto function calling loop, and `FunctionResultContent` handling.

### [retry_with_filters.py](./retry_with_filters.py)

Implements automatic retry logic using function invocation filters. Retries on failure with a different model.

### [retry_with_different_model.py](./retry_with_different_model.py)

Similar retry pattern but specifically switches to a fallback model on failure.
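The retry-with-fallback idea behind both samples reduces to catching the downstream failure inside the filter and invoking `next` again with different settings. A toy sketch; `RetryContext`, `call_model`, and the model names are hypothetical, not SK APIs:

```python
import asyncio


class RetryContext:
    """Illustrative stand-in for a function invocation context."""

    def __init__(self) -> None:
        self.model = "primary-model"  # hypothetical model names
        self.result: str | None = None


async def call_model(context: RetryContext) -> None:
    """Stand-in for the downstream call; the primary model always fails here."""
    if context.model == "primary-model":
        raise TimeoutError("primary model timed out")
    context.result = f"answer from {context.model}"


async def retry_filter(context: RetryContext, next) -> None:
    """Retry once with a fallback model when the first attempt raises."""
    try:
        await next(context)
    except Exception:
        context.model = "fallback-model"
        await next(context)


async def main() -> str:
    context = RetryContext()
    await retry_filter(context, call_model)
    return context.result


print(asyncio.run(main()))  # answer from fallback-model
```

Because the filter owns the call to `next`, it can re-run the whole downstream chain as many times as its policy allows.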

## Registration

There are two ways to register a filter:

```python
# Method 1: Decorator
@kernel.filter(filter_type=FilterTypes.FUNCTION_INVOCATION)
async def my_filter(context, next):
    await next(context)

# Method 2: Add function
kernel.add_filter("function_invocation", my_filter)
```

Both are equivalent. Use whichever fits your code style.

## Filter Signature

All filters follow the same signature:

```python
async def filter_name(context: <ContextType>, next: Callable) -> None:
    # Code before the next filter/function runs
    await next(context)
    # Code after the next filter/function runs
```

The `next` callable passes control to the next filter in the chain, then to the actual function. You can skip execution by not calling `await next(context)`.
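That chaining behavior, including the short-circuit when `next` is skipped, can be demonstrated without SK. `build_pipeline` below is a toy composition helper under the assumption that filters share the `(context, next)` signature described above:

```python
import asyncio


class ChainContext:
    def __init__(self, skip: bool = False) -> None:
        self.skip = skip
        self.trace: list[str] = []


async def outer(context: ChainContext, next) -> None:
    context.trace.append("outer:before")
    await next(context)
    context.trace.append("outer:after")


async def guard(context: ChainContext, next) -> None:
    if context.skip:
        context.trace.append("guard:skipped")
        return  # not calling next() short-circuits the rest of the chain
    await next(context)


async def function(context: ChainContext) -> None:
    context.trace.append("function")


def build_pipeline(filters, terminal):
    """Fold the filter list into one callable; the first filter is outermost."""
    handler = terminal
    for f in reversed(filters):
        def make(flt, nxt):
            async def call(context):
                await flt(context, nxt)
            return call
        handler = make(f, handler)
    return handler


async def main():
    allowed = ChainContext(skip=False)
    await build_pipeline([outer, guard], function)(allowed)
    blocked = ChainContext(skip=True)
    await build_pipeline([outer, guard], function)(blocked)
    return allowed.trace, blocked.trace


print(asyncio.run(main()))
```

Note that `outer` still runs its "after" code even when `guard` skips the function: returning without calling `next` only cuts off the filters and function *inside* it, not the ones around it.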

## Terminating Auto Function Calls

To prevent an auto-invoked function from executing:

```python
@kernel.filter(FilterTypes.AUTO_FUNCTION_INVOCATION)
async def selective_filter(context: AutoFunctionInvocationContext, next):
    if context.function.name == "dangerous_function":
        context.terminate = True  # Terminate the auto function calling loop
        return
    await next(context)
```
31 changes: 30 additions & 1 deletion python/semantic_kernel/agents/orchestration/group_chat.py
@@ -18,6 +18,7 @@
from semantic_kernel.agents.runtime.core.topic import TopicId
from semantic_kernel.agents.runtime.in_process.type_subscription import TypeSubscription
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.history_reducer.chat_history_reducer import ChatHistoryReducer
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.streaming_chat_message_content import StreamingChatMessageContent
from semantic_kernel.contents.utils.author_role import AuthorRole
@@ -259,6 +260,7 @@ def __init__(
participant_descriptions: dict[str, str],
exception_callback: Callable[[BaseException], None],
result_callback: Callable[[DefaultTypeAlias], Awaitable[None]] | None = None,
chat_history_reducer: ChatHistoryReducer | None = None,
):
"""Initialize the group chat manager actor.

@@ -271,7 +273,10 @@
"""
self._manager = manager
self._internal_topic_type = internal_topic_type
self._chat_history = ChatHistory()
self._chat_history: ChatHistory = ChatHistory()
self._chat_history_reducer = chat_history_reducer
if chat_history_reducer is not None and chat_history_reducer.messages:
self._chat_history.messages = list(chat_history_reducer.messages)
self._participant_descriptions = participant_descriptions
self._result_callback = result_callback

@@ -301,9 +306,27 @@ async def _handle_response_message(self, message: GroupChatResponseMessage, ctx:
)
)
self._chat_history.add_message(message.body)
await self._maybe_reduce_chat_history()

await self._determine_state_and_take_action(ctx.cancellation_token)
> **Copilot AI (Apr 29, 2026), on lines 306 to 311:** The reducer is currently applied only to GroupChatManagerActor._chat_history, but agent invocations build prompts from each agent actor's AgentThread (e.g., ChatCompletionAgent reconstructs the request history from thread.get_messages()), not from the manager's _chat_history. As a result, reducing the manager history won't reduce what gets sent to the model, so this likely won't address #12303. To make the reducer effective for LLM calls, it needs to be applied to the per-agent thread/chat history path (e.g., construct ChatHistoryAgentThread with a ChatHistoryReducer, or call a thread/history reduction step for each agent actor before invoking the model).

async def _maybe_reduce_chat_history(self) -> None:
"""Reduce the chat history if a reducer is configured and the threshold is exceeded.

The reducer operates on the manager's internal chat history, which is used
for agent selection and termination decisions. Note that this does not reduce
the history passed to individual agent LLM calls — that would require reducer
support at the AgentThread level, which is a separate enhancement.
"""
if self._chat_history_reducer is not None:
self._chat_history_reducer.messages = list(self._chat_history.messages)
result = await self._chat_history_reducer.reduce()
> **Copilot AI (Apr 29, 2026), on lines +321 to +323:** _maybe_reduce_chat_history overwrites self._chat_history_reducer.messages with the manager's internal history (self._chat_history.messages). This will discard any messages already present on the reducer instance (e.g., user-configured system/developer prompts added via add_system_message, or pre-seeded summaries), which changes reducer behavior and can make summarization prompts silently disappear. Consider seeding self._chat_history from chat_history_reducer.messages once in __init__ (deep-copy), and then keep both in sync without clobbering reducer state, or merge/preserve reducer-prefixed system/developer messages when syncing.
if result is not None:
self._chat_history.messages = result.messages
logger.debug(
f"Chat history reduced to {len(self._chat_history.messages)} messages."
)

@ActorBase.exception_handler
async def _determine_state_and_take_action(self, cancellation_token: CancellationToken) -> None:
"""Determine the state of the group chat and take action accordingly."""
@@ -377,6 +400,7 @@ def __init__(
agent_response_callback: Callable[[DefaultTypeAlias], Awaitable[None] | None] | None = None,
streaming_agent_response_callback: Callable[[StreamingChatMessageContent, bool], Awaitable[None] | None]
| None = None,
chat_history_reducer: ChatHistoryReducer | None = None,
) -> None:
"""Initialize the group chat orchestration.

@@ -392,8 +416,12 @@
by the agents.
streaming_agent_response_callback (Callable | None): A function that is called when a streaming response
is produced by the agents.
chat_history_reducer (ChatHistoryReducer | None): An optional reducer to summarize or truncate
the chat history during the group chat. When provided, the reducer is called after each
agent response to keep the history within bounds.
"""
self._manager = manager
self._chat_history_reducer = chat_history_reducer

for member in members:
if member.description is None:
@@ -496,6 +524,7 @@ async def _register_manager(
participant_descriptions={agent.name: agent.description for agent in self._members}, # type: ignore[misc]
exception_callback=exception_callback,
result_callback=result_callback,
chat_history_reducer=self._chat_history_reducer,
),
)
