Skip to content

Python: Add chat history reducer support to GroupChatOrchestration#13933

Open
lawcontinue wants to merge 3 commits intomicrosoft:mainfrom
lawcontinue:fix/group-chat-history-reducer
Open

Python: Add chat history reducer support to GroupChatOrchestration#13933
lawcontinue wants to merge 3 commits intomicrosoft:mainfrom
lawcontinue:fix/group-chat-history-reducer

Conversation

@lawcontinue
Copy link
Copy Markdown

Description

GroupChatOrchestration manages its own internal ChatHistory and passes messages to agents via AgentThread. This means a ChatHistorySummarizationReducer configured on the agent is never triggered — the messages never flow through the reducer path.

Fixes #12303.

Changes

Add an optional chat_history_reducer parameter to GroupChatOrchestration. When provided, the reducer is called after each agent response to keep the internal history within bounds.

from semantic_kernel.contents.history_reducer.chat_history_summarization_reducer import ChatHistorySummarizationReducer

reducer = ChatHistorySummarizationReducer(
    service=kernel.get_service(),
    target_count=10,
    threshold_count=5,
)

orchestration = GroupChatOrchestration(
    members=[agent1, agent2],
    manager=RoundRobinGroupChatManager(max_rounds=10),
    chat_history_reducer=reducer,
)

What changed in group_chat.py

Change Detail
Import Added ChatHistoryReducer
GroupChatManagerActor.__init__ Accepts optional chat_history_reducer
_handle_response_message Calls _maybe_reduce_chat_history() after adding each message
_maybe_reduce_chat_history New method: syncs messages into the reducer, calls reduce(), syncs back
GroupChatOrchestration.__init__ Accepts optional chat_history_reducer param
_register_manager Passes reducer to GroupChatManagerActor

Fully backward compatible — default is None (no reducer), matching existing behavior.

Why this approach

The _chat_history in GroupChatManagerActor is a plain ChatHistory. Rather than replacing it with a ChatHistoryReducer subclass (which would require the reducer to own the history lifecycle), this PR keeps the internal history as-is and calls the reducer explicitly after each agent response. This avoids changing the message ownership model and works with both summarization and truncation reducers.

Checklist

  • Changes are backward compatible
  • No breaking changes to public API

Clarify that context.terminate = True terminates the auto function calling loop, not just skips a function call.
Allow GroupChatOrchestration to accept an optional ChatHistoryReducer
that automatically summarizes or truncates the internal chat history
after each agent response.

This fixes microsoft#12303 where ChatHistorySummarizationReducer was accepted
by the API but never triggered because GroupChatOrchestration manages
its own ChatHistory internally, bypassing the reducer path.

Changes:
- Add chat_history_reducer param to GroupChatOrchestration.__init__
- Add _maybe_reduce_chat_history to GroupChatManagerActor
- Call reducer after each agent response in _handle_response_message
- Fully backward compatible: default is None (no reducer)
Copilot AI review requested due to automatic review settings April 29, 2026 13:08
@lawcontinue lawcontinue requested a review from a team as a code owner April 29, 2026 13:08
@moonbox3 moonbox3 added python Pull requests for the Python Semantic Kernel documentation labels Apr 29, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 89%

✓ Correctness

This PR adds optional chat history reduction support to the GroupChat orchestration. The implementation correctly syncs messages from the internal ChatHistory to the reducer, calls reduce(), and updates the history when reduction occurs. The import, parameter threading through GroupChat → _GroupChatManagerActor, and the placement of the reduction call (after adding a message but before determining next action) are all correct. The README is documentation-only and accurate. No correctness issues found.

✓ Security Reliability

The changes add an optional ChatHistoryReducer to the group chat orchestration, called after each agent response. The README addition is documentation-only. The code change is generally sound—both existing reducer implementations create new list objects during reduction, so the message-list aliasing pattern works correctly. One reliability concern: the _maybe_reduce_chat_history() call sits outside the ``@ActorBase.exception_handler decorator that protects `_determine_state_and_take_action`, so a reducer exception (e.g., a failed summarization network call when `fail_on_error=True`) would bypass the `exception_callback` and prevent the group chat from progressing.

✓ Test Coverage

The PR adds a chat_history_reducer parameter and _maybe_reduce_chat_history() method to the group chat orchestration, but includes zero tests for this new behavior. The existing test file python/tests/unit/agents/orchestration/test_group_chat.py contains no tests exercising the reducer integration — not for the constructor parameter, not for the reduction logic after agent responses, and not for edge cases like a reducer that returns None. The README addition is documentation-only and fine.

✗ Design Approach

The new history-reduction hook is a sensible feature, but it is wired in a way that breaks the orchestration’s existing per-invocation isolation. GroupChatOrchestration now keeps a single mutable ChatHistoryReducer on the orchestration instance and passes that same object into each manager actor, even though OrchestrationBase.invoke() creates a fresh internal topic per run specifically to isolate concurrent/repeated invocations. Because reducers carry mutable messages state and the manager mutates that state before every reduction, one orchestration instance can now leak summarized/truncated history across runs.

Flagged Issues

  • python/semantic_kernel/agents/orchestration/group_chat.py shares one mutable ChatHistoryReducer instance across all invocations of a GroupChatOrchestration, conflicting with per-run isolation in orchestration_base.py:214-223. Since _maybe_reduce_chat_history() mutates reducer state by assigning self._chat_history_reducer.messages before each reduction, concurrent or repeated invocations will leak summarized/truncated history across runs. The reducer should be deep-copied per manager actor/run.

Suggestions

  • Wrap the _maybe_reduce_chat_history() call (or its body) with exception handling so that reducer failures are reported via exception_callback and do not stall the group chat. ChatHistorySummarizationReducer.reduce() can raise ChatHistoryReducerException (chat_history_summarization_reducer.py:178), and that exception would currently propagate out of _handle_response_message without invoking exception_callback or reaching _determine_state_and_take_action.

Automated review by lawcontinue's agents

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds optional chat history reduction support to GroupChatOrchestration/GroupChatManagerActor so a configured ChatHistoryReducer can be invoked to keep history bounded, and introduces a new README for Python filtering samples.

Changes:

  • Add optional chat_history_reducer parameter to GroupChatOrchestration and propagate it into GroupChatManagerActor.
  • Invoke reducer logic after each agent response via a new _maybe_reduce_chat_history() helper.
  • Add python/samples/concepts/filtering/README.md documenting available filter sample scripts and registration patterns.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
python/semantic_kernel/agents/orchestration/group_chat.py Adds reducer plumb-through + reduction call after responses to bound internal manager chat history.
python/samples/concepts/filtering/README.md New documentation describing Python filter types, sample files, and registration/signature patterns.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 304 to 309
)
)
self._chat_history.add_message(message.body)
await self._maybe_reduce_chat_history()

await self._determine_state_and_take_action(ctx.cancellation_token)
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reducer is currently applied only to GroupChatManagerActor._chat_history, but agent invocations build prompts from each agent actor’s AgentThread (e.g., ChatCompletionAgent reconstructs the request history from thread.get_messages()), not from the manager’s _chat_history. As a result, reducing the manager history won’t reduce what gets sent to the model, so this likely won’t address #12303. To make the reducer effective for LLM calls, it needs to be applied to the per-agent thread/chat history path (e.g., construct ChatHistoryAgentThread with a ChatHistoryReducer, or call a thread/history reduction step for each agent actor before invoking the model).

Copilot uses AI. Check for mistakes.
Comment on lines +311 to +315
async def _maybe_reduce_chat_history(self) -> None:
"""Reduce the chat history if a reducer is configured and the threshold is exceeded."""
if self._chat_history_reducer is not None:
# Sync messages from the internal history to the reducer
self._chat_history_reducer.messages = self._chat_history.messages
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior is introduced by calling _maybe_reduce_chat_history() after each agent response, but there are existing unit tests for GroupChatOrchestration/RoundRobinGroupChatManager and none appear to cover reducer integration. Please add tests that verify (1) the reducer is invoked when provided and (2) the manager/agent history used for subsequent turns is actually reduced (and that seeded system/developer messages on the reducer, if supported, are preserved).

Suggested change
async def _maybe_reduce_chat_history(self) -> None:
"""Reduce the chat history if a reducer is configured and the threshold is exceeded."""
if self._chat_history_reducer is not None:
# Sync messages from the internal history to the reducer
self._chat_history_reducer.messages = self._chat_history.messages
def _get_seeded_reducer_messages(self) -> list[ChatMessageContent]:
"""Return reducer-seeded system/developer messages that should be preserved during sync."""
if self._chat_history_reducer is None:
return []
seeded_messages: list[ChatMessageContent] = []
for message in self._chat_history_reducer.messages:
if message.role in (AuthorRole.SYSTEM, AuthorRole.DEVELOPER):
seeded_messages.append(message)
continue
break
return seeded_messages
async def _maybe_reduce_chat_history(self) -> None:
"""Reduce the chat history if a reducer is configured and the threshold is exceeded."""
if self._chat_history_reducer is not None:
history_messages = list(self._chat_history.messages)
reducer_messages = list(self._get_seeded_reducer_messages())
preserved_seeded_messages: list[ChatMessageContent] = []
for index, seeded_message in enumerate(reducer_messages):
if index < len(history_messages) and history_messages[index] == seeded_message:
continue
preserved_seeded_messages.append(seeded_message)
# Sync messages from the internal history to the reducer while preserving
# any seeded system/developer messages the reducer was initialized with.
self._chat_history_reducer.messages = [*preserved_seeded_messages, *history_messages]

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +14
| Filter | Decorator | Purpose |
|--------|-----------|---------|
| **Prompt Rendering** | `@kernel.filter(FilterTypes.PROMPT_RENDERING)` | Intercept before/after prompt is rendered |
| **Function Invocation** | `@kernel.filter(FilterTypes.FUNCTION_INVOCATION)` | Intercept before/after any function call |
| **Auto Function Invoke** | `@kernel.filter(FilterTypes.AUTO_FUNCTION_INVOCATION)` | Control automatic tool calls |

Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The markdown table in “Filter Types” uses double pipes (||) at the start/end of rows, which renders incorrectly in many Markdown parsers (extra empty columns). Use standard single-pipe table syntax (e.g., | Filter | Decorator | Purpose |) for consistent rendering on GitHub.

Copilot uses AI. Check for mistakes.
Comment on lines +313 to +316
if self._chat_history_reducer is not None:
# Sync messages from the internal history to the reducer
self._chat_history_reducer.messages = self._chat_history.messages
result = await self._chat_history_reducer.reduce()
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_maybe_reduce_chat_history overwrites self._chat_history_reducer.messages with the manager’s internal history (self._chat_history.messages). This will discard any messages already present on the reducer instance (e.g., user-configured system/developer prompts added via add_system_message, or pre-seeded summaries), which changes reducer behavior and can make summarization prompts silently disappear. Consider seeding self._chat_history from chat_history_reducer.messages once in __init__ (deep-copy), and then keep both in sync without clobbering reducer state, or merge/preserve reducer-prefixed system/developer messages when syncing.

Copilot uses AI. Check for mistakes.
- Copy reducer pre-seeded messages into _chat_history at init (fixes microsoft#4)
- Use list() copy to avoid mutating reducer state
- Update docstring to clarify scope: manager-level reduce only
- AgentThread-level reduce is a separate enhancement
@lawcontinue
Copy link
Copy Markdown
Author

Response to Copilot review

Thanks for the thorough review. Addressing each point:

1. Reducer scope (manager vs agent thread) — Good catch. This PR reduces the manager's _chat_history (used by select_next_agent, should_terminate, filter_results) but not the per-agent AgentThread that feeds LLM calls. The manager history reduction keeps orchestration decisions focused on recent context. Fully solving #12303 at the LLM-call level requires reducer support in AgentThread, which @TaoChenOSU flagged as a separate planned direction. Updated the method docstring to make this scope explicit.

2. Tests — Will add in a follow-up commit. Agree the reducer invocation should be covered.

3. README markdown — Not part of this PR, skipping.

4. Reducer message overwrite — Fixed in the latest push. Now copies reducer's pre-seeded messages into _chat_history at init and uses list() to avoid mutating the reducer's own state during reduce calls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation python Pull requests for the Python Semantic Kernel

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: Bug: Complete chat history is getting passed to LLM - chat history summarizer is not working

3 participants