Text accumulation issue for output_key #5590

@vuth-seb

Description

🔴 Required Information

Describe the Bug:

When using LlmAgent with output_key parameter and StreamingMode.SSE, text streamed before tool calls is lost. Only text that streams after the final tool execution is saved to the output_key state. This results in 60-70% of agent responses being discarded in production scenarios where agents make tool calls.

Steps to Reproduce:

  1. Install the package: pip install google-adk==1.32.0
  2. Run the reproduction code below
  3. Observe that only text after the final tool call is saved

Minimal Reproduction Code:

#!/usr/bin/env python3
"""
Minimal reproduction of ADK text accumulation bug.
When an agent streams text before tool calls, that text is lost from output_key.
"""

import asyncio

from dotenv import load_dotenv
load_dotenv()

from google.adk.agents import LlmAgent
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.adk.runners import InMemoryRunner
from google.adk.tools.agent_tool import AgentTool
from google.genai import types

MODEL = "gemini-2.5-pro"


def create_chart_generator_agent() -> LlmAgent:
    """Create a simple agent that generates chart specifications."""
    return LlmAgent(
        model=MODEL,
        name="chart_generator_agent",
        instruction=(
            "Generate a simple Vega-Lite chart specification. "
            "Create a JSON with 5 random data points for a bar chart."
        ),
        output_key="chart_result",
        generate_content_config=types.GenerateContentConfig(
            response_mime_type="application/json"
        ),
    )


async def test_text_accumulation():
    """Test that text before tool calls is preserved in output_key."""

    # Create chart generator and wrap as tool
    chart_generator_agent = create_chart_generator_agent()
    chart_tool = AgentTool(agent=chart_generator_agent)

    # Create main agent with output_key (triggers the bug)
    agent = LlmAgent(
        model=MODEL,
        name="chart_test_agent",
        instruction=(
            "You will create 2 chart specifications. Follow this EXACT flow:\n"
            "1. Write 2 sentences explaining what you'll do (introduction)\n"
            "2. Call chart_generator_agent tool\n"
            "3. Write 1 sentence about progress\n"
            "4. Call chart_generator_agent tool again\n"
            "5. Write 2 sentences summarizing (conclusion)\n"
            "Keep each text section distinct and clear."
        ),
        tools=[chart_tool],
        output_key="final_output",  # BUG: Only saves text after final tool call
    )

    # Setup runner with streaming (InMemoryRunner provides its own session service)
    runner = InMemoryRunner(agent=agent, app_name="test_app")
    session_service = runner.session_service

    session = await session_service.create_session(
        app_name="test_app",
        user_id="test_user"
    )

    # Track all text parts as they stream
    accumulated_text_parts = []
    tool_calls = []

    user_message = types.Content(
        role="user",
        parts=[types.Part(text="Create 2 chart specifications with random data")],
    )

    print("Testing Text Accumulation Across Tool Calls")
    print("=" * 80)
    print()

    # Run with streaming enabled
    async for event in runner.run_async(
        user_id="test_user",
        session_id=session.id,
        new_message=user_message,
        run_config=RunConfig(streaming_mode=StreamingMode.SSE),
    ):
        content = getattr(event, "content", None)
        for part in (content.parts if content and content.parts else []):
            # Skip thoughts
            if getattr(part, "thought", False):
                continue

            # Collect text parts
            if getattr(part, "text", None):
                accumulated_text_parts.append(part.text)
                print(f"  Text: {part.text[:80]}...")

            # Track tool calls
            if getattr(part, "function_call", None):
                tool_calls.append(part.function_call.name)
                print(f"  Tool: {part.function_call.name}")

    print()
    print("=" * 80)

    # Get the final state
    final_session = await session_service.get_session(
        app_name="test_app", user_id="test_user", session_id=session.id
    )
    final_output = final_session.state.get("final_output", "")

    # Combine all text parts (what we expect)
    expected_combined = "".join(accumulated_text_parts)

    # Show results
    print(f"Expected: {len(expected_combined)} chars ({len(accumulated_text_parts)} text parts)")
    print(f"Actual:   {len(final_output)} chars (from output_key)")
    
    if expected_combined:
        match_ratio = len(final_output) / len(expected_combined)
        print(f"Match:    {match_ratio:.1%}")
        print()
        
        if match_ratio < 0.3:
            print("❌ FAIL: Text before tool calls was lost")
            return False
        elif match_ratio >= 0.8:
            print("✅ PASS: Text accumulation working")
            return True
        else:
            print(f"⚠️  PARTIAL: Lost ~{100 - match_ratio * 100:.0f}%")
            return False
    else:
        print("❌ FAIL: No text accumulated")
        return False


async def main():
    """Run the test."""
    print("ADK Text Accumulation Bug Reproduction")
    print("Package: google-adk==1.32.0")
    print(f"Model:   {MODEL}")
    print()

    await test_text_accumulation()


if __name__ == "__main__":
    asyncio.run(main())

Expected: All text parts (intro + progress + conclusion) saved to output_key.
Actual: Only conclusion text (after final tool call) is saved.

Expected Behavior:

All streamed text parts (both before and after tool calls) should be accumulated and saved to the output_key state parameter. The final output should contain:

  • Introduction text (before tool calls)
  • Progress updates (between tool calls)
  • Conclusion text (after tool calls)

Observed Behavior:

Only text streamed after the final tool call is saved to output_key. All text before tool executions is discarded.

Example from test output:

Expected: 662 chars (6 text parts)
Actual:   231 chars (from output_key)
Match:    34.9%

⚠️  PARTIAL: Lost ~65%

The agent streams:

  1. Intro text: "I will create a chart..." → LOST
  2. Tool call: chart_generator_agent
  3. Progress text: "I will create a chart..." → LOST
  4. Tool call: chart_generator_agent
  5. Conclusion: "The chart has been generated..." → SAVED

Only part 5 appears in the final output.
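Until output_key accumulates partials, the streamed text can be collected on the client side instead. A minimal sketch: accumulate_text is a hypothetical helper that mirrors the part-handling loop from the reproduction above, shown here against simple stub events rather than real ADK objects:

```python
from types import SimpleNamespace


def accumulate_text(events) -> str:
    """Join every non-thought text part across a stream of events."""
    chunks = []
    for event in events:
        content = getattr(event, "content", None)
        for part in getattr(content, "parts", None) or []:
            if getattr(part, "thought", False):
                continue  # skip model "thought" parts
            text = getattr(part, "text", None)
            if text:
                chunks.append(text)
    return "".join(chunks)


# Stub events mimicking the intro -> tool call -> conclusion flow:
events = [
    SimpleNamespace(content=SimpleNamespace(
        parts=[SimpleNamespace(thought=False, text="Intro. ")])),
    SimpleNamespace(content=SimpleNamespace(
        parts=[SimpleNamespace(thought=False, text=None)])),  # tool-call part, no text
    SimpleNamespace(content=SimpleNamespace(
        parts=[SimpleNamespace(thought=False, text="Conclusion.")])),
]
print(accumulate_text(events))  # -> Intro. Conclusion.
```

Calling such a helper on the events yielded by runner.run_async recovers the intro and progress text that output_key drops, at the cost of duplicating state management outside the session.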

Environment Details:

  • ADK Library Version: 1.32.0 (pip show google-adk)
  • Desktop OS: macOS (also reproduced on Linux)
  • Python Version: Python 3.13.7 (python -V)

Model Information:

  • Are you using LiteLLM: No
  • Which model is being used: gemini-2.5-pro (also affects flash)

🟡 Optional Information

Regression:

This is not a regression: the behavior reproduces on every version tested and appears to be a long-standing issue in the __maybe_save_output_to_state() method of LlmAgent, which only processes events for which is_final_response() returns True.

Logs:

Test output showing the bug:

Testing Text Accumulation Across Tool Calls
================================================================================

  Text: I will create a chart
  Text:  for you with random data. Let me use the `chart_generator_agent` to build it.

  Tool: chart_generator_agent
  Text: I will create a chart for you with random data. Let me use the `chart_generator_...
  Tool: chart_generator_agent
  Text: The chart specification has been generated successfully. It includes random data...
  Text:  for a bar chart.

All chart specifications have been generated successfully. Yo...
  Text: The chart specification has been generated successfully. It includes random data...

================================================================================
Expected: 662 chars (6 text parts)
Actual:   231 chars (from output_key)
Match:    34.9%

⚠️  PARTIAL: Lost ~65%

Minimal Reproduction Code:

The full reproduction code is provided in the "Steps to Reproduce" section above. The test creates an agent that:

  1. Streams introduction text
  2. Calls a tool (chart_generator_agent)
  3. Streams progress text
  4. Calls the tool again
  5. Streams conclusion text

Only step 5 (conclusion) is saved to output_key, losing steps 1 and 3 (60-70% of content).

How often has this issue occurred?:

  • Always (100%) - Reproducible on every run with the provided test case

Additional Context:

This bug severely impacts production usage where agents use tools (AgentTool or FunctionTool). Users lose critical context:

  • Explanations of what the agent is doing
  • Reasoning before tool calls
  • Progress updates during multi-step operations

The root cause appears to be in google/adk/agents/llm_agent.py in the __maybe_save_output_to_state() method, which only saves output when is_final_response() returns True. This happens only after all tool executions complete, so earlier text is never persisted.
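The effect of that gating can be illustrated with a simplified sketch. This is hypothetical paraphrase, not the actual ADK source (the real method is private to LlmAgent and operates on Event objects); it only demonstrates why partial events streamed before tool calls never reach the state:

```python
class FakeEvent:
    """Minimal stand-in for an ADK event (hypothetical shape)."""

    def __init__(self, texts, final):
        self.texts = texts
        self._final = final

    def is_final_response(self) -> bool:
        return self._final


def maybe_save_output_to_state(event, state: dict, output_key: str) -> None:
    # Suspected behavior: only the final response event is persisted,
    # so any text streamed before the last tool call is dropped.
    if event.is_final_response():
        state[output_key] = "".join(event.texts)


state = {}
maybe_save_output_to_state(
    FakeEvent(["Intro before tool calls. "], final=False), state, "final_output")
maybe_save_output_to_state(
    FakeEvent(["Conclusion after tools."], final=True), state, "final_output")
print(state["final_output"])  # -> Conclusion after tools.
```

The intro event never satisfies the is_final_response() check, so only the conclusion survives, matching the ~35% retention seen in the logs above.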

Impact:

  • Severe: 60-70% of agent responses lost in streaming scenarios with tools
  • Production agents appear broken to users (missing explanations)
  • Affects all LlmAgent instances using output_key + streaming + tools

Labels: core [Component: core interface and implementation], request clarification [Status: maintainer needs clarification or more information from the author]