Skip to content

[BUG] EventLoopException with KeyError: 'output' when using tools with OpenTelemetry #1381

@IanLiYi1996

Description

@IanLiYi1996

Checks

  • I have updated to the lastest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

strands-agents[otel]>=1.15.0

Python Version

3.12.3

Operating System

Ubuntu 24.04 LTS (Linux 6.14.0-1016-aws)

Installation Method

pip

Steps to Reproduce

1. Create a tool that returns final responses

from strands import tool
from typing import Any

@tool
def create_summary(content: str = "", output: str = "") -> dict[str, Any]:
    """Tool for sub-agents to return final analysis results."""
    result = content or output or "Analysis complete"
    return {"output": result}

2. Configure sub-agent with the tool

tool_agents:
  - id: "analysis_agent"
    name: "Analysis Expert"
    tools:
      - "fetch_data"
      - "create_summary"  # ← Problematic tool
    system_prompt: |
      You must use create_summary tool for all final responses.

3. Implement sub-agent as PythonAgentTool

from strands import Agent
from strands.tools import PythonAgentTool

async def run(tool, streaming_queue, model, **kwargs):
    agent = Agent(model=model, system_prompt=prompt, tools=tools)
    agent_stream = agent.stream_async(query)

    # Extract response from toolResult
    async for event in agent_stream:
        if "toolResult" in content:
            # Process toolResult...
            response = extracted_text

    return {"toolUseId": tool_use_id, "status": "success", "content": [{"text": response}]}

subagent_tool = PythonAgentTool(tool_name="analysis_agent", tool_func=run)

4. Call supervisor with sub-agent

supervisor = Agent(model=model, tools=[subagent_tool])
async for event in supervisor.stream_async("Analyze data XYZ"):
    # Error occurs here
    pass

Expected Behavior

  1. Sub-agent calls create_summary with analysis results
  2. Tool returns {"output": "analysis text"}
  3. Strands sends toolResult to Bedrock via ConverseStream API
  4. Bedrock acknowledges and continues conversation flow
  5. OpenTelemetry records telemetry without errors
  6. Supervisor receives sub-agent's response
  7. Application displays results to user

Actual Behavior

Error Stack Trace:

strands.types.exceptions.EventLoopException: 'output'

Traceback (most recent call last):
  [... intermediate frames ...]
  File "strands/models/bedrock.py", line 686, in _stream
    for chunk in response["stream"]:
  File "opentelemetry/instrumentation/botocore/extensions/bedrock_utils.py", line 72, in __iter__
    self._process_event(event)
  File "opentelemetry/instrumentation/botocore/extensions/bedrock_utils.py", line 137, in _process_event
    self._stream_done_callback(self._response)
  File "opentelemetry/instrumentation/botocore/extensions/bedrock.py", line 642, in stream_done_callback
    self._converse_on_success(...)
  File "opentelemetry/instrumentation/botocore/extensions/bedrock.py", line 503, in _converse_on_success
    choice = _Choice.from_converse(result, capture_content)
  File "opentelemetry/instrumentation/botocore/extensions/bedrock_utils.py", line 493, in from_converse
    orig_message = response["output"]["message"]
                   ~~~~~~~~^^^^^^^^^^
KeyError: 'output'

Additional Context

No response

Possible Solution

The Mechanism

  1. OpenTelemetry auto-instrumentation (opentelemetry-instrument) globally intercepts ALL boto3 Bedrock API calls
  2. Strands must send toolResult back to Bedrock (required by Bedrock Converse API flow)
  3. OpenTelemetry response parser hardcodes access to response["output"]["message"] (line 493)
  4. toolResult acknowledgment responses have different structure than conversational responses
  5. No structure validation before accessing nested keys → KeyError

Why This is OpenTelemetry's Bug

The bedrock_utils.py parser makes incorrect assumptions:

# Current implementation (WRONG)
@staticmethod
def from_converse(response, capture_content):
    orig_message = response["output"]["message"]  # Assumes all responses have this structure
    # No validation, no error handling

It should be:

# Fixed implementation
@staticmethod
def from_converse(response, capture_content):
    # Validate structure before accessing
    if "output" not in response or "message" not in response.get("output", {}):
        # Non-conversational response (toolResult ack, stream chunk, error, etc.)
        return None  # Skip telemetry for these

    orig_message = response["output"]["message"]
    # Continue normal processing

Why Other Tools Don't Fail

Key Insight: This specific error manifests when:

  • Using tools designed for sub-agent final responses (create_summary, request_info)
  • Supervisor actively waits for the toolResult
  • Error occurs on critical path

Other tool results may trigger same OpenTelemetry parsing, but errors might be silently swallowed in async callbacks.

Additional Context

Comparison with Similar Patterns

Pattern that DOESN'T fail (simple text extraction):

# Extract text directly from assistant messages
async for event in agent_stream:
    if "text" in content:
        response += content["text"]  # No toolResult processing → No error

This avoids the issue because:

  • No intermediate tool for response wrapping
  • OpenTelemetry only sees standard message stream
  • No structure mismatch

Impact Assessment

Severity: High

  • Interrupts sub-agent execution
  • Discards valid responses
  • Affects all sub-agent patterns using response tools
  • Impacts production systems using Strands + OpenTelemetry

Affected Components:

  • opentelemetry-instrumentation-botocore
  • aws-opentelemetry-distro
  • Strands event loop with sub-agent toolResult processing

Related Issues

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions