Problem Statement
Summary
Tool execution content (e.g., executed code) is stored in `result.metrics.tool_metrics` but does not appear in `result.message`, requiring users to implement custom extraction logic. Is this intentional?
Current Behavior
When using tools like Code Interpreter:

```python
result = agent("Calculate 5 + 3")

# ❌ Users expect this to show the code:
print(result.message)
# Output: "I've calculated the result..." (LLM commentary only)

# ✅ The code is actually HERE:
print(result.metrics.tool_metrics['code_interpreter'].tool['input'])
# Output: shows the actual Python code executed
```

Evidence from actual logs:
```python
'tool_metrics': {
    'code_interpreter': ToolMetrics(
        tool={
            'toolUseId': 'tooluse_7rOk7xfMS864-IUMx6oliQ',
            'name': 'code_interpreter',
            'input': {
                'code_interpreter_input': {
                    'action': {
                        'type': 'executeCode',
                        'language': 'python',
                        'code': '# Your dataset\ndata = [23, 45, 67, 89, 12, 34, 56]...'
                    }
                }
            }
        }
    )
}
```

The code IS there - just not in `result.message`, where users naturally look!
Impact
- Reduced transparency - Users can't see what code was executed without diving into metrics
- Poor debugging experience - Requires navigating nested data structures
- Security/audit concerns - Executed actions aren't visible in the primary response
- Requires boilerplate - Every user must implement custom extraction:
```python
# Current workaround needed (15+ lines):
def format_response(result):
    try:
        tool_metrics = result.metrics.tool_metrics.get('code_interpreter')
        code = tool_metrics.tool['input']['code_interpreter_input']['action']['code']
        return f"Code: {code}\n\nResult: {str(result)}"
    except (AttributeError, KeyError):
        return str(result)
```

Question
Is this the intended design? Should tool execution content:
Option A: Stay in metrics only (current behavior)
Option B: Also appear in result.message (expected behavior)
Option C: Be accessible via helper method like result.get_tool_inputs()
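For illustration, Option C might look something like the sketch below. This is hypothetical, not existing SDK API: the name comes from the option above (written here as a free function for a self-contained example), and it assumes only the `ToolMetrics` shape shown in the logs:

```python
# Hypothetical sketch of Option C; not an existing SDK method.
# Assumes the ToolMetrics shape shown in the logs above (a .tool dict with an 'input' key).
def get_tool_inputs(result) -> dict[str, dict]:
    """Map each executed tool's name to the raw input it was invoked with."""
    return {
        name: metrics.tool.get('input', {})
        for name, metrics in result.metrics.tool_metrics.items()
    }
```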
Expected Behavior
Users expect tool execution details to be visible in the primary response:
```python
result = agent("Calculate 5 + 3")
print(result.message)
# Should show both:
# - the executed code
# - the LLM commentary
```
Request
Could the Strands team clarify:
- Is tool content intentionally separated from `result.message`?
- What's the recommended pattern for accessing tool execution details?
- Would you consider adding helper methods or documentation for this?
This affects user experience, especially for educational content, debugging, and audit trails.
Proposed Solution
Automatically append tool execution info to `result.message` content blocks:

```python
# In event_loop_cycle or _execute_event_loop_cycle:
if tool_result:
    message['content'].append({
        "text": f"[Executed: {tool_name}]\nInput: {tool_input}\nOutput: {tool_output}"
    })
```

Pros:
- Natural user experience
- Consistent with message-based paradigm
- No API changes needed
Cons:
- Increases message token count
- May clutter conversation history
Use Case
Educational/Getting Started Experience:
- Users learning with Code Interpreter expect to see what code was executed
- Currently they only see LLM commentary: "I've calculated the result..."
- The actual executed code is hidden in nested metrics structure
Production Debugging:
- Developers need quick visibility into what actions the agent took
- Audit trails require clear records of executed commands
- Security reviews need to verify what code actually ran
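For illustration, an audit trail can be built today only by reaching into metrics. A minimal sketch, assuming the metrics shape shown in the logs above (the logger name and record format are my assumptions):

```python
import json
import logging

audit_log = logging.getLogger("agent.audit")  # logger name is an assumption

def record_tool_audit(result) -> None:
    """Emit one structured audit record per tool execution (hypothetical format)."""
    for name, metrics in result.metrics.tool_metrics.items():
        audit_log.info(json.dumps({
            "tool": name,
            "input": metrics.tool.get("input", {}),
        }, default=str))
```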
Example:
```python
result = agent("Calculate average of [23, 45, 67, 89, 12]")
print(result.message)  # ❌ Only shows: "The average is 47.2"
# ✅ Should also show: the Python code that calculated it
```

Alternative Solutions
Option 1: Convenience Methods
Add convenience methods to extract tool data:

```python
class AgentResult:
    def get_tool_executions(self) -> list[dict]:
        """Extract all tool execution details."""
        return [
            {
                'name': tool_name,
                'input': metrics.tool['input'],
                'output': metrics.tool_result
            }
            for tool_name, metrics in self.metrics.tool_metrics.items()
        ]

    def get_executed_code(self) -> str | None:
        """Convenience method for code_interpreter."""
        ci = self.metrics.tool_metrics.get('code_interpreter')
        if ci:
            return ci.tool['input']['code_interpreter_input']['action'].get('code')
        return None
```

Pros:
- Clean API
- Doesn't modify message history
- Easy to document
Cons:
- Still requires extra step
- Not as discoverable
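If something like Option 1 were adopted, user-facing code would shrink to a couple of lines. A hypothetical usage sketch (these methods do not exist in the SDK today):

```python
result = agent("Calculate 5 + 3")

# Hypothetical method from Option 1 above - not current SDK API.
code = result.get_executed_code()
if code:
    print(f"Executed:\n{code}")
```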
Option 2: Configuration Flag
Let users choose:
```python
agent = Agent(
    model=MODEL_ID,
    tools=[code_interpreter],
    include_tool_details_in_messages=True  # Default: False
)
```

Pros:
- Backward compatible
- User control
Cons:
- More configuration complexity
- Split behavior patterns
Additional Context
Current Workaround (Required by All Users):
````python
def format_response(result):
    """Extract code from metrics - 15 lines of boilerplate."""
    try:
        tool_metrics = result.metrics.tool_metrics.get('code_interpreter')
        if tool_metrics and hasattr(tool_metrics, 'tool'):
            action = tool_metrics.tool['input']['code_interpreter_input']['action']
            if 'code' in action:
                code = action['code']
                return f"Executed:\n```python\n{code}\n```\n\nResult: {str(result)}"
    except (AttributeError, KeyError):
        pass
    return str(result)
````

Impact:
- Every user must implement this extraction
- Not documented in getting started guides
- Reduces transparency for security/audit
- Poor debugging experience
Related Code References:
- `agent_result.py` - `AgentResult` structure
- `agent.py` - `_execute_event_loop_cycle()` - where tool results are processed
- `agent.py` - `_record_tool_execution()` - shows the message recording pattern for direct tool calls
Environment:
- Strands SDK: Latest (sdk-python)
- Tool: `strands_tools.code_interpreter.AgentCoreCodeInterpreter`
- Model: Claude Sonnet 4.5 via Bedrock
- Use Case: AWS Bedrock AgentCore integration