[FEATURE] Add steer_after_model support to LLMSteeringHandler #1711

@morganwillisaws

Description

Problem Statement

The current LLMSteeringHandler only implements steer_before_tool. There's no generic LLM-based implementation for steer_after_model, which means anyone who wants to evaluate model output (instruction adherence, voice/tone enforcement, output format validation, etc.) has to write a custom SteeringHandler subclass from scratch.

Proposed Solution

Extend LLMSteeringHandler to support both steer_before_tool and steer_after_model, with users opting in to each by providing the corresponding prompt mapper.

New classes in mappers.py:

  • LLMModelPromptMapper — protocol for model output evaluation prompts
  • DefaultModelPromptMapper — default implementation using Agent SOP format
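The two mappers could be sketched roughly as below. This is an illustrative, self-contained sketch, not the SDK's actual interfaces: the method name `map`, its signature, and the prompt layout are all assumptions; only the class names and the `getattr(agent, "system_prompt", None)` lookup come from this proposal.

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class LLMModelPromptMapper(Protocol):
    """Protocol for building an evaluation prompt from an agent's model output."""

    # NOTE: method name and signature are hypothetical placeholders.
    def map(self, agent: Any, model_output: str) -> str: ...


class DefaultModelPromptMapper:
    """Default mapper: embeds the agent's system prompt so the steering LLM can
    perform a generic "did you follow your instructions?" check."""

    def map(self, agent: Any, model_output: str) -> str:
        # Pull the agent's system prompt if it exposes one, as proposed above.
        system_prompt: Optional[str] = getattr(agent, "system_prompt", None)
        instructions = system_prompt or "(no system prompt available)"
        # Prompt layout below is illustrative, not the proposed Agent SOP template.
        return (
            "## Task\n"
            "Evaluate whether the model output adheres to the agent's instructions.\n\n"
            f"## Agent instructions\n{instructions}\n\n"
            f"## Model output\n{model_output}\n"
        )
```

Because the mapper is a `Protocol`, users can supply any object with a compatible method to customize the evaluation prompt without subclassing.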

Updated LLMSteeringHandler:

  • Accepts an optional model_prompt_mapper parameter
  • If provided, steer_after_model creates a fresh steering agent, passes it the agent's system prompt + model output, and gets a structured proceed/guide decision
  • If not provided, steer_after_model returns Proceed (no-op, backward compatible)
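The opt-in behavior above could look something like the following sketch. All names besides `LLMSteeringHandler`, `steer_after_model`, and `model_prompt_mapper` are assumptions, and the LLM call is stubbed out; the point is the control flow: no mapper means a backward-compatible `Proceed` no-op, a mapper means a fresh evaluation per call.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Proceed:
    """Steering decision: let the model output pass unchanged."""
    reason: str = ""


@dataclass
class Guide:
    """Steering decision: feed corrective guidance back to the agent."""
    guidance: str = ""


class LLMSteeringHandler:
    def __init__(self, model_prompt_mapper: Optional[Any] = None) -> None:
        # Users opt in to model-output steering by supplying a mapper.
        self.model_prompt_mapper = model_prompt_mapper

    def steer_after_model(self, agent: Any, model_output: str):
        if self.model_prompt_mapper is None:
            # Backward compatible: no mapper configured, so this is a no-op.
            return Proceed()
        prompt = self.model_prompt_mapper.map(agent, model_output)
        # A fresh steering agent would be created per evaluation (see design
        # decisions below) to avoid conversation history bleed.
        return self._evaluate(prompt)

    def _evaluate(self, prompt: str):
        # Placeholder: real logic would run the prompt through a steering
        # agent and parse a structured proceed/guide decision.
        return Proceed()
```

Usage: `LLMSteeringHandler()` behaves exactly as today, while `LLMSteeringHandler(model_prompt_mapper=DefaultModelPromptMapper())` enables the output check.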

Key design decisions:

  • The DefaultModelPromptMapper automatically pulls the agent's system prompt via getattr(agent, "system_prompt", None) and includes it in the evaluation. This makes it a generic "did you follow your instructions?" check without the user needing to duplicate instructions.
  • Separate _LLMModelSteering structured output model that only allows proceed/guide (no interrupt, since the model has already responded).
  • Dedicated _MODEL_STEERING_PROMPT_TEMPLATE in Agent SOP format, parallel to the existing tool template but tailored for output evaluation.
  • Fresh steering agent created per evaluation to avoid conversation history bleed.

Use Case

This is a common need. Any agent running a long conversation or multi-step workflow will drift from its system prompt over time. The model's attention to original instructions degrades as context grows. This affects:

  • Writing agents that need to maintain a consistent voice
  • Coding agents that must follow formatting or architecture rules
  • Data pipeline agents restricted to certain operations
  • Customer service agents with tone and escalation policies
  • Any agent with a system prompt that contains explicit constraints

Alternative Solutions

No response

Additional Context

No response
