
Conversation

@rlundeen2
Contributor

Adds a new simulated_assistant role to distinguish synthetic responses (prepended conversations, SeedPrompts) from actual target responses. It behaves identically to assistant for API calls, but the simulated role is preserved in memory.
Key Changes

  • MessagePiece/Message: Added api_role (maps simulated_assistant → assistant), is_simulated, and get_role_for_storage(); deprecated the .role getter (see the sketch below)
  • Conversation Manager: Added mark_messages_as_simulated() helper; format_conversation_context() labels simulated messages as "Assistant (simulated)"
  • YAML Loading: SeedGroup.to_messages() converts assistant → simulated_assistant
  • Scoring: role_filter="assistant" now only scores real assistant responses (uses the stored role, not api_role)
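
A minimal sketch of how these pieces might fit together, based only on the names above (the PR's actual signatures may differ):

    from typing import Literal

    ChatMessageRole = Literal["system", "user", "assistant", "simulated_assistant"]


    class MessagePiece:
        """Sketch only, not the PR's code: stores the role verbatim and
        maps it for API calls on the way out."""

        def __init__(self, role: ChatMessageRole, original_value: str):
            self._role: ChatMessageRole = role
            self.original_value = original_value

        @property
        def api_role(self) -> str:
            # simulated_assistant collapses to assistant for API compatibility.
            return "assistant" if self._role == "simulated_assistant" else self._role

        def get_role_for_storage(self) -> ChatMessageRole:
            # The stored role is kept verbatim, so memory (and scoring with
            # role_filter="assistant") can tell real responses from simulated ones.
            return self._role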

Role to use for API calls.
Maps simulated_assistant to assistant for API compatibility.
Use this property when sending messages to external APIs.
Contributor

Doesn't this in part depend on the API itself? For example, OpenAI started using "developer" instead of "system" recently. But if you want to continue an existing conversation with a target that uses "system" then it should be possible to do so.

Contributor Author

I don't think it makes sense for the target to have to decide what to do with a simulated assistant response vs. a real assistant response. IMO it'd be easy for a target to forget the "if role == simulated_assistant then role = assistant" mapping, which would likely be a bug.

I do think a target could format these differently from one another. But from the target's perspective, imo these should always be treated the same. To take your system/developer example, for us that's always "system" and a target can deserialize it to whatever makes sense. It's the same with "assistant": I don't think a target should ever treat a simulated assistant response differently than a real assistant response.
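
To make that concrete, a sketch of a target that only ever reads api_role, so simulated_assistant never leaks into the request (reusing the MessagePiece sketch above; the helper name and payload shape are illustrative, not from the PR):

    def build_chat_payload(pieces: list[MessagePiece], model: str) -> dict:
        # Illustrative OpenAI-style payload. Because serialization goes
        # through api_role, real and simulated assistant turns both come out
        # as "assistant"; the target never branches on simulated_assistant.
        return {
            "model": model,
            "messages": [
                {"role": p.api_role, "content": p.original_value} for p in pieces
            ],
        }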

"""
Check if this is a simulated assistant response.
Simulated responses come from prepended conversations or generated
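
That docstring could sit on a property as simple as this (a sketch continuing the MessagePiece class above, not the PR's actual code):

        @property
        def is_simulated(self) -> bool:
            # A piece is simulated exactly when its stored role is
            # simulated_assistant; api_role hides this from targets.
            return self.get_role_for_storage() == "simulated_assistant"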
Contributor

What if we are just branching off of an existing conversation? Then it's not really simulated; it actually happened...

Contributor Author

So in this PR, the place we set simulated assistant responses is essentially when prepended_conversations are passed into attacks. Whether or not those were real responses that happened in the past, for the current attack conversation they are not turns that actually took place; they are user-supplied.
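
A sketch of that flow, assuming the mark_messages_as_simulated() helper from the change list takes and returns message pieces (the exact signature is assumed):

    def mark_messages_as_simulated(pieces: list[MessagePiece]) -> list[MessagePiece]:
        # Re-label assistant turns in user-supplied history as
        # simulated_assistant; other roles pass through unchanged.
        return [
            MessagePiece(
                role="simulated_assistant"
                if p.get_role_for_storage() == "assistant"
                else p.get_role_for_storage(),
                original_value=p.original_value,
            )
            for p in pieces
        ]


    # Branching off an earlier exchange: the old assistant turn really
    # happened, but for the new attack it is user-supplied, so it is
    # stored as simulated.
    history = [
        MessagePiece(role="user", original_value="Hi"),
        MessagePiece(role="assistant", original_value="Hello!"),
    ]
    prepended = mark_messages_as_simulated(history)
    assert prepended[1].get_role_for_storage() == "simulated_assistant"
    assert prepended[1].api_role == "assistant"  # targets still see plain assistant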
