
FEAT: Benchmark Scenario #1662

Open
ValbuenaVC wants to merge 30 commits into microsoft:main from ValbuenaVC:benchmark

Conversation

@ValbuenaVC (Contributor) commented Apr 27, 2026

Description

Adds a benchmarking scenario (AdversarialBenchmark) to PyRIT for comparing performance across adversarial targets, measured by attack success rate (ASR). The benchmarking scenario is unique in that it takes models as a runtime argument, so we override `_get_atomic_attacks_async` to patch in live adversarial targets while using the `AttackTechniqueRegistry` to build `AdversarialBenchmarkStrategy`. The scenario takes a `list[PromptTarget]` in its constructor.

Includes minor changes to SCENARIO_TECHNIQUES (including adding another attack technique) and a corresponding small change to test_rapid_response.py. Also adds a lightweight aggregate of faster, benchmark-friendly attacks.

Also includes limited parameter support from #1680.
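For context, ASR here is just the fraction of attack attempts judged successful per target. A minimal, self-contained sketch of the comparison (the names `attack_success_rate` and `per_target` are illustrative, not PyRIT API):

```python
def attack_success_rate(outcomes: list[bool]) -> float:
    """Fraction of attack attempts that succeeded (0.0 if none were run)."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)

# Compare two adversarial targets by their ASR over the same attack set.
per_target = {
    "model_a": [True, False, True, True],
    "model_b": [False, False, True, False],
}
asr = {name: attack_success_rate(o) for name, o in per_target.items()}
# asr: {"model_a": 0.75, "model_b": 0.25}
```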

Tests and Documentation

Added tests/unit/scenario/test_benchmark.py.
Updated tests/unit/scenario/test_rapid_response.py.

Victor Valbuena added 2 commits April 23, 2026 17:33
@ValbuenaVC ValbuenaVC changed the title Benchmark [DRAFT] FEAT: Benchmark Scenario Apr 27, 2026
@ValbuenaVC ValbuenaVC marked this pull request as ready for review May 1, 2026 23:40
@ValbuenaVC ValbuenaVC changed the title [DRAFT] FEAT: Benchmark Scenario FEAT: Benchmark Scenario May 1, 2026
@rlundeen2 rlundeen2 self-assigned this May 4, 2026
Victor Valbuena added 2 commits May 5, 2026 17:16
@ValbuenaVC ValbuenaVC enabled auto-merge May 6, 2026 16:48
Comment on lines +122 to +129
adversarial_models: Either a ``dict`` mapping user-chosen labels to
``PromptChatTarget`` instances, or a ``list`` of targets (labels
inferred from each target's identifier). When a list is given,
identical targets are silently deduped and distinct targets
whose inferred names collide are suffixed (``_2``, ``_3``, …)
with a warning. Each target is wrapped in a default
``AttackAdversarialConfig`` before being injected into each
technique.
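The dedupe-and-suffix behavior described in the docstring above can be sketched as follows (a simplified illustration operating on a stub target with a `name` attribute, not the actual PromptChatTarget or the PR's implementation):

```python
def build_labels(targets: list[object]) -> dict[str, object]:
    """Map inferred names to targets: identical targets are silently
    deduped; distinct targets with colliding names get _2, _3, ... suffixes."""
    labeled: dict[str, object] = {}
    seen: dict[int, str] = {}  # id(target) -> assigned label
    for target in targets:
        if id(target) in seen:
            continue  # same object passed twice: silently dedupe
        name = getattr(target, "name", str(target))
        label, n = name, 1
        while label in labeled:  # distinct target, colliding inferred name
            n += 1
            label = f"{name}_{n}"
        labeled[label] = target
        seen[id(target)] = label
    return labeled
```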
Contributor:
Do we need to support two types of inputs? Why not just enforce list OR dict? This adds complexity in the constructor

Contributor Author:
I'm not sure I understand. Right now the constructor takes adversarial_models: dict[str, PromptChatTarget] | list[PromptChatTarget], which is the list OR dict point you made. The reason for accepting either is that assigning a name to a target is non-trivial, so user-provided labels are convenient.

@rlundeen2 (Contributor), May 6, 2026:
Could we normalize on the underlying model name? E.g. how we use it in evaluation identifiers is

target._underlying_model or target._model_name

Contributor Author:
That's the current implementation for non-user-provided labels (line 265 of adversarial.py). But if we want to, we can just remove user labeling as a feature and fall back on that normalization by default. What do you think?

Contributor:
+1 to not really understanding why we have two parameter types (though if we keep both, I think we need to show the list version in the notebook). I'd prefer a list just because it's simpler.
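If both input types are kept, one way to contain the constructor complexity is to normalize early so the rest of the scenario only ever sees a dict. A hedged sketch using a stub class (not the actual PromptChatTarget) and the `_model_name` attribute mentioned in the thread above:

```python
from typing import Union


class StubTarget:
    """Stand-in for PromptChatTarget; carries only the naming attribute."""

    def __init__(self, model_name: str):
        self._model_name = model_name


def normalize_adversarial_models(
    models: Union[dict[str, StubTarget], list[StubTarget]],
) -> dict[str, StubTarget]:
    """Accept either a user-labeled dict or a bare list; return a dict."""
    if isinstance(models, dict):
        if "" in models:
            raise ValueError("Empty user-chosen label passed to adversarial_models!")
        return dict(models)
    # List case: fall back on the underlying model name for the label.
    return {t._model_name: t for t in models}
```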

Comment on lines +16 to +18
# %%
# %load_ext autoreload
# %autoreload 2
Contributor:
What is this?

Comment thread doc/scanner/0_scanner.md
| Family | Scenarios | Documentation |
|--------|-----------|---------------|
| **AIRT** | ContentHarms, Psychosocial, Cyber, Jailbreak, Leakage, Scam | [AIRT Scenarios](airt.ipynb) |
| **AIRT** | ContentHarms, Psychosocial, Cyber, Jailbreak, Leakage, Scam, AdversarialBenchmark | [AIRT Scenarios](airt.ipynb) |
Contributor:
nit: shouldn't we have a separate Benchmark family? Just thinking that we're putting it in a separate folder, and I can see us having more than one benchmarking scenario.

"""
return [
Parameter(
name="include_default_baseline",
Contributor:
Could this be exposed in the scenario base class, since it's not specific to this scenario?

if "" in adversarial_models:
raise ValueError(f"Empty user-chosen label passed to adversarial_models! Got `{adversarial_models}`.")

# Stage B: wrap each bare target in a default AttackAdversarialConfig.
Contributor:
super nit: I think numbers for the stages are more intuitive (i.e., Stage 1 instead of Stage A).

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

"""