24 changes: 24 additions & 0 deletions evaluators/contrib/atr/Makefile
@@ -0,0 +1,24 @@
.PHONY: help test lint lint-fix typecheck build

help:
	@echo "Agent Control Evaluator - ATR Threat Rules - Makefile commands"
	@echo " make test - run pytest"
	@echo " make lint - run ruff check"
	@echo " make lint-fix - run ruff check --fix"
	@echo " make typecheck - run mypy"
	@echo " make build - build package"

Review comment (Contributor):

Since this is adding a new supported contrib package, I think we also need to wire it into the repo-level checks and release path. Right now the root test-extras target only runs Galileo, and the release/build scripts only publish the Galileo contrib evaluator. If ATR should be maintained here, it should have root atr-* targets, be included in contrib CI coverage, and be added to build/release/version metadata.

test:
	uv run --with pytest --with pytest-asyncio --with pytest-cov pytest tests --cov=src --cov-report=xml:../../../coverage-evaluators-atr.xml -q

lint:
	uv run --with ruff ruff check --config ../../../pyproject.toml src/

lint-fix:
	uv run --with ruff ruff check --config ../../../pyproject.toml --fix src/

typecheck:
	uv run --with mypy mypy --config-file ../../../pyproject.toml src/

build:
	uv build
47 changes: 47 additions & 0 deletions evaluators/contrib/atr/README.md
@@ -0,0 +1,47 @@
# ATR Threat Rules Evaluator for Agent Control

Regex-based AI agent threat detection using [ATR (Agent Threat Rules)](https://agentthreatrule.org) community rules.

## Features

- 20 bundled rules covering OWASP Agentic Top 10 categories
- Pure regex detection -- no API keys, no external calls
- Sub-5ms evaluation time
- Configurable severity threshold and category filtering
- Auto-discovered via Python entry points

## Categories

| Category | Rules | Description |
|----------|-------|-------------|
| prompt-injection | 5 | Direct, indirect, jailbreak, system override, multi-turn |
| agent-manipulation | 2 | Cross-agent attacks, goal hijacking |
| context-exfiltration | 2 | Data exfil via tools, context window leaks |
| privilege-escalation | 2 | Unauthorized escalation, role assumption |
| tool-poisoning | 5 | Tool definition poisoning, hidden instructions, credentials, reverse shell |
| skill-compromise | 1 | Malicious skill installation |
| excessive-autonomy | 2 | Unauthorized actions, safety bypass |
| data-poisoning | 1 | Training data poisoning |

## Configuration

```python
from agent_control_evaluator_atr.threat_rules import ATRConfig

config = ATRConfig(
    min_severity="medium",  # "low", "medium", "high", "critical"
    block_on_match=True,    # matched=True when threat detected
    categories=[],          # empty = all categories
    on_error="allow",       # "allow" (fail-open) or "deny" (fail-closed)
)
```
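
How `min_severity` and `categories` interact can be sketched as below. This is an assumption about the filtering semantics, inferred from the option comments (severities ordered low < medium < high < critical, and an empty category list meaning no filter); `rule_is_active` is a hypothetical helper and not part of the package.

```python
# Assumed ordering, lowest to highest, taken from the documented values.
SEVERITY_ORDER = ("low", "medium", "high", "critical")


def rule_is_active(
    severity: str,
    category: str,
    min_severity: str = "medium",
    categories: tuple[str, ...] = (),
) -> bool:
    """Return True if a rule passes both the severity and category filters."""
    if SEVERITY_ORDER.index(severity) < SEVERITY_ORDER.index(min_severity):
        return False
    # An empty filter means every category is allowed.
    return not categories or category in categories


if __name__ == "__main__":
    print(rule_is_active("high", "prompt-injection"))
    print(rule_is_active("low", "prompt-injection"))
    print(rule_is_active("critical", "tool-poisoning", categories=("prompt-injection",)))
```

Under these assumptions the three calls evaluate to `True`, `False`, and `False` respectively.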

## Installation

```bash
uv pip install -e evaluators/contrib/atr
```

## License

Apache-2.0. ATR rules are MIT-licensed.
42 changes: 42 additions & 0 deletions evaluators/contrib/atr/pyproject.toml
@@ -0,0 +1,42 @@
[project]
name = "agent-control-evaluator-atr"
version = "0.2.0"
description = "ATR (Agent Threat Rules) evaluator for agent-control"
readme = "README.md"
requires-python = ">=3.12"
license = { text = "Apache-2.0" }
authors = [{ name = "ATR Community" }]
dependencies = [
    "agent-control-evaluators>=3.0.0",
    "agent-control-models>=3.0.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "pytest-cov>=4.0.0",
    "ruff>=0.1.0",
    "mypy>=1.8.0",
]

[project.entry-points."agent_control.evaluators"]
"atr.threat_rules" = "agent_control_evaluator_atr.threat_rules:ATREvaluator"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/agent_control_evaluator_atr"]

[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = ["E", "F", "I"]

[tool.uv.sources]
agent-control-evaluators = { path = "../../builtin", editable = true }
agent-control-models = { path = "../../../models", editable = true }
@@ -0,0 +1 @@
__all__: list[str] = []
@@ -0,0 +1,16 @@
from .config import ATRConfig
from .evaluator import ATREvaluator
from .models import ATR_FIELDS, ATRCondition, ATREvent, ATRRule, RuleMatch
from .redact import redact_matched_value, redact_matched_values

__all__ = [
    "ATREvaluator",
    "ATRConfig",
    "ATREvent",
    "ATRRule",
    "ATRCondition",
    "RuleMatch",
    "ATR_FIELDS",
    "redact_matched_value",
    "redact_matched_values",
]
@@ -0,0 +1,28 @@
from __future__ import annotations

from typing import Literal

from agent_control_evaluators import EvaluatorConfig
from pydantic import Field


class ATRConfig(EvaluatorConfig):
    """Configuration for ATR (Agent Threat Rules) evaluator.

    Attributes:
        min_severity: Minimum severity level to match ("low", "medium", "high", "critical").
        block_on_match: Whether to set matched=True when a threat is detected.
        categories: Category filter; empty list means all categories.
        on_error: Error policy ("allow" = fail-open, "deny" = fail-closed).
        condition_budget_ms: Wall-clock budget for each regex condition evaluation,
            in milliseconds. Patterns exceeding this budget are skipped with a
            warning rather than blocking the evaluator pipeline. Default 50 ms
            is generous for any reasonable pattern; the budget only fires on
            catastrophic backtracking.
    """

    min_severity: Literal["low", "medium", "high", "critical"] = "medium"
    block_on_match: bool = True
    categories: list[str] = Field(default_factory=list)
    on_error: Literal["allow", "deny"] = "allow"
    condition_budget_ms: int = Field(default=50, ge=1, le=10_000)