See below for a list of tools that can be used for AI red teaming, organized by the phases defined in the GenAI Red Teaming Manual.
| AI Red Team Phase | Relevant Tools |
|---|---|
| Phase 1: Planning & Scoping | Python Risk Identification Tool for generative AI (PyRIT) |
| Generative AI Red-teaming and Assessment Kit (Garak) | |
| Promptfoo: LLM evals & red teaming | |
| MITRE ATLAS: Threat Mapping Matrix | |
| BlackIce | |
| EleutherAI | |
| CleverHans | |
| Adversarial Robustness Toolbox (ART) | |
| Giskard | |
| CyberSecEval | |
| Promptmap | |
| Fuzzyai | |
| Fickling | |
| Rigging | |
| judges | |
| Phase 2: Reconnaissance & Fingerprinting | Burp Suite / ZAP (note that AI systems often make use of things like server-side events: if you use a web proxy, make sure it has robust support for this technology or you will likely run into trouble) |
| whois | |
| cURL | |
| GitHub | |
| WireShark | |
| Model Repositories/Registries | |
| Google Dorks/Web Searches | |
| GitLeaks / TruffleHog / GitGuardian (Secret & Repo Scanning) | |
| Model Image runtime scanning | |
| LLMmap | |
| ModelScan: Scans model artifacts (Pickle, H5, SavedModel, PyTorch) for unsafe code / deserialization payloads | |
| Phase 3: Surface Mapping & Vulnerability | Traffic capture: Chrome DevTools (HAR), mitmproxy. |
| API specs and testing: Swagger, Postman. | |
| Diagramming and Threat Modeling: Mermaid, Draw.io, OWASP Threat Dragon, ThreatCanvas by SecureFlag, ThreatFinderAI. | |
| RAG and data inspection: Unstructured (Docling, Tika), vector DB console (Weaviate, Pinecone, Qdrant). | |
| ASCII Smuggler | |
| mcp-scan: Scans MCP servers for tool poisoning, tool-description prompt injection, and rug-pull tool redefinition | |
| Auth and secrets: GitLeaks, Vault, KMS CLI. | |
| Observability: OpenTelemetry, Grafana Loki, ELK Stack, Morpheus, Datadog. | |
| LLM recon/evals (optional): PyRIT, Promptfoo, Garak, Deepteam, Spikee | |
| Phase 4: Exploitation | CleverHans: Helps generating adversarial examples and attacks on AI models |
| Agent0: Containerized Agentic System based on Kali Linux. See the examples in GenAI-Red-Team-Lab/exploitation/agent0 | |
| AgentDojo: Framework for prompt-injection attacks (and defenses) against tool-using LLM agents | |
| llm-attacks (GCG): Generates universal, transferable adversarial suffixes against LLMs via Greedy Coordinate Gradient | |
| Phase 5: Persistence & Escalation | LLMGuard, Giskard, or Garak for continuous guardrail validation |
| LangSmith / LlamaIndex tracing for memory and state monitoring | |
| Pinecone Console, Weaviate Management UI, Chroma Vector DB Analyzer | |
| MLFlow and model registry logs for identifying drift or unauthorized retraining events | |
| API Gateway and IAM audit tools (AWS CloudTrail, Azure Monitor, GCP Cloud Audit Logs) | |
| LangFuse, AIExploit, PromptFoo | |
| FAISS, Annoy, HNSWlib | |
| Privacy inference frameworks (Privacy Meter, Copycat CNN, CleverHans) for long-term model inversion or data extraction detection | |
| Phase 6: Post-Exploitation & Impact | Tools may include those that emulate post‑exploitation behavior, or help observe and measure impact. Specific tools can depend on the organization’s preference or already existing security infrastructure. For example: Data discovery and data loss prevention (DLP) tools, Telemetry and SIEM logging |
| Phase 7: Evaluation & Reporting | Not Applicable |
| Phase 8: Post-Engagement & Remediation | Not Applicable |