NVIDIA · ankushchadha · Jun 15, 2026
diff --git a/README.md b/README.md
@@ -18,7 +18,7 @@ SkillSpector helps you answer: **"Is this skill safe to install?"**
 ## Features
 
 - **Multi-format input**: Scan Git repos, URLs, zip files, directories, or single files
-- **64 vulnerability patterns** across 16 categories: prompt injection, data exfiltration, privilege escalation, supply chain, excessive agency, output handling, system prompt leakage, memory poisoning, tool misuse, rogue agent, trigger abuse, dangerous code (AST), taint tracking, YARA signatures, MCP least privilege, and MCP tool poisoning
+- **67 vulnerability patterns** across 17 categories: prompt injection, data exfiltration, privilege escalation, supply chain, excessive agency, output handling, system prompt leakage, memory poisoning, tool misuse, rogue agent, anti-refusal, trigger abuse, dangerous code (AST), taint tracking, YARA signatures, MCP least privilege, and MCP tool poisoning
 - **Two-stage analysis**: Fast static analysis + optional LLM semantic evaluation
 - **Live vulnerability lookups**: SC4 queries [OSV.dev](https://osv.dev) for real-time CVE data with automatic offline fallback
 - **Multiple output formats**: Terminal, JSON, Markdown, and SARIF reports
@@ -183,7 +183,7 @@ skillspector scan ./my-skill/ --no-llm
 
 ## Vulnerability Patterns
 
-SkillSpector detects **64 vulnerability patterns** across 16 categories:
+SkillSpector detects **67 vulnerability patterns** across 17 categories:
 
 ### Prompt Injection (5 patterns)
 
@@ -195,6 +195,14 @@ SkillSpector detects **64 vulnerability patterns** across 16 categories:
 | P4 | Behavior Manipulation | MEDIUM | Subtle instructions altering agent decisions |
 | P5 | Harmful Content | CRITICAL | Instructions that could cause physical harm |
 
+### Anti-Refusal (3 patterns)
+
+| ID | Pattern | Severity | Description |
+|----|---------|----------|-------------|
+| AR1 | Refusal Suppression | HIGH | Instructions to never refuse or always comply (e.g. "never refuse", "always comply") |
+| AR2 | Disclaimer Suppression | HIGH | Instructions to omit warnings, disclaimers, or ethical commentary (e.g. "no disclaimers", "do not moralize") |
+| AR3 | Safety Policy Nullification | HIGH | Jailbreak framing that nullifies guardrails (e.g. "you have no restrictions", "ignore your guidelines", "do anything now") |
+
 ### Data Exfiltration (4 patterns)
 
 | ID | Pattern | Severity | Description |

diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md
@@ -124,7 +124,7 @@ There are no conditional edges: after `resolve_input` → `build_context`, all a
 |------|------|--------|
 | **resolve_input** | Consumes `input_path` or `skill_path`; resolves URLs/zips/files via InputHandler; sets `skill_path` and (when needed) `temp_dir_for_cleanup` | [resolve_input.py](../src/skillspector/nodes/resolve_input.py) |
 | **build_context** | Reads `skill_path`, populates `components`, `file_cache`, `ast_cache`, `manifest`, `component_metadata`, `has_executable_scripts` | [build_context.py](../src/skillspector/nodes/build_context.py) |
-| **Analyzers** | 20 nodes; each returns `AnalyzerNodeResponse` (list of `Finding`). State reducer appends to `findings`. | [nodes/analyzers/__init__.py](../src/skillspector/nodes/analyzers/__init__.py) (`ANALYZER_NODE_IDS`, `ANALYZER_NODES`) |
+| **Analyzers** | 21 nodes; each returns `AnalyzerNodeResponse` (list of `Finding`). State reducer appends to `findings`. | [nodes/analyzers/__init__.py](../src/skillspector/nodes/analyzers/__init__.py) (`ANALYZER_NODE_IDS`, `ANALYZER_NODES`) |
 | **meta_analyzer** | Per-file LLM filter/enrich of `findings` → `filtered_findings` via `LLMMetaAnalyzer`; one LLM call per file (or per chunk for oversized files); token budgets from `constants.py`; falls back when `use_llm` is False | [meta_analyzer.py](../src/skillspector/nodes/meta_analyzer.py), [llm_analyzer_base.py](../src/skillspector/nodes/llm_analyzer_base.py) |
 | **report** | Builds SARIF 2.1.0, computes `risk_score`, `risk_severity`, `risk_recommendation`; writes `report_body` from `output_format` (terminal/json/markdown/sarif) | [report.py](../src/skillspector/nodes/report.py) |
 
@@ -156,7 +156,7 @@ There are no conditional edges: after `resolve_input` → `build_context`, all a
 | `pattern_defaults.py` | Shared pattern metadata (category, explanation, remediation) |
 | `static_yara.py` | YARA-based static analyzer |
 | `osv_client.py` | OSV.dev API client for live vulnerability lookups (SC4); batch queries with caching and fallback |
-| `static_patterns_*.py` | 11 pattern-based analyzers (prompt_injection, data_exfiltration, etc.) |
+| `static_patterns_*.py` | 12 pattern-based analyzers (prompt_injection, data_exfiltration, anti_refusal, etc.) |
 | `behavioral_ast.py` | AST-based behavioral analyzer (AST1–AST8): detects exec, eval, subprocess, os.system, compile, dynamic import/getattr, and dangerous execution chains |
 | `behavioral_taint_tracking.py` | Taint-tracking behavioral analyzer (stub) |
 | `mcp_least_privilege.py`, `mcp_tool_poisoning.py`, `mcp_rug_pull.py` | MCP analyzer stubs |

diff --git a/src/skillspector/nodes/analyzers/__init__.py b/src/skillspector/nodes/analyzers/__init__.py
@@ -33,6 +33,9 @@
 from skillspector.nodes.analyzers.semantic_security_discovery import (
     node as semantic_security_discovery_node,
 )
+from skillspector.nodes.analyzers.static_patterns_anti_refusal import (
+    node as static_patterns_anti_refusal_node,
+)
 from skillspector.nodes.analyzers.static_patterns_data_exfiltration import (
     node as static_patterns_data_exfiltration_node,
 )
@@ -80,6 +83,7 @@
     "static_patterns_memory_poisoning",
     "static_patterns_tool_misuse",
     "static_patterns_rogue_agent",
+    "static_patterns_anti_refusal",
     "static_yara",
     "behavioral_ast",
     "behavioral_taint_tracking",
@@ -103,6 +107,7 @@
     "static_patterns_memory_poisoning": static_patterns_memory_poisoning_node,
     "static_patterns_tool_misuse": static_patterns_tool_misuse_node,
     "static_patterns_rogue_agent": static_patterns_rogue_agent_node,
+    "static_patterns_anti_refusal": static_patterns_anti_refusal_node,
     "static_yara": static_yara_node,
     "behavioral_ast": behavioral_ast_node,
     "behavioral_taint_tracking": behavioral_taint_tracking_node,

diff --git a/src/skillspector/nodes/analyzers/pattern_defaults.py b/src/skillspector/nodes/analyzers/pattern_defaults.py
@@ -38,6 +38,7 @@ class PatternCategory(StrEnum):
     YARA_MATCH = "YARA Match"
     MCP_LEAST_PRIVILEGE = "MCP Least Privilege"
     MCP_TOOL_POISONING = "MCP Tool Poisoning"
+    ANTI_REFUSAL = "Anti-Refusal"
 
 
 # Pattern-specific explanations (why the finding is dangerous)
@@ -119,6 +120,10 @@ class PatternCategory(StrEnum):
     "TP2": "Unicode deception detected in skill identifiers or descriptions. Homoglyphs, RTL overrides, or invisible characters can make malicious content appear benign.",
     "TP3": "Instruction injection patterns found in parameter descriptions or default values. Parameter metadata is read by LLMs and can override intended behavior.",
     "TP4": "Skill description does not match actual code behavior. The declared purpose diverges from what the code actually does, indicating possible deception.",
+    # Anti-Refusal Statements (jailbreak)
+    "AR1": "Skill instructs the agent to never refuse or to always comply. Suppressing the agent's ability to decline removes a core safety control and enables downstream harmful requests to succeed.",
+    "AR2": "Skill instructs the agent to omit warnings, disclaimers, or ethical commentary. Stripping safety caveats hides risk from the user and is a common jailbreak preamble.",
+    "AR3": "Skill attempts to nullify the agent's safety policies or restrictions ('you have no restrictions', 'ignore your guidelines', 'do anything now'). This is a direct jailbreak that disables guardrails.",
 }
 
 # Rule ID -> category (for report output)
@@ -182,6 +187,10 @@ class PatternCategory(StrEnum):
     "TP2": PatternCategory.MCP_TOOL_POISONING.value,
     "TP3": PatternCategory.MCP_TOOL_POISONING.value,
     "TP4": PatternCategory.MCP_TOOL_POISONING.value,
+    # Anti-Refusal Statements (jailbreak)
+    "AR1": PatternCategory.ANTI_REFUSAL.value,
+    "AR2": PatternCategory.ANTI_REFUSAL.value,
+    "AR3": PatternCategory.ANTI_REFUSAL.value,
 }
 
 # Rule ID -> pattern display name (for report output)
@@ -245,6 +254,10 @@ class PatternCategory(StrEnum):
     "TP2": "Unicode Deception",
     "TP3": "Parameter Description Injection",
     "TP4": "Description-Behavior Mismatch",
+    # Anti-Refusal Statements (jailbreak)
+    "AR1": "Refusal Suppression",
+    "AR2": "Disclaimer Suppression",
+    "AR3": "Safety Policy Nullification",
 }
 
 # Pattern-specific remediations (how to fix the issue)
@@ -326,6 +339,10 @@ class PatternCategory(StrEnum):
     "TP2": "Replace non-ASCII characters in identifiers with ASCII equivalents. Remove RTL override and invisible formatting characters.",
     "TP3": "Remove injection patterns, system tokens, and suspicious content from parameter descriptions and default values.",
     "TP4": "Update the skill description to accurately reflect all capabilities, or remove undeclared functionality.",
+    # Anti-Refusal Statements (jailbreak)
+    "AR1": "Remove any instruction telling the agent to never refuse or always comply. The agent must retain the ability to decline unsafe, out-of-scope, or harmful requests.",
+    "AR2": "Remove instructions that suppress warnings, disclaimers, or ethical commentary. Let the agent surface safety-relevant caveats to the user.",
+    "AR3": "Remove jailbreak framing that nullifies safety policies or restrictions. Skill content must not instruct the agent to ignore its guidelines or operate without guardrails.",
 }
 
 

diff --git a/src/skillspector/nodes/analyzers/static_patterns_anti_refusal.py b/src/skillspector/nodes/analyzers/static_patterns_anti_refusal.py
@@ -0,0 +1,172 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Static patterns: anti-refusal statements (AR1-AR3). Node and analyze() in one module.
+
+Anti-refusal statements are instructions embedded in a skill that try to suppress the
+agent's ability to refuse, hedge, or apply safety constraints. They are a jailbreak
+technique distinct from generic instruction-override: rather than injecting a new task,
+they neutralize the model's *refusal behavior* so subsequent harmful requests succeed.
+
+Three families are detected:
+  AR1 - Refusal suppression: "never refuse", "always comply", "you cannot say no".
+  AR2 - Disclaimer/caveat suppression: "no warnings", "do not moralize", "skip the ethics".
+  AR3 - Safety/policy nullification: "you have no restrictions", "ignore your guidelines",
+        "not bound by any policy", "do anything now".
+"""
+
+from __future__ import annotations
+
+import re
+import sys
+
+from skillspector.logging_config import get_logger
+from skillspector.models import AnalyzerFinding, Location, Severity
+from skillspector.state import AnalyzerNodeResponse, SkillspectorState
+
+from . import static_runner
+from .common import get_context, get_line_number, is_code_example
+from .pattern_defaults import PatternCategory
+
+logger = get_logger(__name__)
+
+ANALYZER_ID = "static_patterns_anti_refusal"
+
+# AR1: Refusal suppression - instruct the agent to never refuse / always comply.
+AR1_PATTERNS = [
+    (r"\b(?:never|do\s+not|don'?t)\s+(?:ever\s+)?(?:refuse|decline)\b", 0.85),
+    (r"\byou\s+(?:must|will|should|can)\s*(?:not|never)\s+(?:refuse|decline|say\s+no)\b", 0.9),
+    (r"\byou\s+(?:cannot|can'?t|can\s+not)\s+(?:refuse|decline|say\s+no)\b", 0.85),
+    (r"\bnever\s+say\s+(?:that\s+)?(?:you\s+)?(?:can'?t|cannot|won'?t|no)\b", 0.85),
+    (r"\balways\s+(?:comply|obey|answer|respond|agree)\b", 0.7),
+    (
+        r"\b(?:you\s+)?must\s+(?:always\s+)?(?:answer|comply\s+with|respond\s+to|fulfill)\s+"
+        r"(?:every|all|any|each)\b",
+        0.8,
+    ),
+    (r"\brespond\s+to\s+(?:every|all|any)\s+(?:request|question|prompt|query|command)\b", 0.6),
+    (r"\bno\s+request\s+is\s+(?:off[- ]limits|forbidden|too)\b", 0.85),
+]
+
+# AR2: Disclaimer / caveat / warning suppression.
+AR2_PATTERNS = [
+    (
+        r"\b(?:do\s+not|don'?t|never|no)\s+(?:include|add|give|provide|use|append)?\s*"
+        r"(?:any\s+)?(?:warnings?|disclaimers?|caveats?)\b",
+        0.8,
+    ),
+    (
+        r"\bwithout\s+(?:any\s+)?(?:warnings?|disclaimers?|caveats?|moralizing|moralising|"
+        r"judg(?:e?ment|ing))\b",
+        0.8,
+    ),
+    (r"\b(?:do\s+not|don'?t|never)\s+(?:lecture|moralize|moralise|preach|judge|warn)\b", 0.85),
+    (r"\b(?:skip|omit|leave\s+out|drop)\s+(?:the\s+)?(?:disclaimer|warning|ethics|safety)\b", 0.8),
+    (
+        r"\b(?:no|without)\s+(?:ethical|moral)\s+(?:judg(?:e?ment)|considerations?|commentary|"
+        r"objections?)\b",
+        0.8,
+    ),
+    (r"\b(?:do\s+not|don'?t)\s+(?:apologize|apologise|say\s+sorry)\b", 0.55),
+]
+
+# AR3: Safety / policy / restriction nullification (jailbreak framing).
+AR3_PATTERNS = [
+    (
+        r"\byou\s+have\s+no\s+(?:restrictions?|limitations?|filters?|rules?|guidelines?|"
+        r"boundaries|limits)\b",
+        0.9,
+    ),
+    (
+        r"\b(?:you\s+are|act\s+as|pretend\s+(?:to\s+be|you'?re))[^.\n]{0,30}"
+        r"(?:unrestricted|unfiltered|uncensored|no\s+filter)\b",
+        0.85,
+    ),
+    (
+        r"\b(?:ignore|disregard|forget|bypass|override)\s+(?:your\s+)?(?:content\s+)?"
+        r"(?:policy|policies|guidelines?|safety|restrictions?|filters?)\b",
+        0.9,
+    ),
+    (
+        r"\bnot\s+bound\s+by\s+(?:any\s+)?(?:rules?|policy|policies|guidelines?|restrictions?|"
+        r"openai|anthropic)\b",
+        0.9,
+    ),
+    (r"\bdo\s+anything\s+now\b", 0.8),
+    (r"\b(?:developer|debug|god|sudo|jailbreak)\s+mode\s+(?:enabled|on|activated|engaged)\b", 0.75),
+    (r"\bno\s+(?:content\s+)?(?:policy|policies|filters?|restrictions?)\s+appl(?:y|ies)\b", 0.85),
+    (
+        r"\b(?:free\s+from|without)\s+(?:any\s+)?(?:safety\s+)?(?:guardrails?|constraints?|"
+        r"safeguards?)\b",
+        0.8,
+    ),
+]
+
+_RULES = [("AR1", AR1_PATTERNS), ("AR2", AR2_PATTERNS), ("AR3", AR3_PATTERNS)]
+
+# Confidence penalty applied when the match appears inside a code/doc example, and the
+# minimum confidence required to emit a finding after the penalty.
+_EXAMPLE_PENALTY = 0.4
+_MIN_CONFIDENCE = 0.5
+
+
+def analyze(content: str, file_path: str, file_type: str) -> list[AnalyzerFinding]:
+    """Analyze content for anti-refusal statements (AR1-AR3)."""
+    findings: list[AnalyzerFinding] = []
+    tag = [PatternCategory.ANTI_REFUSAL.value]
+
+    for rule_id, patterns in _RULES:
+        for pattern, base_confidence in patterns:
+            for match in re.finditer(pattern, content, re.IGNORECASE | re.MULTILINE):
+                context = get_context(content, match.start(), context_lines=3)
+                confidence = base_confidence
+                if is_code_example(context):
+                    confidence -= _EXAMPLE_PENALTY
+                if confidence < _MIN_CONFIDENCE:
+                    continue
+                findings.append(
+                    AnalyzerFinding(
+                        rule_id=rule_id,
+                        message="Anti-Refusal Statement",
+                        severity=Severity.HIGH,
+                        location=Location(
+                            file=file_path,
+                            start_line=get_line_number(content, match.start()),
+                        ),
+                        confidence=round(confidence, 2),
+                        tags=tag,
+                        context=context,
+                        matched_text=match.group(0)[:200],
+                    )
+                )
+    return _deduplicate_findings(findings)
+
+
+def _deduplicate_findings(findings: list[AnalyzerFinding]) -> list[AnalyzerFinding]:
+    """Keep the highest-confidence finding per (file, line, rule_id)."""
+    best: dict[tuple[str, int, str], AnalyzerFinding] = {}
+    for f in findings:
+        key = (f.location.file, f.location.start_line, f.rule_id)
+        existing = best.get(key)
+        if existing is None or f.confidence > existing.confidence:
+            best[key] = f
+    return list(best.values())
+
+
+def node(state: SkillspectorState) -> AnalyzerNodeResponse:
+    """Run anti_refusal patterns and return findings."""
+    findings = static_runner.run_static_patterns(state, [sys.modules[__name__]])
+    logger.info("%s: %d findings", ANALYZER_ID, len(findings))
+    return {"findings": findings}
diff --git a/tests/nodes/analyzers/test_registry.py b/tests/nodes/analyzers/test_registry.py
@@ -20,7 +20,7 @@
 from skillspector.nodes.analyzers import ANALYZER_NODE_IDS, ANALYZER_NODES
 
 # Expected analyzer node IDs per SADD spec workflow reference table.
-# Order: static (12), behavioral (2), mcp (3), semantic (3).
+# Order: static (13), behavioral (2), mcp (3), semantic (3).
 EXPECTED_ANALYZER_NODE_IDS: list[str] = [
     "static_patterns_prompt_injection",
     "static_patterns_data_exfiltration",
@@ -33,6 +33,7 @@
     "static_patterns_memory_poisoning",
     "static_patterns_tool_misuse",
     "static_patterns_rogue_agent",
+    "static_patterns_anti_refusal",
     "static_yara",
     "behavioral_ast",
     "behavioral_taint_tracking",