security(meta_analyzer): preserve high-severity static findings from LLM suppression by AbhiramDwivedi · Pull Request #54 · NVIDIA/SkillSpector

AbhiramDwivedi · 2026-06-14T19:33:12Z

Closes #59

What

meta_analyzer.apply_filter() previously kept a static finding only if the LLM echoed it back with is_vulnerability=True and confidence ≥ 0.6. Any finding the LLM omitted or denied was silently dropped, regardless of severity.

Because the LLM's input includes attacker-controlled skill content, a prompt-injection payload could make the LLM drop a real CRITICAL/HIGH static finding, hiding it from the report (a false negative in a security gate). This affects all providers.

Fix (severity-gated floor)

CRITICAL/HIGH static findings are always kept. If the LLM confirms one, it's enriched as before; if not, the original finding is preserved and tagged llm-unconfirmed (now surfaced in Finding.to_dict() / JSON output).
MEDIUM/LOW keep the existing LLM false-positive filtering.
Rationale: in a security gate, a false negative (hiding a real CRITICAL) is far worse than a false positive.
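A minimal sketch of the severity-gated floor described above, not the actual patch. The `Finding` dataclass, the `rule_id` key, and the shape of `verdicts` are assumptions made for illustration; only the floor set, the 0.6 threshold, and the `llm-unconfirmed` tag come from this PR.

```python
from dataclasses import dataclass, replace

# Assumption: a simplified stand-in for the project's Finding type.
@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str
    message: str
    tags: tuple = ()
    explanation: str = ""

HIGH_SEVERITY_FLOOR = frozenset({"CRITICAL", "HIGH"})
CONFIDENCE_THRESHOLD = 0.6
UNCONFIRMED_TAG = "llm-unconfirmed"

def apply_filter(findings, verdicts):
    """Filter static findings using LLM verdicts.

    `verdicts` is a hypothetical mapping of rule_id -> dict with the keys
    is_vulnerability, confidence, and explanation.
    """
    kept = []
    for finding in findings:
        verdict = verdicts.get(finding.rule_id)
        confirmed = (
            verdict is not None
            and verdict.get("is_vulnerability") is True
            and verdict.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD
        )
        if confirmed:
            # Confirmed findings are enriched with the LLM's explanation.
            kept.append(replace(finding, explanation=verdict.get("explanation", "")))
        elif finding.severity in HIGH_SEVERITY_FLOOR:
            # Floor: keep the original finding and tag it once.
            tags = finding.tags
            if UNCONFIRMED_TAG not in tags:
                tags = tags + (UNCONFIRMED_TAG,)
            kept.append(replace(finding, tags=tags))
        # Unconfirmed MEDIUM/LOW findings are dropped, as before.
    return kept
```

With this shape, an injected instruction that makes the LLM omit or deny a CRITICAL finding can only remove the enrichment, never the finding itself.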

Test

New tests cover CRITICAL/HIGH preserved+tagged, MEDIUM/LOW still filtered, confirmed-CRITICAL still enriched, and the tag surfacing via to_dict().

🤖 Generated with Claude Code

CRITICAL and HIGH static findings are now always kept in the output of apply_filter(), even when the LLM does not confirm them (omits the finding, denies it, or returns confidence < 0.6). MEDIUM/LOW findings continue to be filtered by the LLM for false-positive reduction as before.

Motivation: the LLM receives attacker-controlled skill content. A prompt-injection payload embedded in a scanned skill could cause the LLM to drop a real CRITICAL or HIGH static finding, hiding it from the security report (a false negative in a security gate). This invariant applies to all providers.

Implementation:
- LLMMetaAnalyzer._HIGH_SEVERITY_FLOOR = frozenset({"CRITICAL", "HIGH"}).
- In the "not confirmed" branch of apply_filter(), if the finding's severity is in the floor set, emit the original static finding unchanged and append the tag "llm-unconfirmed" so consumers can distinguish it from LLM-validated findings. The tag is not duplicated if already present.
- Confirmed CRITICAL/HIGH findings are still enriched with LLM explanation/remediation/confidence as before (no regression).
- Finding.to_dict() now includes "tags", so the "llm-unconfirmed" marker is visible in the JSON report (tags were previously not serialized).

Tests in tests/nodes/test_llm_analyzer_base.py:
- CRITICAL/HIGH finding unconfirmed -> kept, "llm-unconfirmed" tag, original data.
- MEDIUM/LOW finding unconfirmed -> still dropped (existing behaviour).
- Confirmed CRITICAL -> enriched normally, tag absent.
- Duplicate-tag guard -> "llm-unconfirmed" not appended twice.
- to_dict surfacing -> marker present in JSON output.

Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
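As a rough illustration of why serializing tags matters to report consumers, the sketch below assumes a simplified `Finding` whose `to_dict()` emits a `tags` list, and a hypothetical helper `unconfirmed_rule_ids` that reads a JSON report. The field names other than `tags` and the report layout (a JSON array of findings) are assumptions, not the project's actual schema.

```python
import json
from dataclasses import dataclass, field

# Assumption: a simplified stand-in for the project's Finding type.
@dataclass
class Finding:
    rule_id: str
    severity: str
    tags: list = field(default_factory=list)

    def to_dict(self):
        # Including "tags" is what makes the marker visible downstream.
        return {
            "rule_id": self.rule_id,
            "severity": self.severity,
            "tags": list(self.tags),
        }

def unconfirmed_rule_ids(report_json):
    """Hypothetical consumer: list findings kept by the floor but not LLM-confirmed."""
    return [
        item["rule_id"]
        for item in json.loads(report_json)
        if "llm-unconfirmed" in item.get("tags", [])
    ]

report = json.dumps([
    Finding("R1", "CRITICAL", ["llm-unconfirmed"]).to_dict(),
    Finding("R2", "HIGH").to_dict(),
])
```

A consumer could use such a list to route unconfirmed CRITICAL/HIGH findings to manual review instead of treating them as LLM-validated.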

AbhiramDwivedi mentioned this pull request Jun 14, 2026

security: LLM meta-analysis can suppress high-severity static findings via prompt injection #59

Open

AbhiramDwivedi wants to merge 1 commit into NVIDIA:main from AbhiramDwivedi:pr/d-meta-analyzer-suppression-floor
