Skip to content

feat: Add GroundednessChecker — runtime groundedness guardrail for RAG pipelines#11031

Open
JohnnyTarrr wants to merge 2 commits intodeepset-ai:mainfrom
JohnnyTarrr:feat/groundedness-checker
Open

feat: Add GroundednessChecker — runtime groundedness guardrail for RAG pipelines#11031
JohnnyTarrr wants to merge 2 commits intodeepset-ai:mainfrom
JohnnyTarrr:feat/groundedness-checker

Conversation

@JohnnyTarrr
Copy link
Copy Markdown

Context

Addresses #10973 — runtime groundedness verification for RAG pipelines.

The existing FaithfulnessEvaluator is designed for offline batch evaluation. This PR adds a runtime validator that sits inside a live pipeline and actively intervenes on each query — extracting claims, cross-referencing against retrieved documents, and flagging or stripping unsupported content before it reaches the user.

Built by the team at VeroQ, where we work on LLM output verification. We also offer a self-hosted Shield that provides both groundedness and factual verification as a Docker container: veroq-ai/veroq-self-hosted

What it does

GroundednessChecker is a new component in haystack.components.validators that:

  1. Takes replies (from a Generator) and documents (from a Retriever)
  2. Uses a ChatGenerator to extract verifiable claims from the reply
  3. Cross-references each claim against the document context
  4. Returns per-claim verdicts (supported / contradicted / unverifiable)
  5. Computes an overall trust_score (0-1)
  6. Optionally replaces contradicted claims with corrections (block_contradicted=True)
from haystack.components.validators import GroundednessChecker

checker = GroundednessChecker(
    max_claims=5,
    block_contradicted=True,
)
pipeline.connect("generator.replies", "checker.replies")
pipeline.connect("retriever.documents", "checker.documents")

Output

{
    "verified_replies": ["Revenue was [CORRECTED: $2.1B] in Q3."],
    "trust_score": 0.0,
    "verdict": "has_contradictions",
    "claims": [
        {"claim": "Revenue was $2.4B", "verdict": "contradicted",
         "explanation": "Context says $2.1B", "correction": "Revenue was $2.1B"}
    ],
    "is_trusted": False,
}

Design decisions

  • Validator, not Evaluator: Lives in validators/ because it's a runtime guardrail that modifies pipeline output, not an offline metric calculator.
  • Uses ChatGenerator protocol: Works with any LLM (OpenAI, Ollama, Azure, etc.). Defaults to gpt-4o-mini with JSON mode.
  • Two-step LLM approach: First call extracts claims, second call verifies them against context. This separation makes verification more reliable than single-pass approaches.
  • Idempotent warm_up: Follows the _is_warmed_up pattern from LLMRanker.
  • Prompt injection defense: XML fencing around user-controlled content + explicit anti-injection directives in system prompts.

Tests

18 test methods covering:

  • Default/custom initialization, parameter clamping
  • No documents, empty documents, empty replies
  • Serialization round-trip (to_dict + from_dict)
  • warm_up delegation and idempotency
  • All-supported, contradicted, and blocked verdicts
  • Multiple replies in a single call
  • raise_on_failure (both True and False paths)
  • Malformed JSON from LLM (graceful degradation)

Checklist

  • New component in haystack/components/validators/
  • Registered in validators/__init__.py
  • 18 unit tests
  • Follows Haystack conventions (type annotations, docstrings, serialization, SPDX headers)
  • Works with any ChatGenerator
  • No external dependencies beyond Haystack

Adds a new validator component that verifies generated replies are grounded
in retrieved documents at runtime, not just during offline evaluation.

Addresses deepset-ai#10973.

The component:
- Sits after a Generator in a pipeline
- Extracts factual claims from the generated reply using an LLM
- Cross-references each claim against retrieved documents
- Returns per-claim verdicts (supported/contradicted/unverifiable)
- Computes an overall trust score (0-1)
- Optionally strips contradicted claims from the output

Works with any ChatGenerator (OpenAI, Ollama, Azure, etc.).
@JohnnyTarrr JohnnyTarrr requested a review from a team as a code owner April 2, 2026 15:55
@JohnnyTarrr JohnnyTarrr requested review from bogdankostic and removed request for a team April 2, 2026 15:55
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 2, 2026

@JohnnyTarrr is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 2, 2026

CLA assistant check
All committers have signed the CLA.

@DYNOSuprovo
Copy link
Copy Markdown

Hi @JohnnyTarrr, this is a very clean initial implementation. Expanding Haystack’s guardrails this quickly is exactly why I opened the original issue!

I took a look through groundedness_checker.py and I wanted to flag a critical limitation regarding context handling that I think we need to solve before this is merged.

On Line 124, the component blindly concatenates all documents: context = "\n\n".join(doc.content for doc in documents if doc.content)

And on Line 191, it enforces an arbitrary hard truncation: context=context[:8000]

The Issue: This linear concatenation is highly vulnerable to the "Lost-in-the-Middle" degradation phenomenon (Liu et al., 2023). If a Generator outputs a valid claim, but the supporting Document happens to be placed in the dead-center of a 7,500-character prompt, the LLM judge will frequently fail to recognize the evidence and issue a False Negative (contradicted or unverifiable). Furthermore, truncating rigidly at [:8000] guarantees data destruction if the Retriever passes a large context window.

Proposed Solution (Positional Context Batching): In the custom pipeline I built that sparked this feature request, I solved this by treating the chunks non-linearly:

First, rank the documents array by relevance score.
If the total context size exceeds a safe threshold (e.g., 4000 tokens), batch the verification calls rather than truncating them.
Within each batch, explicitly order the chunks so that the documents with the highest relevance scores sit at position[0] and position[-1] of the string, exploiting the LLM's primacy and recency bias.
Truncating silently at 8000 characters inside a guardrail component without warning the user is very risky for a production pipeline, because if the evidence was at character index 8001, the guardrail will forcefully strip out a truthful, factually accurate answer.

I strongly recommend we update the _verify_claims method to implement contextual batching rather than hard truncation before this goes to main. Let me know if you’d like me to draft that logic!

@JohnnyTarrr
Copy link
Copy Markdown
Author

@DYNOSuprovo Spot on — the hard truncation is a known limitation. Your batching approach is the right fix.

Here's what I think the implementation looks like:

  1. If documents have a score in their metadata (from the Retriever), sort by score descending
  2. If total context exceeds a token threshold, split into batches
  3. Within each batch, place highest-score docs at position 0 and -1 (primacy/recency)
  4. Run verification per batch, merge results
  5. Replace the hard truncation with this batched approach

Since you've already built and tested this pattern, could you draft the _verify_claims update and drop it in the PR comments? I'll integrate it and push the update with co-author credit. Want to make sure we get this right before maintainer review.

@DYNOSuprovo
Copy link
Copy Markdown

Thanks @JohnnyTarrr! Happy to collaborate on this and get it bulletproof for the maintainers.

To implement the Positional Context Batching cleanly without over-complicating the Haystack Document structure, we can intercept the documents right before they get merged into the context string.

Here is a lightweight, production-ready Python implementation of the positional batching logic. Instead of arbitrarily truncating at 8000 characters, we sort by relevance, dynamically select the chunks, and explicitly construct the string to exploit LLM primacy/recency bias:

def _build_positional_context(self, documents: list[Document], max_chars: int = 8000) -> str:
    """
    Builds a context string from a list of documents, prioritizing the most relevant 
    documents by placing them at the extreme ends of the prompt (position 0 and -1) 
    to mitigate Lost-in-the-Middle context degradation.
    """
    if not documents:
        return ""

    # Step 1: Ensure documents are sorted by relevance
    # We use a stable sort so if scores are None, initial retriever order is preserved
    ranked_docs = sorted(documents, key=lambda d: getattr(d, 'score', 0.0) or 0.0, reverse=True)
    
    # Step 2: Select documents until we hit the char limit (preventing arbitrary mid-sentence truncation where possible)
    selected_docs = []
    current_len = 0
    for doc in ranked_docs:
        content = doc.content or ""
        doc_len = len(content)
        
        # Always include at least one document, but stop if adding the next breaches the limit
        if current_len + doc_len > max_chars and selected_docs:
            break 
            
        selected_docs.append(doc)
        current_len += doc_len + 2  # +2 accounts for the "\n\n" separator
        
    # Step 3: Positional Reordering 
    # Structure: [Most Relevant] -> [Least Relevant...] -> [Second Most Relevant]
    if len(selected_docs) >= 3:
        best_doc = selected_docs[0]
        second_best_doc = selected_docs[1]
        middle_docs = selected_docs[2:]
        
        ordered_docs = [best_doc] + middle_docs + [second_best_doc]
    else:
        ordered_docs = selected_docs

    context_str = "\n\n".join(d.content for d in ordered_docs if d.content)
    
    # Hard fallback truncation just in case the first document itself exceeded max_chars
    return context_str[:max_chars]

If you add this as a helper method on the class, you can just replace your context = "\n\n".join(...) line in run() with context = self._build_positional_context(documents), and update _VERIFY_PROMPT to remove the hard slice. It natively guarantees that the two most important documents will always anchor the extremities of the prompt!

Let me know if you run into any issues integrating this.

(Note for co-author creds, if you use them: you can use Co-authored-by: DYNOSuprovo DYNOSuprovo@users.noreply.github.com in your commit message!)

Replace hard truncation with positional context batching to mitigate
Lost-in-the-Middle degradation (Liu et al., 2023). Documents are sorted
by relevance score and reordered so the most relevant docs sit at the
start and end of the context window, exploiting LLM primacy/recency bias.

Co-authored-by: DYNOSuprovo <DYNOSuprovo@users.noreply.github.com>
@JohnnyTarrr
Copy link
Copy Markdown
Author

@DYNOSuprovo Integrated — just pushed your positional context batching logic. Clean implementation, works perfectly.

Changes in the latest commit:

  • Added _build_positional_context() method — sorts by score, selects within char budget, reorders with best docs at position[0] and position[-1]
  • Removed the hard [:8000] truncation from _verify_claims
  • Context is now built through the positional batching method before being passed to verification

You're credited as co-author on the commit. Thanks for the collaboration — the component is much stronger with this.

@sjrl @julian-risch this should be ready for review now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

topic:tests type:documentation Improvements on the docs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants