Corrective RAG (CRAG)
Grade the retrieval before you trust it.
When to use
When you must sharply REDUCE hallucination: confidently wrong answers are costly (medical, finance, legal, official support), or the knowledge base is often incomplete or stale, so you need to grade relevance first and then rewrite the query or fall back to web search. It is a safety layer you BOLT ON to any RAG pipeline. ❌ Skip it when wrong answers are cheap and you need low latency.
Real-world examples
- Medical/legal/finance assistant: where a confidently-wrong answer can cause real harm.
- Official brand support chatbot: must not fabricate policies/terms when docs are missing.
- Q&A over an often-stale knowledge base: grader detects gaps → web fallback for fresh info.
- A system that must be willing to say "insufficient data" instead of guessing (e.g. compliance advice).
Diagram
Illustrative pipeline diagram; see the step-by-step description in the Pipeline flow section below.
Pipeline flow
- 1. Query → Retriever → Retrieved Docs
- 2. Evaluator / Grader scores relevance
- 3. IF CORRECT → LLM → Answer
- 4. IF AMBIGUOUS → Query Rewriter → re-retrieve
- 5. IF INCORRECT → Web Search Fallback → LLM → Answer
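The five steps above can be wired together in a few lines. The following is a minimal sketch, not a reference implementation: `crag_answer`, the `Grade` enum, and the five callables (`retrieve`, `grade`, `rewrite`, `web_search`, `generate`) are hypothetical names you would back with your own retriever, grader, and LLM. It also assumes a single rewrite attempt, and that a query still graded AMBIGUOUS after that rewrite is sent to the web fallback.

```python
from enum import Enum
from typing import Callable, List


class Grade(Enum):
    CORRECT = "correct"
    AMBIGUOUS = "ambiguous"
    INCORRECT = "incorrect"


def crag_answer(
    query: str,
    retrieve: Callable[[str], List[str]],          # hypothetical: corpus retriever
    grade: Callable[[str, List[str]], Grade],      # hypothetical: relevance grader
    rewrite: Callable[[str], str],                 # hypothetical: query rewriter
    web_search: Callable[[str], List[str]],        # hypothetical: web fallback
    generate: Callable[[str, List[str]], str],     # hypothetical: LLM call
) -> str:
    docs = retrieve(query)                  # step 1: query -> retriever -> docs
    verdict = grade(query, docs)            # step 2: grader scores relevance

    if verdict is Grade.AMBIGUOUS:          # step 4: rewrite once, re-retrieve, re-grade
        query = rewrite(query)
        docs = retrieve(query)
        verdict = grade(query, docs)

    if verdict is Grade.CORRECT:            # step 3: answer from the corpus docs
        return generate(query, docs)

    # step 5: INCORRECT (or, by assumption, still AMBIGUOUS) -> web fallback
    return generate(query, web_search(query))
```

Passing the stages in as callables keeps the control flow testable with stubs before any model or search API is attached.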
In plain words
Like a fact-checking editor standing in the middle: before you cite a document, they grade whether it’s TRULY relevant. Relevant → use it; ambiguous → tell you to rephrase; way off → send you to another source. Better to say "insufficient data" than let you cite junk and be confidently wrong.
Concept A–Z
RAG’s classic problem: if retrieval returns IRRELEVANT docs, the LLM still "tries" to answer → confident hallucination. Corrective RAG inserts a CHECKPOINT: after retrieval, a Grader (a light LLM or a classifier) scores whether the docs are TRULY relevant, then BRANCHES: (1) CORRECT → use them to answer; (2) AMBIGUOUS → rewrite the query and re-retrieve; (3) INCORRECT → drop the corpus docs and fall back to another source (web search). Philosophy: "don’t blindly trust what you retrieved; grade it first." The cost is one extra grading step per query; in exchange, the LLM no longer has to answer from docs the grader flagged as irrelevant.
How it works
The Grader: scoring relevance
The grader returns a label (correct/ambiguous/incorrect) or a 0–1 relevance score for each doc, judged against the query.
- Can be a light LLM with a prompt "does this doc contain info to answer the question?" → yes/no/partly.
- Or a small classifier (cheap, fast) trained for relevance.
- Use two thresholds: above the upper one = correct, below the lower one = incorrect, in between = ambiguous.
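The two-threshold rule can be sketched as a small function. This is an illustration under stated assumptions: the default cut-offs `0.3` and `0.7` are placeholders to tune on your own data, scores that fall exactly on a threshold are treated as ambiguous, and `label_retrieval` aggregates per-doc scores by taking the best one (one strongly relevant doc is enough to call the retrieval correct). None of these choices come from the text above.

```python
from typing import List


def label_from_score(score: float, lower: float = 0.3, upper: float = 0.7) -> str:
    """Map one 0-1 relevance score to a label using two thresholds."""
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= lower < upper <= 1")
    if score > upper:
        return "correct"
    if score < lower:
        return "incorrect"
    return "ambiguous"  # assumption: boundary values count as ambiguous


def label_retrieval(scores: List[float], lower: float = 0.3, upper: float = 0.7) -> str:
    """Label a whole retrieval from its per-doc scores (assumed rule: use the best doc)."""
    if not scores:
        return "incorrect"  # nothing retrieved, so nothing to trust
    return label_from_score(max(scores), lower, upper)
```

For example, `label_retrieval([0.1, 0.8])` yields `"correct"` because one doc clears the upper threshold, while `label_retrieval([0.1, 0.5])` yields `"ambiguous"`.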
Three corrective branches
Depending on the grade, the system self-corrects its path.
- CORRECT: refine to the relevant parts (knowledge refinement) → answer + cite.
- AMBIGUOUS: query rewrite (clarify/expand) → re-retrieve; may loop 1–2 times.
- INCORRECT: admit the corpus lacks it → web fallback (or return "insufficient data" instead of fabricating).
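The branch-specific behaviour above (refinement, a bounded rewrite loop, and abstaining) can be sketched as follows. All names here (`refine`, `resolve_context`, `answer_or_abstain`, `INSUFFICIENT`) are hypothetical. The sentence splitting on `.` is deliberately naive, the loop limit of 2 mirrors the "1–2 times" above, and treating a query that is still ambiguous after the last rewrite as INCORRECT is an assumption of this sketch.

```python
from typing import Callable, List, Optional, Tuple

INSUFFICIENT = "Insufficient data to answer this question."


def refine(doc: str, is_relevant: Callable[[str], bool]) -> List[str]:
    """CORRECT branch: keep only the sentences the relevance check accepts."""
    sentences = [s.strip() for s in doc.split(".") if s.strip()]  # naive splitter
    return [s for s in sentences if is_relevant(s)]


def resolve_context(
    query: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, List[str]], str],   # returns "correct"/"ambiguous"/"incorrect"
    rewrite: Callable[[str], str],
    web_search: Optional[Callable[[str], List[str]]] = None,
    max_rewrites: int = 2,
) -> Tuple[str, List[str]]:
    """Return (source, context), where source is "corpus", "web", or "none"."""
    docs = retrieve(query)
    label = grade(query, docs)
    rewrites = 0
    # AMBIGUOUS branch: rewrite and re-retrieve, but never loop unbounded.
    while label == "ambiguous" and rewrites < max_rewrites:
        query = rewrite(query)
        docs = retrieve(query)
        label = grade(query, docs)
        rewrites += 1
    if label == "correct":
        return "corpus", docs
    # INCORRECT branch (or rewrites exhausted): web fallback if one is configured.
    if web_search is not None:
        return "web", web_search(query)
    return "none", []


def answer_or_abstain(source: str, context: List[str],
                      generate: Callable[[List[str]], str]) -> str:
    """Abstain explicitly instead of generating from an empty context."""
    return INSUFFICIENT if source == "none" else generate(context)
```

Returning the source alongside the context lets the caller cite corpus docs and web results differently, and makes the abstain path explicit rather than leaving it to the LLM.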