Top 5 RAG
04 · Intermediate → Advanced

Corrective RAG (CRAG)

Grade the retrieval before you trust it.

When to use

Use it when you must sharply REDUCE hallucination: confidently wrong answers are costly (medical, finance, legal, official support), or the knowledge base is often incomplete or stale, so you need to grade relevance first and then rewrite the query or fall back to web search. It is a safety layer you can BOLT ON to any RAG pipeline. ❌ Skip it when wrong answers are cheap and low latency matters more.

Real-world examples

  • Medical/legal/finance assistant: where a confidently-wrong answer can cause real harm.
  • Official brand support chatbot: must not fabricate policies/terms when docs are missing.
  • Q&A over an often-stale knowledge base: grader detects gaps → web fallback for fresh info.
  • A system that must be willing to say "insufficient data" instead of guessing (e.g. compliance advice).

Diagram

Illustrative pipeline diagram; see the step-by-step description in the Pipeline flow section below.

Pipeline flow

  1. Query → Retriever → Retrieved Docs
  2. Evaluator / Grader scores relevance
  3. IF CORRECT → LLM → Answer
  4. IF AMBIGUOUS → Query Rewriter → re-retrieve
  5. IF INCORRECT → Web Search Fallback → LLM → Answer
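
The five steps above can be read as a small routing function. The sketch below is only an illustration: `retrieve`, `grade`, `rewrite`, `web_search` and `generate` are hypothetical callables you would supply yourself, not functions from any particular library. It also assumes that a query still graded ambiguous after one rewrite goes to the web fallback, which the flow above does not specify.

```python
from typing import Callable, List


def corrective_rag(
    query: str,
    retrieve: Callable[[str], List[str]],          # hypothetical: query -> docs
    grade: Callable[[str, List[str]], str],        # hypothetical: -> "correct" | "ambiguous" | "incorrect"
    rewrite: Callable[[str], str],                 # hypothetical: query -> clearer query
    web_search: Callable[[str], List[str]],        # hypothetical: query -> web snippets
    generate: Callable[[str, List[str]], str],     # hypothetical: LLM call
) -> str:
    docs = retrieve(query)                 # step 1: Query -> Retriever -> Retrieved Docs
    label = grade(query, docs)             # step 2: grader scores relevance

    if label == "ambiguous":               # step 4: rewrite once, then re-retrieve and re-grade
        query = rewrite(query)
        docs = retrieve(query)
        label = grade(query, docs)

    if label == "correct":                 # step 3: trusted docs go to the LLM
        return generate(query, docs)

    # step 5: incorrect (or, by assumption, still ambiguous) -> web fallback
    return generate(query, web_search(query))
```

Passing the components in as arguments keeps the routing logic testable with simple stubs before any real retriever or LLM is wired in.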

In plain words

Like a fact-checking editor standing in the middle: before you cite a document, they grade whether it’s TRULY relevant. Relevant → use it; ambiguous → tell you to rephrase; way off → send you to another source. Better to say "insufficient data" than let you cite junk and be confidently wrong.

Concept A–Z

RAG’s classic problem: if retrieval returns IRRELEVANT docs, the LLM still "tries" to answer → confident hallucination. Corrective RAG inserts a CHECKPOINT: after retrieval, a Grader (a light LLM or a classifier) scores whether the docs are TRULY relevant, then BRANCHES: (1) CORRECT → use them to answer; (2) AMBIGUOUS → rewrite the query and re-retrieve; (3) INCORRECT → drop the corpus docs and fall back to another source (web search). Philosophy: "don’t blindly trust what you retrieved; grade it first." The price is one extra grading step per query (plus an occasional rewrite or web search), in exchange for fewer hallucinations caused by irrelevant retrieval.

How it works

The Grader: scoring relevance

The grader compares each doc against the query and returns either a label (correct/ambiguous/incorrect) or a 0–1 relevance score.

  • Can be a light LLM with a prompt "does this doc contain info to answer the question?" → yes/no/partly.
  • Or a small classifier (cheap, fast) trained for relevance.
  • Use two thresholds: above the upper one = correct, below the lower one = incorrect, in between = ambiguous.
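
As a rough sketch of the options above: the helpers below map either an LLM's yes/no/partly reply or a 0–1 score to one of the three labels. The threshold values (0.3 and 0.7), the function names, and the rule for combining per-doc scores into one retrieval-level label (any doc above the upper threshold → correct, all docs below the lower threshold → incorrect) are assumptions for illustration; tune or replace them for your own data.

```python
from typing import List

LOWER = 0.3  # assumed value: below this a doc counts as irrelevant
UPPER = 0.7  # assumed value: above this a doc counts as relevant

VERDICT_TO_LABEL = {"yes": "correct", "partly": "ambiguous", "no": "incorrect"}


def label_from_verdict(reply: str) -> str:
    """Map a light LLM's yes/no/partly reply to a label.

    Unrecognized replies are treated as ambiguous (an assumption).
    """
    return VERDICT_TO_LABEL.get(reply.strip().lower().rstrip("."), "ambiguous")


def label_score(score: float, lower: float = LOWER, upper: float = UPPER) -> str:
    """Two-threshold rule for a single doc's 0-1 relevance score."""
    if score > upper:
        return "correct"
    if score < lower:
        return "incorrect"
    return "ambiguous"


def label_retrieval(scores: List[float], lower: float = LOWER, upper: float = UPPER) -> str:
    """Combine per-doc scores into one label for the whole retrieval."""
    if not scores:                          # nothing retrieved: treat as incorrect
        return "incorrect"
    if any(s > upper for s in scores):      # at least one clearly relevant doc
        return "correct"
    if all(s < lower for s in scores):      # every doc clearly irrelevant
        return "incorrect"
    return "ambiguous"
```

For example, `label_retrieval([0.1, 0.8])` yields "correct" because one doc clears the upper threshold, while `label_retrieval([0.1, 0.5])` yields "ambiguous".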

Three corrective branches

Depending on the grade, the system self-corrects its path.

  • CORRECT: refine to the relevant parts (knowledge refinement) → answer + cite.
  • AMBIGUOUS: query rewrite (clarify/expand) → re-retrieve; may loop 1–2 times.
  • INCORRECT: admit the corpus lacks it → web fallback (or return "insufficient data" instead of fabricating).
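
One possible way to wire the three branches together, with a bounded rewrite loop and an explicit "no context" outcome, is sketched below. Everything here is hypothetical scaffolding: `retrieve`, `score_docs`, `rewrite` and `web_search` are stand-in callables, the thresholds are assumed values, and "knowledge refinement" is simplified to keeping only the docs that clear the upper threshold.

```python
from typing import Callable, List, Optional, Tuple


def refine(docs: List[str], scores: List[float], upper: float) -> List[str]:
    """Simplified knowledge refinement: keep only docs above the upper threshold."""
    return [d for d, s in zip(docs, scores) if s > upper]


def resolve_context(
    query: str,
    retrieve: Callable[[str], List[str]],                   # hypothetical retriever
    score_docs: Callable[[str, List[str]], List[float]],    # hypothetical grader, 0-1 per doc
    rewrite: Callable[[str], str],                          # hypothetical query rewriter
    web_search: Optional[Callable[[str], List[str]]] = None,
    max_rewrites: int = 2,                                  # "may loop 1-2 times"
    lower: float = 0.3,                                     # assumed threshold
    upper: float = 0.7,                                     # assumed threshold
) -> Tuple[str, List[str]]:
    """Return (source, context); source is "corpus", "web", or "none"."""
    for attempt in range(max_rewrites + 1):
        docs = retrieve(query)
        scores = score_docs(query, docs)
        if any(s > upper for s in scores):        # CORRECT: refine and use
            return "corpus", refine(docs, scores, upper)
        if all(s < lower for s in scores):        # INCORRECT (also when nothing was retrieved)
            break
        if attempt < max_rewrites:                # AMBIGUOUS: rewrite and try again
            query = rewrite(query)
    if web_search is not None:                    # fallback source, if configured
        return "web", web_search(query)
    return "none", []                             # caller should answer "insufficient data"
```

When the result is `("none", [])`, the caller can return "insufficient data" rather than asking the LLM to answer without grounding.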
