GraphRAG
Answers live in the relationships.
When to use
When answers require CONNECTING scattered facts across documents ("who worked on project X using tech Y at company Z?"), reasoning over relationships/entities (org charts, dependencies, citations, supply chains), or a global view ("what are the main themes across the whole corpus?"). ❌ Overkill when the answer fits in a single passage — the graph-building cost is not worth it.
Real-world examples
- Investigation/legal: trace person–company–transaction relationships to surface hidden links.
- M&A due-diligence: "who does this company indirectly own / connect to?" across many layers.
- Codebase analysis: trace dependency chains between modules/functions to assess change impact.
- Medical records: connect symptom – diagnosis – medication across a patient’s many visits.
Diagram
Illustrative pipeline diagram; see the step-by-step description in the Pipeline flow section below.
Pipeline flow
- 1. Query → Entity Extractor
- 2. Knowledge Graph: nodes = entities (Person, Company, Project, Tech, Location), edges = relationships (works_at, located_in, related_to)
- 3. Subgraph Retrieval (relevant graph region)
- 4. Community Summaries (cluster summaries)
- 5. LLM → Answer
In plain words
Like a detective’s case board: suspect photos pinned to a wall, linked by strings labeled "knows", "worked with". To answer "who is connected to whom", the detective FOLLOWS the strings — instead of reading each file in isolation. GraphRAG builds exactly that "string wall" over your documents.
Concept A–Z
Vector RAG only finds passages "similar" to the query; it cannot CONNECT facts spread across different passages. Ask "Which engineers once shared a project with An and now work at a competitor?" and vector search stalls, because no single passage contains the whole answer. GraphRAG (popularized by Microsoft Research) addresses this: use an LLM to EXTRACT entities + relationships from documents → build a KNOWLEDGE GRAPH (nodes = entities, edges = relationships). At query time: find entities in the question → fetch the graph region around them (subgraph) → traverse edges to gather connected facts. Adding community detection (Leiden) to cluster nodes and summarize each cluster lets it answer "global, whole-corpus" questions that vector retrieval alone handles poorly.
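To make the "follow the strings" idea concrete, here is a minimal sketch of the example question answered by edge traversal. Everything in it is an assumption for illustration: the triples, the relation names worked_on / works_at / competitor_of, and the helper functions are hypothetical, and a plain in-memory index stands in for a real graph database.

```python
from collections import defaultdict

# Hypothetical triples, in the (subject, relation, object) shape an LLM
# extractor might emit. Names and relations are invented for illustration.
TRIPLES = [
    ("An", "worked_on", "Project Atlas"),
    ("Binh", "worked_on", "Project Atlas"),
    ("Chi", "worked_on", "Project Atlas"),
    ("Dung", "worked_on", "Project Borealis"),
    ("An", "works_at", "Acme"),
    ("Binh", "works_at", "Rival Corp"),
    ("Chi", "works_at", "Acme"),
    ("Dung", "works_at", "Rival Corp"),
    ("Rival Corp", "competitor_of", "Acme"),
]


def build_graph(triples):
    """Index edges by (node, relation) in both directions."""
    out_edges = defaultdict(set)  # (subject, relation) -> objects
    in_edges = defaultdict(set)   # (object, relation) -> subjects
    for subj, rel, obj in triples:
        out_edges[(subj, rel)].add(obj)
        in_edges[(obj, rel)].add(subj)
    return out_edges, in_edges


def colleagues_now_at_competitor(person, out_edges, in_edges):
    """People who shared a project with `person` and work at a competitor."""
    # Hop 1: projects the person worked on.
    projects = out_edges[(person, "worked_on")]
    # Hop 2: everyone else who worked on those projects.
    colleagues = set()
    for project in projects:
        colleagues |= in_edges[(project, "worked_on")]
    colleagues.discard(person)
    # Hop 3: competitors of the person's employer (either edge direction).
    competitors = set()
    for employer in out_edges[(person, "works_at")]:
        competitors |= in_edges[(employer, "competitor_of")]
        competitors |= out_edges[(employer, "competitor_of")]
    # Hop 4: keep colleagues whose current employer is a competitor.
    return sorted(
        c for c in colleagues if out_edges[(c, "works_at")] & competitors
    )


out_edges, in_edges = build_graph(TRIPLES)
result = colleagues_now_at_competitor("An", out_edges, in_edges)
print(result)  # ['Binh']
```

No single triple answers the question; the answer only appears after chaining four hops, which is the part similarity search over isolated passages does not do.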
How it works
Indexing: building the graph (the costly part)
This is the heavy (cost/time) step: run an LLM over each chunk to extract triples (subject, relation, object) + attributes, resolve duplicate entities, then store in a graph DB.
- Extract entities + relations with an LLM following your schema (Person, Org, Tech…).
- Entity resolution: merge "Nguyen An", "An", "Mr. An" into one node. Missed or wrong merges propagate to every later step, so this largely decides overall quality.
- Community detection (Leiden) clusters nodes; an LLM summarizes each cluster (bottom-up) to serve global questions.
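The entity-resolution step above can be sketched as a canonicalization pass over the extracted triples. This is a simplified illustration, not a production approach: the ALIASES table is hand-written and hypothetical, whereas a real pipeline would more likely derive merges from string similarity, embeddings, or an LLM judgment.

```python
# Hypothetical alias table mapping normalized surface forms to one canonical
# node name. Hand-written here purely for illustration.
ALIASES = {
    "an": "Nguyen An",
    "mr. an": "Nguyen An",
    "nguyen an": "Nguyen An",
}


def canonical(name: str) -> str:
    """Normalize case and whitespace, then look up the canonical name."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, name.strip())


def resolve(triples):
    """Rewrite both endpoints to canonical names and drop duplicate triples."""
    seen = set()
    resolved = []
    for subj, rel, obj in triples:
        triple = (canonical(subj), rel, canonical(obj))
        if triple not in seen:
            seen.add(triple)
            resolved.append(triple)
    return resolved


# Four raw triples that mention the same person under different names.
raw = [
    ("Nguyen An", "works_at", "Acme"),
    ("Mr. An", "works_at", "Acme"),
    ("An", "worked_on", "Project Atlas"),
    ("  nguyen  an ", "worked_on", "Project Atlas"),
]

resolved = resolve(raw)
nodes = {s for s, _, _ in resolved} | {o for _, _, o in resolved}
print(resolved)
print(sorted(nodes))
```

Without the merge, the graph would hold four separate person nodes and a traversal starting from "Nguyen An" would never reach "Project Atlas".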
Query: local vs global
GraphRAG has two distinct answering modes.
- Local search: question about a specific entity → find the node → expand 1–2 hops along edges (multi-hop) → feed subgraph + source chunks to the LLM.
- Global search: broad question ("main trends?") → scan community summaries, map-reduce to synthesize a corpus-wide answer.
- Often COMBINED with vectors (hybrid): vectors find the entry point, the graph expands relationships.
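The two modes can be sketched side by side. This is a rough illustration under stated assumptions: the edges and community summaries are invented, local_search is a plain breadth-first expansion, and keyword_map / join_reduce are trivial stand-ins for the LLM calls a real system would make in the map and reduce steps.

```python
from collections import defaultdict, deque

# Hypothetical edges and community summaries, invented for illustration.
EDGES = [
    ("An", "works_at", "Acme"),
    ("An", "worked_on", "Project Atlas"),
    ("Binh", "worked_on", "Project Atlas"),
    ("Binh", "works_at", "Rival Corp"),
    ("Rival Corp", "located_in", "Hanoi"),
]
SUMMARIES = [
    "Hiring trends: engineers move between Acme and Rival Corp.",
    "Office logistics: parking and catering.",
]


def local_search(entry, edges, max_hops=2):
    """Return edges whose endpoints both lie within max_hops of the entry node."""
    adjacency = defaultdict(set)
    for subj, _, obj in edges:
        adjacency[subj].add(obj)
        adjacency[obj].add(subj)  # expand along edges in both directions
    distance = {entry: 0}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return [e for e in edges if e[0] in distance and e[2] in distance]


def global_search(question, summaries, map_fn, reduce_fn):
    """Map each community summary to a partial answer, then reduce them."""
    partials = [map_fn(question, summary) for summary in summaries]
    partials = [p for p in partials if p]  # drop communities with nothing relevant
    return reduce_fn(question, partials)


def keyword_map(question, summary):
    """Stand-in for an LLM map call: keep summaries sharing a keyword."""
    def words(text):
        return {w.strip("?.,:").lower() for w in text.split() if len(w) > 3}
    return summary if words(question) & words(summary) else ""


def join_reduce(question, partials):
    """Stand-in for an LLM reduce call: concatenate the partial answers."""
    return " ".join(partials)


subgraph = local_search("An", EDGES, max_hops=2)
answer = global_search(
    "What are the main hiring trends?", SUMMARIES, keyword_map, join_reduce
)
print(subgraph)
print(answer)
```

In the hybrid setup from the last bullet, the entry node passed to local_search would typically come from a vector lookup over entity descriptions rather than being supplied by hand.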