Retrieval-augmented generation — RAG — has become the default architecture for enterprise AI applications that need to work with private or proprietary knowledge. Rather than relying solely on a model's training data, RAG retrieves relevant documents at query time and passes them to the model as context. The results are more accurate, more current and more trustworthy than pure generation.
But RAG has limits. And Microsoft's GraphRAG — released as open source in 2024 and now widely adopted — was designed to address them.
This post explains both architectures clearly, compares them honestly, and helps you decide which is right for your use case.
What is RAG?
Standard RAG works in two stages.
First, at indexing time, your documents are chunked into segments, passed through an embedding model, and stored as vectors in a vector database (Pinecone, Weaviate, pgvector and others are commonly used).
Second, at query time, the user's question is embedded using the same model, and the database returns the most semantically similar chunks. Those chunks are injected into the prompt as context, and the language model generates an answer grounded in the retrieved material.
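The two stages can be sketched in a few lines. This is illustrative only: a toy bag-of-words counter stands in for a real embedding model, and an in-memory list stands in for the vector database.

```python
# Minimal sketch of the two RAG stages. The bag-of-words "embedding" and
# in-memory index are stand-ins for a real embedding model and vector DB.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in for an embedding model: term counts as a sparse vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Indexing time: chunk the documents and store (chunk, vector) pairs.
chunks = [
    "GraphRAG builds a knowledge graph from the corpus",
    "Vector databases store embeddings for similarity search",
    "Chunking splits documents into retrievable segments",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# Query time: embed the question, rank chunks by similarity, and inject
# the top results into the prompt sent to the language model.
def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

context = retrieve("how are documents split into chunks?")
prompt = "Answer using this context:\n" + "\n".join(context)
```

A production system swaps in a real model and database, but the shape stays the same: everything hinges on the query vector landing near the right chunks.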
It is elegant, fast and well-understood. RAG is the right architecture for a large proportion of enterprise knowledge retrieval problems — and it is relatively straightforward to build.
Strengths
- Fast to implement and iterate on
- Works well for document search and question-answering over large corpora
- Scales well with document volume
- Retrieval quality is easy to evaluate and improve
- Wide ecosystem of tools, frameworks (LangChain, LlamaIndex) and managed services
Weaknesses
- Retrieves chunks, not concepts — misses connections between pieces of information spread across multiple documents
- Poor at answering questions that require synthesising information from many sources simultaneously
- Loses structural and relational context between entities
- "Needle in a haystack" retrieval depends heavily on the quality of the query embedding
What is GraphRAG?
GraphRAG, developed by Microsoft Research and released as an open-source project in mid-2024, takes a fundamentally different approach. Rather than chunking documents and embedding them directly, it first extracts a knowledge graph from the source corpus.
During indexing, a language model reads through your documents and extracts entities — people, organisations, concepts, events — and the relationships between them. These are stored as a graph (nodes and edges) alongside community summaries that describe clusters of related entities.
At query time, GraphRAG can operate in two modes:
- Local search — traverses the graph from entities relevant to the query, pulling in their relationships and associated source text
- Global search — generates answers by reasoning over the community summaries, enabling queries that require synthesising information across the entire corpus
Strengths
- Excels at complex, multi-hop questions that require connecting information across documents
- Preserves relationships and context that vector search discards
- Global search enables genuine corpus-wide synthesis — "What are the main themes across all our research?" becomes answerable
- Reduces hallucination on relational queries
Weaknesses
- Significantly more expensive to index — LLM calls required to extract the graph
- Indexing is slower — not suitable for rapidly changing document sets without incremental graph updates
- More complex to deploy and maintain
- Overkill for simple question-answering use cases
- Graph quality depends on the accuracy of the extraction model
Head-to-Head Summary
| | Standard RAG | GraphRAG |
|---|---|---|
| Best for | Document Q&A, search | Complex reasoning, synthesis |
| Indexing cost | Low | High |
| Query speed | Fast | Slower (global search) |
| Multi-hop queries | Weak | Strong |
| Corpus-wide synthesis | Poor | Excellent |
| Setup complexity | Low–Medium | Medium–High |
| Ecosystem maturity | Very mature | Maturing rapidly |
Which Should You Choose?
Choose standard RAG if: your users need to find specific information within documents, your document set changes frequently, or you are building a knowledge base, internal search tool or document Q&A system. RAG is the right default for the majority of enterprise knowledge retrieval problems.
Choose GraphRAG if: your use case requires reasoning across many documents simultaneously, users ask questions like "what is the relationship between X and Y?" or "summarise the key themes across our research", or you are working with a relatively stable corpus — technical documentation, research archives, regulatory libraries.
Consider a hybrid approach if: your application needs both fast local retrieval and deeper synthesis. This is increasingly the production pattern — GraphRAG for complex analytical queries, standard RAG for specific document lookups, with a router deciding which path to use at query time.
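The routing step can start out very simple. The sketch below uses a keyword heuristic in place of a real classifier; the cue list is invented for illustration, and production routers often use a small LLM call or trained classifier instead.

```python
# Minimal sketch of a query router for the hybrid pattern. The keyword
# heuristic and cue list are illustrative stand-ins for a real classifier.
SYNTHESIS_CUES = ("themes", "relationship between", "across", "summarise", "overall")

def route(query: str) -> str:
    """Send corpus-wide or relational questions to GraphRAG global search,
    and specific lookups to standard RAG retrieval."""
    q = query.lower()
    if any(cue in q for cue in SYNTHESIS_CUES):
        return "graphrag"
    return "standard_rag"

route("What are the main themes across our research?")   # -> "graphrag"
route("What is the refund policy in the 2024 handbook?") # -> "standard_rag"
```

The design point is that the expensive path only runs when the query actually needs it, which keeps latency and cost close to standard RAG for the bulk of traffic.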
The Practical Reality
GraphRAG is not a replacement for standard RAG — it is an upgrade for a specific class of problem. Most enterprise applications start with standard RAG and encounter its limits only when users begin asking questions that require connecting information across many documents.
The signal to watch for is users rephrasing the same question multiple times, or adding qualifiers like "based on everything you know" — this is usually a sign that the retrieval architecture is not giving the model what it needs to answer well.
When you hit that ceiling, GraphRAG is worth the additional complexity.
Reinvently helps organisations design and implement retrieval architectures that match their actual use case. Get in touch.