
Retrieval-augmented generation — RAG — has become the default architecture for enterprise AI applications that need to work with private or proprietary knowledge. Rather than relying solely on a model's training data, RAG retrieves relevant documents at query time and passes them to the model as context. The results are more accurate, more current and more trustworthy than pure generation.

But RAG has limits. And Microsoft's GraphRAG — released as open source in 2024 and now widely adopted — was designed to address them.

This post explains both architectures clearly, compares them honestly, and helps you decide which is right for your use case.


What is RAG?

Standard RAG works in two stages.

First, at indexing time, your documents are chunked into segments, passed through an embedding model, and stored as vectors in a vector database (Pinecone, Weaviate, pgvector and others are commonly used).

At query time, the user's question is embedded using the same model, and the database returns the most semantically similar chunks. Those chunks are injected into the prompt as context, and the language model generates an answer grounded in the retrieved material.

It is elegant, fast and well-understood. RAG is the right architecture for a large proportion of enterprise knowledge retrieval problems — and it is relatively straightforward to build.
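The two-stage pipeline above can be sketched in a few lines. This is a toy illustration, not a production implementation: a bag-of-words vector stands in for a real embedding model, and a plain Python list stands in for the vector database.

```python
from collections import Counter
import math

def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    # A real system would call a trained embedding model here.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Indexing time: chunk documents and store (chunk, vector) pairs.
chunks = [
    "The billing API returns invoices as JSON.",
    "Refunds are processed within five business days.",
    "The mobile app supports offline mode since v2.3.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# Query time: embed the question, rank chunks by similarity, return the top k.
def retrieve(question, k=1):
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

context = retrieve("How long do refunds take?")
# The retrieved context is then injected into the model's prompt.
```

In production the same shape holds; only the parts swap out: a real embedding model for `embed`, and a vector database's nearest-neighbour search for the sort.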

Strengths

- Fast, low-cost indexing and low-latency queries
- A very mature ecosystem of vector databases, frameworks and tooling
- Straightforward to build, operate and debug

Weaknesses

- Retrieval is chunk-local, so it struggles to connect information spread across many documents
- Multi-hop questions ("how is X related to Y?") are handled poorly
- Corpus-wide questions ("what are the main themes?") have no single chunk to retrieve


What is GraphRAG?

GraphRAG, developed by Microsoft Research and released as an open-source project in mid-2024, takes a fundamentally different approach. Rather than chunking documents and embedding them directly, it first extracts a knowledge graph from the source corpus.

During indexing, a language model reads through your documents and extracts entities — people, organisations, concepts, events — and the relationships between them. These are stored as a graph (nodes and edges) alongside community summaries that describe clusters of related entities.
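The output of that extraction step is a set of (subject, relation, object) triples assembled into a graph. The sketch below hard-codes the triples to show the shape of the data; in GraphRAG itself a language model produces them by reading the corpus, and the entity names here are invented for illustration.

```python
# Hard-coded triples standing in for LLM-extracted (subject, relation, object)
# facts. In GraphRAG, a language model extracts these from the source corpus.
triples = [
    ("Acme Corp", "acquired", "Widget Ltd"),
    ("Widget Ltd", "manufactures", "sensors"),
    ("Acme Corp", "partners_with", "Beta Inc"),
]

# Store the graph as an adjacency map: node -> list of (relation, neighbour).
# Inverse edges are added so the graph can be traversed in both directions.
graph = {}
for subject, relation, obj in triples:
    graph.setdefault(subject, []).append((relation, obj))
    graph.setdefault(obj, []).append((f"inverse_{relation}", subject))

def neighbours(entity):
    # Entities one hop away: the raw material for multi-hop reasoning.
    return [obj for _, obj in graph.get(entity, [])]
```

Because the relationships are explicit edges rather than implied by chunk proximity, a question like "what does the company Acme acquired manufacture?" becomes a two-hop traversal instead of a lucky retrieval.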

At query time, GraphRAG can operate in two modes:

- Local search: starts from the entities mentioned in the query and retrieves their graph neighbourhood, which suits specific, entity-focused questions.
- Global search: runs the question over the pre-computed community summaries and combines the partial answers, which suits corpus-wide questions but is slower.
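Global search can be pictured as a map-reduce over the community summaries. In this toy sketch, keyword matching stands in for the language model that GraphRAG uses in both the map and reduce steps, and the summaries are invented examples.

```python
# Invented community summaries; GraphRAG builds these during indexing by
# summarising clusters of related entities in the graph.
community_summaries = [
    "Community 1: Acme Corp's acquisitions and manufacturing partners.",
    "Community 2: Regulatory filings and compliance deadlines.",
]

def global_search(question):
    keywords = question.lower().split()
    # Map: score each community summary against the question.
    # (A real system asks an LLM for a partial answer per community.)
    partials = [s for s in community_summaries
                if any(k in s.lower() for k in keywords)]
    # Reduce: combine the partial answers into one response.
    return " ".join(partials)
```

The map step is why global search is slower: every relevant community is consulted, rather than a single nearest-neighbour lookup.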

Strengths

- Strong multi-hop reasoning: relationships are explicit edges, not implied by chunk proximity
- Excellent corpus-wide synthesis, because community summaries already describe the whole corpus
- Directly answers "what connects X and Y?" style questions from the graph

Weaknesses

- High indexing cost: a language model must read the entire corpus to extract the graph
- Global search is slower than a vector similarity lookup
- Re-indexing a frequently changing corpus is expensive, and the ecosystem is still maturing


Head-to-Head Summary

                        Standard RAG           GraphRAG
Best for                Document Q&A, search   Complex reasoning, synthesis
Indexing cost           Low                    High
Query speed             Fast                   Slower (global search)
Multi-hop queries       Weak                   Strong
Corpus-wide synthesis   Poor                   Excellent
Setup complexity        Low–Medium             Medium–High
Ecosystem maturity      Very mature            Maturing rapidly

Which Should You Choose?

Choose standard RAG if: your users need to find specific information within documents, your document set changes frequently, or you are building a knowledge base, internal search tool or document Q&A system. RAG is the right default for the majority of enterprise knowledge retrieval problems.

Choose GraphRAG if: your use case requires reasoning across many documents simultaneously, users ask questions like "what is the relationship between X and Y?" or "summarise the key themes across our research", or you are working with a relatively stable corpus — technical documentation, research archives, regulatory libraries.

Consider a hybrid approach if: your application needs both fast local retrieval and deeper synthesis. This is increasingly the production pattern — GraphRAG for complex analytical queries, standard RAG for specific document lookups, with a router deciding which path to use at query time.
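One common starting point for such a router is a cheap heuristic before reaching for an LLM-based classifier. The keyword list below is a hypothetical example, not a recommended production rule set:

```python
# Hypothetical heuristic router: phrasing that suggests corpus-wide synthesis
# is sent to GraphRAG; everything else takes the standard RAG path.
GLOBAL_HINTS = (
    "themes", "relationship between", "across",
    "overall", "summarise", "summarize",
)

def route(query: str) -> str:
    q = query.lower()
    return "graphrag" if any(hint in q for hint in GLOBAL_HINTS) else "standard_rag"
```

In practice teams often replace the keyword list with a small classifier or let the model itself choose a retrieval tool, but the structure stays the same: one decision point, two retrieval paths.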


The Practical Reality

GraphRAG is not a replacement for standard RAG — it is an upgrade for a specific class of problem. Most enterprise applications start with standard RAG and encounter its limits only when users begin asking questions that require connecting information across many documents.

The signal to watch for is users rephrasing the same question multiple times, or adding qualifiers like "based on everything you know" — this is usually a sign that the retrieval architecture is not giving the model what it needs to answer well.

When you hit that ceiling, GraphRAG is worth the additional complexity.


Reinvently helps organisations design and implement retrieval architectures that match their actual use case. Get in touch.
