Contextual Retrieval

Contextual Retrieval is a technique Anthropic introduced in September 2024 to fix a quiet but common weakness in retrieval-augmented generation (RAG). RAG systems typically split documents into small chunks before indexing them for search, but a chunk taken out of context can be ambiguous. A sentence saying “revenue grew 3 percent” loses its meaning once separated from the text that named the company and the quarter, so a query about that company can fail to retrieve the very chunk that answers it.

The method asks a model, in this case Claude, to generate a short piece of explanatory context for each chunk describing where it sits in the larger document, and prepends that context to the chunk before it is embedded and indexed. It applies the same idea to keyword search by building the BM25 index over the contextualized chunks as well. Anthropic reported that contextual embeddings alone reduced the top-20 retrieval failure rate (how often the relevant chunk is missing from the top 20 results) by 35 percent in relative terms, from 5.7 percent to 3.7 percent, and that combining them with contextual BM25 raised the reduction to 49 percent, down to 2.9 percent. Adding a reranking step pushed the total reduction to 67 percent, for a 1.9 percent failure rate.
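The indexing step described above can be illustrated with a minimal Python sketch. It is not Anthropic's implementation: generate_context is a hypothetical stand-in for the model call (here it simply reuses the document's first sentence), the one-sentence chunker and the ACME sample text are invented for illustration, and only the keyword (BM25) side is shown, using a common BM25 variant, because embeddings would require an external model.

```python
import math
import re
from collections import Counter


def split_into_chunks(document: str) -> list[str]:
    """Naive one-sentence-per-chunk splitter (an assumption; real systems usually chunk by tokens)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def generate_context(document: str, chunk: str) -> str:
    """Hypothetical placeholder for the LLM call that situates a chunk in its document.

    A real system would send the document and the chunk to a model; this stub
    ignores the chunk and reuses the document's first sentence as the context.
    """
    return f"From a document that begins: {split_into_chunks(document)[0]}"


def contextualize(document: str) -> list[str]:
    """Prepend the generated context to every chunk before it is indexed."""
    return [f"{generate_context(document, c)}\n{c}" for c in split_into_chunks(document)]


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, texts: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each text against the query with a common BM25 variant."""
    docs = [tokenize(t) for t in texts]
    avg_len = sum(len(d) for d in docs) / len(docs)
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_match(query: str, texts: list[str]) -> int:
    """Return the index of the highest-scoring text."""
    scores = bm25_scores(query, texts)
    return max(range(len(scores)), key=scores.__getitem__)


# Invented sample document: the revenue sentence never names the company.
document = (
    "ACME Corp filed its Q2 2023 report. "
    "Revenue grew 3 percent over the previous quarter. "
    "Headcount was flat."
)
plain_chunks = split_into_chunks(document)
contextual_chunks = contextualize(document)

plain_best = top_match("ACME revenue", plain_chunks)
contextual_best = top_match("ACME revenue", contextual_chunks)
```

In this toy corpus the plain index ranks the filing sentence first for the query "ACME revenue", because the revenue chunk never mentions the company, while the contextual index ranks the revenue chunk first. The example only shows the mechanism; it says nothing about the size of the gains on real data.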

For a business, Contextual Retrieval is a low-effort, high-impact upgrade to a RAG system: spending a little compute to enrich chunks at indexing time meaningfully raises the odds that the right information is found at query time.

Sources

Related