RAG is about Context not Vectors

Company

Composable

Date Published

May 1, 2024

Author

Eric Barroca

Word count

838

Language

English

Hacker News points

None

URL

becomposable.com/blog/rag-is-about-context

Summary

In architectures that deliver AI-powered features, Retrieval-Augmented Generation (RAG) is essentially a variable expansion technique: it adds variables to a prompt and populates them with data retrieved from a database or other source, giving the Large Language Model (LLM) the context it needs to generate better answers. RAG is needed because LLMs are stateless and lack memory; adding context to the prompt simulates memory and knowledge, which lets the model reason better and answer more accurately. Vector search is a related technique that uses high-dimensional representations of content to find similar content, but it is not a one-size-fits-all solution and should be treated as one retrieval method within an efficient RAG strategy rather than a substitute for it. The key to effective RAG is defining the context each task needs and retrieving data that precisely matches it, rather than relying on vector search alone.
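
To make the "variable expansion" framing concrete, here is a minimal sketch in Python. It is an illustration, not the article's implementation: the `CUSTOMERS` store, the field names, and the `retrieve_context` and `build_prompt` helpers are all hypothetical, and the lookup is a plain exact-match query to show that retrieval does not have to involve vectors.

```python
from string import Template

# Hypothetical in-memory "database"; a real system might query SQL,
# a search index, an API, or a vector store, depending on the task.
CUSTOMERS = {
    "c-42": {
        "name": "Ada",
        "plan": "Pro",
        "open_tickets": ["Export fails on large files"],
    },
}

# The prompt is a template whose variables are filled in at request time.
PROMPT = Template(
    "You are a support assistant.\n"
    "Customer: $name (plan: $plan)\n"
    "Open tickets: $tickets\n"
    "Question: $question\n"
)


def retrieve_context(customer_id: str) -> dict:
    """Fetch exactly the data this task needs (assumed exact-match lookup)."""
    record = CUSTOMERS[customer_id]
    return {
        "name": record["name"],
        "plan": record["plan"],
        "tickets": "; ".join(record["open_tickets"]) or "none",
    }


def build_prompt(customer_id: str, question: str) -> str:
    """Expand the template variables with the retrieved context."""
    return PROMPT.substitute(question=question, **retrieve_context(customer_id))


if __name__ == "__main__":
    print(build_prompt("c-42", "Why does my export keep failing?"))
```

The design point is that the task (answering a support question for a known customer) determines which variables the prompt needs, and the retrieval step is whatever query fills them most precisely. A similarity search could replace or complement `retrieve_context` when no exact key is available.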