
Use Deep Memory to Boost RAG Apps' Accuracy by up to +22%

What's this blog post about?

Retrieval Augmented Generation (RAG) systems, which provide context to Large Language Models (LLMs), are currently used by enterprises for applications such as documenting internal processes and automating customer support. The utility of these implementations depends on retrieval accuracy, which for typical RAG applications tops out at around 70%. Several techniques have been employed to improve it, including feature engineering, fine-tuning embeddings, hybrid or lexical search, reranking final results with cross-encoders, and context-aware fine-tuning of LLMs. However, these methods offer only marginal improvements that do not fundamentally change the user experience of LLM apps.

Deep Memory is a new solution that increases Deep Lake's vector search accuracy by up to 22% by learning an index from labeled queries tailored to your application, without impacting search time. This can be achieved with just a few hundred example pairs of prompt embeddings and the most relevant answers from the vector store. After training, vector search is used as usual, and results can be improved further by combining Deep Memory with lexical search or rerankers. Deep Memory improves retrieval accuracy without altering existing workflows and can significantly reduce inference costs via lower token usage.

Health tech startup Munai, backed by the Bill & Melinda Gates Foundation, achieved an 18.6% boost in vector search accuracy across medical documents using Deep Memory. The solution is now generally available with the latest Deep Lake version.
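The workflow described above (supply a few hundred labeled query/answer pairs, train, then run vector search as usual) can be sketched roughly as follows with the Deep Lake Python VectorStore API. This is a minimal illustration, not the blog post's own code: the dataset path, example queries, relevance labels, and the embedding_function stub are placeholders, and exact signatures may vary across Deep Lake versions.

    from deeplake import VectorStore

    def embedding_function(texts):
        # Placeholder: return one embedding vector per input text,
        # e.g. from OpenAI or any other embedding model.
        raise NotImplementedError

    # Connect to an existing managed vector store (Deep Memory runs
    # on the managed Tensor Database runtime).
    db = VectorStore("hub://<org>/<dataset>", runtime={"tensor_db": True})

    # Train Deep Memory on a few hundred labeled pairs: each query is
    # paired with the ids of its relevant documents plus a relevance score.
    queries = ["How do I reset my password?", "What is the refund policy?"]
    relevance = [
        [("doc_id_123", 1)],  # relevant chunk(s) for the first query
        [("doc_id_456", 1)],  # relevant chunk(s) for the second query
    ]
    job_id = db.deep_memory.train(
        queries=queries,
        relevance=relevance,
        embedding_function=embedding_function,
    )
    db.deep_memory.status(job_id)  # training runs asynchronously

    # Once trained, vector search works exactly as before; the learned
    # index is enabled with a single flag.
    results = db.search(
        embedding_data="How can I change my password?",
        embedding_function=embedding_function,
        deep_memory=True,
    )

Note that, consistent with the summary above, nothing in the retrieval path changes apart from the deep_memory flag, which is how existing workflows stay intact.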

Company
Activeloop

Date published
Sept. 28, 2023

Author(s)
Davit Buniatyan

Word count
1294

Language
English

Hacker News points
None found.

