RAG Evaluation Using Ragas

What's this blog post about?

Retrieval Augmented Generation (RAG) is an approach to building AI-powered chatbots that answer questions using retrieved data rather than only what the model was trained on. However, natural language retrieval accuracy remains low, so experiments are needed to tune RAG parameters before deployment. Large Language Models (LLMs) are increasingly used as judges for RAG evaluation, which automates and scales evaluation and saves the time and cost of manual human labeling. Two primary flavors of LLM-as-judge for RAG evaluation are MT-Bench and Ragas, with the latter emphasizing automation and scalability. The key data points needed for a Ragas evaluation are the question, the retrieved contexts, the generated answer, and the ground truth answer.
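The four data points named in the summary can be pictured as one evaluation record per question. The following is a minimal sketch in plain Python, not the Ragas API itself: the `EvalRecord` class, the `to_columns` helper, and the sample text are hypothetical, and the column names `question`, `contexts`, `answer`, and `ground_truth` are assumed to mirror the four fields listed above. Check the Ragas documentation for the exact schema your version expects.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EvalRecord:
    """One evaluation sample (hypothetical helper, not part of Ragas)."""
    question: str        # the user's question
    contexts: List[str]  # text chunks returned by the retriever
    answer: str          # the answer generated by the RAG pipeline
    ground_truth: str    # the human-written reference answer


def to_columns(records: List[EvalRecord]) -> Dict[str, list]:
    """Convert records into a column-oriented dict, one list per field."""
    if not records:
        raise ValueError("at least one record is required")
    for i, record in enumerate(records):
        if not record.contexts:
            raise ValueError(f"record {i} has no retrieved contexts")
    return {
        "question": [r.question for r in records],
        "contexts": [r.contexts for r in records],
        "answer": [r.answer for r in records],
        "ground_truth": [r.ground_truth for r in records],
    }


# Illustrative, made-up sample; real records come from running your pipeline.
records = [
    EvalRecord(
        question="What does RAG stand for?",
        contexts=["Retrieval Augmented Generation (RAG) is an approach "
                  "to building AI-powered chatbots."],
        answer="RAG stands for Retrieval Augmented Generation.",
        ground_truth="Retrieval Augmented Generation",
    )
]
columns = to_columns(records)
```

A column-oriented dict like `columns` is a common input shape for dataset libraries, which is why the sketch uses it; whether it can be passed to an evaluator unchanged depends on the library version.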

Company
Zilliz

Date published
March 18, 2024

Author(s)
Christy Bergman

Word count
1018

Language
English

Hacker News points
2