Galileo Luna: Advancing LLM Evaluation Beyond GPT-3.5

Company

Galileo

Date Published

June 11, 2024

Author

Pratik Bhavsar

Word count

1065

Language

English

Hacker News points

None

URL

www.galileo.ai/blog/galileo-luna-breakthrough-in-llm-evaluation-beating-gpt-3-5-and-ragas

Summary

Galileo Luna is a family of Evaluation Foundation Models (EFMs) fine-tuned specifically for hallucination detection in retrieval-augmented generation (RAG) settings. It outperforms GPT-3.5 and commercial evaluation frameworks while significantly reducing cost and latency. Luna performs well on the RAGTruth dataset and generalizes across industries and use cases, including finance, numerical reasoning, biomedical research, legal, and general knowledge. The model is optimized to process up to 16k input tokens in under one second on low-cost general-purpose GPUs, achieving a 97% reduction in cost and a 96% reduction in latency compared to GPT-3.5-based approaches. Luna's dynamic windowing technique ensures that the full input is validated and significantly improves hallucination detection accuracy, making the model a practical fit for industry LLM applications.
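
The summary names a dynamic windowing technique but does not describe how it works. The sketch below is only one plausible reading, not Luna's actual implementation: a long retrieved context is split into overlapping windows, each window is scored against the response, and the response counts as supported if any window supports it. The function names, the window size and overlap defaults, the max aggregation, and the toy `overlap_scorer` (a stand-in for a real evaluation model) are all assumptions made for illustration.

```python
from typing import Callable, List

# A scorer takes (context_window, response) and returns a support score in [0, 1].
Scorer = Callable[[List[str], List[str]], float]


def split_into_windows(tokens: List[str], window_size: int, overlap: int) -> List[List[str]]:
    """Split tokens into overlapping windows so no part of the input is dropped."""
    if window_size <= 0 or not 0 <= overlap < window_size:
        raise ValueError("need window_size > 0 and 0 <= overlap < window_size")
    step = window_size - overlap
    windows: List[List[str]] = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + window_size])
        if start + window_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return windows


def support_score(
    context_tokens: List[str],
    response_tokens: List[str],
    scorer: Scorer,
    window_size: int = 512,  # hypothetical default, not taken from the source
    overlap: int = 64,       # hypothetical default, not taken from the source
) -> float:
    """Score the response against every window and keep the best score.

    Assumption: a response is treated as supported if at least one
    context window supports it, hence the max over windows.
    """
    windows = split_into_windows(context_tokens, window_size, overlap)
    if not windows:
        return 0.0
    return max(scorer(window, response_tokens) for window in windows)


def overlap_scorer(window: List[str], response: List[str]) -> float:
    """Toy stand-in for a model: fraction of response tokens present in the window."""
    if not response:
        return 1.0
    window_set = set(window)
    return sum(token in window_set for token in response) / len(response)


if __name__ == "__main__":
    context = "the capital of france is paris and the capital of spain is madrid".split()
    print(support_score(context, ["madrid"], overlap_scorer, window_size=4, overlap=1))
    print(support_score(context, ["berlin"], overlap_scorer, window_size=4, overlap=1))
```

In this toy setup a low score would flag a possible hallucination. A real system would replace `overlap_scorer` with a trained model and would likely window the response as well as the context.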