Company
Date Published
Aug. 5, 2024
Author
Ofer Mendelevitch & Forrest Bao & Miaoran Li & Rogger Luo
Word count
1634
Language
English
Hacker News points
2

Summary

HHEM-2.1 is an improved version of HHEM-2.0 that outperforms both GPT-3.5-Turbo and GPT-4 at hallucination detection in three languages: English, French, and German. It has been integrated into Vectara's RAG-as-a-service platform and is included automatically with every call to the Query API, making it easier for enterprise developers to build trusted GenAI applications. Compared with its predecessors, HHEM-2.1 detects hallucinations more accurately, with better recall and precision in identifying hallucinations where they occur. In benchmarks against GPT-3.5-Turbo and GPT-4, it achieves a higher F1 score, precision, and recall. HHEM-2.1 is also available as an open-source model on Hugging Face and Kaggle, giving developers a more accessible option. Finally, the new model powers a revamped HHEM leaderboard that ranks LLMs by their likelihood to hallucinate and more accurately reflects their true hallucination rates.
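
The summary compares detectors by precision, recall, and F1 score. As a reminder of how these three metrics relate, here is a minimal Python sketch that scores a hallucination detector against ground-truth labels. The function name `score_detector`, the `DetectionScores` container, and the example labels are hypothetical and made up for illustration; they are not taken from the HHEM benchmark or from any Vectara API.

```python
from dataclasses import dataclass


@dataclass
class DetectionScores:
    precision: float
    recall: float
    f1: float


def score_detector(predicted, actual):
    """Score a detector where True means "this response is hallucinated".

    predicted: list of bool, the detector's verdicts (hypothetical data).
    actual:    list of bool, the ground-truth labels (hypothetical data).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)

    # Precision: of the responses flagged, how many were truly hallucinated.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the truly hallucinated responses, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return DetectionScores(precision, recall, f1)


# Made-up labels for five responses, purely illustrative.
predicted = [True, True, False, False, True]
actual = [True, False, True, False, True]
scores = score_detector(predicted, actual)
print(scores)
```

A detector can trade one metric for the other: flagging everything maximizes recall at the cost of precision, which is why F1 is reported alongside both.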