Company: LangChain
Date Published: -
Author: -
Word count: 857
Language: English
Hacker News points: None

Summary

Evaluations are crucial for building reliable LLM-powered applications and agents, but writing them from scratch can be challenging. Two new packages, openevals and agentevals, provide a set of pre-built evaluators and a common framework to help developers get started. These packages distill common evaluation patterns and best practices into ready-to-use solutions, making it easier to create reliable evaluations. They cover a range of use cases, including LLM-as-a-judge evaluations for natural language outputs, structured data evaluations for extracting information from documents, and agent evaluations for assessing the trajectories of actions an agent takes. Both packages also integrate with LangSmith for tracking results over time and sharing them with a team.
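
As an illustration (not taken from the original post), here is a minimal sketch of an LLM-as-a-judge evaluation with openevals, assuming the create_llm_as_judge helper and the prebuilt CORRECTNESS_PROMPT; exact parameter names and the result format may differ across package versions.

    from openevals.llm import create_llm_as_judge
    from openevals.prompts import CORRECTNESS_PROMPT

    # Build an evaluator that uses an LLM judge to grade correctness
    # against a reference answer.
    correctness_evaluator = create_llm_as_judge(
        prompt=CORRECTNESS_PROMPT,
        feedback_key="correctness",
        model="openai:o3-mini",
    )

    # Score a single input/output pair; the sample strings here are
    # hypothetical placeholders.
    eval_result = correctness_evaluator(
        inputs="How far is the moon from Earth?",
        outputs="About 384,000 km on average.",
        reference_outputs="Roughly 384,400 kilometers.",
    )
    print(eval_result)  # e.g. a dict with the feedback key, a score, and a comment

The same evaluator can also be run inside a LangSmith experiment, so that scores are logged and compared across runs over time.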