| 48 | Anthropic's Haiku Beats GPT-4 Turbo in Tool Use | 2024-04-08 |
| 3 | A Systematic Workflow to Build Production-Ready LLM Applications | 2024-07-21 |
| 3 | LLM evaluation metrics for RAG, chatbots and summarization | 2024-02-13 |
| 2 | Generate synthetic data for Q&A tasks via instructor in TypeScript | 2024-07-10 |
| 2 | Building and Evaluating Evals for Retrieval | 2024-03-05 |
| 1 | Tactics for multi-step LLM app experimentation | 2024-07-25 |
| 1 | Observability and Testing of OpenAI's Assistants API | 2024-03-30 |
| 1 | LLM evals on labeled data | 2024-03-17 |
| 1 | Reproducible LLM Experimentation with DVC | 2024-02-08 |