27 Hacker News submissions by month

27 submissions with 1 point or more

| HN Points | HN Title (links to original post) | Submitted Date |
|---|---|---|
| 35 | Unit Test LlamaIndex with DeepEval | 2023-08-28 |
| 9 | Tackling the Weaknesses of BertScore | 2023-08-16 |
| 2 | Auto-Evaluation of LLMs with DeepEval | 2023-09-01 |
| 2 | DeepEval GuardRails – AI Alignment | 2023-09-30 |
| 2 | Test for LLM Hallucinations | 2023-08-31 |
| 2 | Framework for evaluating LLM outputs with ML models | 2023-08-25 |
| 2 | How to test LLM is non-toxic before pushing to prod | 2023-08-22 |
| 1 | Testing for Image Similarity with DeepEval | 2023-10-02 |
| 1 | Evaluating LLMs for Lawyers | 2023-09-25 |
| 1 | How to Evaluate LangChain QA Retrieval | 2023-09-23 |
| 1 | PDB Support for DeepEval | 2023-09-07 |
| 1 | Test for Bias After Finetuning LLMs | 2023-09-02 |
| 1 | Measure Answer Relevancy of LLMs | 2023-09-02 |
| 1 | Testing Rank Similarity for Rag | 2023-08-26 |
| 7 | Everything I know about LLM evaluation metrics | 2024-01-24 |
| 4 | Best Practices for Unit Testing RAG Systems in Prod | 2024-02-06 |
| 3 | We Replaced Pinecone with PGVector | 2023-11-01 |
| 3 | How to evaluate multi-turn LLM chatbots | 2024-10-08 |
| 3 | I used QAG to implement an LLM text summarization evals | 2023-12-19 |
| 2 | How to build your own LLM evaluation framework | 2024-04-15 |
| 1 | We wrote a comprehensive guide on LLM security | 2024-08-20 |
| 1 | How to generate synthetic data using SOTA data evolution methods | 2024-05-21 |
| 1 | Overview of All Major LLM Benchmarks | 2024-03-22 |
| 1 | Best practices I learnt from helping health tech enterprise test LLMs | 2024-02-27 |
| 1 | What Is RAG? (With Examples) | 2023-12-01 |
| 1 | Be confident about your LLM stack | 2023-08-15 |
| 4 | YC helped us raise our seed round in 5 days | 2025-03-20 |