Confident AI Blog

35 blog posts published since the start of 2024.

Blog URL

www.confident-ai.com/blog

Posts year-to-date

9 (compared with 12 posts by this month last year)

Average posts per month since 2024

1.5

Post details (2024 to today)

Title	Author	Date	Word count	HN points
The Ultimate LLM Evaluation Playbook: Why It Didn't Work For…	Jeffrey Ip	May 03, 2025	3973	-
The G-Eval Guide to LLM Evaluation: Simply Explained	Kritin Vongthongsri	Apr 30, 2025	3925	-
Top LLM Evaluators for Testing LLM Systems at Scale	Jeffrey Ip	Apr 22, 2025	3227	-
How I raised Confident AI's $2.2M seed round in 5 days	Jeffrey Ip	Mar 20, 2025	1962	4
How I Built Deterministic LLM Evaluation Metrics for De…	Jeffrey Ip	Feb 09, 2025	2335	-
LLM Agent Evaluation: Assessing Tool Use, Task Completion, A…	Kritin Vongthongsri	Jan 31, 2025	2702	-
LLM Guardrails: The Ultimate Guide to Safeguard LLM Sys…	Jeffrey Ip	Jan 26, 2025	3024	-
OWASP Top 10 2025 for LLM Applications: What’s new? Risks, a…	Kritin Vongthongsri	Jan 19, 2025	3590	-
The People's Choice of Top LLM Evaluation Tools in 2025	Jeffrey Ip	Jan 18, 2025	1829	-
The Comprehensive LLM Safety Guide: Navigate AI regulations …	Kritin Vongthongsri	Nov 03, 2024	2342	-
How to Jailbreak LLMs One Step at a Time: Top Techniques and…	Kritin Vongthongsri	Oct 30, 2024	2206	-
What is LLM Observability? - The Ultimate LLM Monitoring Gui…	Kritin Vongthongsri	Oct 30, 2024	2694	-
LLM Chatbot Evaluation Explained: Top Metrics and Testing Te…	Jeffrey Ip	Oct 05, 2024	2365	3
Leveraging LLM-as-a-Judge for Automated and Scalable Evaluat…	Jeffrey Ip	Sep 24, 2024	2508	-
LLM Benchmarks: Everything on MMLU, HellaSwag, BBH, and Beyo…	Kritin Vongthongsri	Aug 19, 2024	2266	1
The Comprehensive Guide to LLM Security	Kritin Vongthongsri	Aug 19, 2024	2366	1
An Introduction to LLM Red Teaming	Kritin Vongthongsri	Jul 30, 2024	2365	-
An Introduction to LLM Benchmarking	Jeffrey Ip	Jul 17, 2024	2911	-
Evaluating LLM Systems: Essential Metrics, Benchmarks, and B…	Jeffrey Ip	Jul 17, 2024	3747	-
LLM Evaluation Metrics: The Ultimate LLM Evaluation Guide	Jeffrey Ip	Jul 09, 2024	4321	7
How to Build an LLM Evaluation Framework, from Scratch	Jeffrey Ip	Jun 24, 2024	2342	2
LLM Testing in 2024: Top Methods and Strategies	Jeffrey Ip	Jun 24, 2024	1958	1
Using LLMs for Synthetic Data Generation: The Definitive Gui…	Kritin Vongthongsri	Jun 11, 2024	1744	1
The Ultimate Guide to Fine-Tune LLaMA 3, With LLM Evaluation…	Jeffrey Ip	Apr 19, 2024	1691	-
RAG Evaluation: The Definitive Guide to Unit Testing RAG in …	Jeffrey Ip	Apr 14, 2024	1722	4
How to Evaluate LLM Applications: The Complete Guide	Jeffrey Ip	Apr 06, 2024	2312	-
What is Retrieval Augmented Generation (RAG)?	Jeffrey Ip	Apr 06, 2024	1200	1
How to build a PDF QA chatbot using OpenAI and ChromaDB	Jeffrey Ip	Apr 06, 2024	1275	-
Why we replaced Pinecone with PGVector	Jeffrey Ip	Apr 06, 2024	1016	3
Building a customer support chatbot using GPT-3.5 and lLamaI…	Jeffrey Ip	Apr 06, 2024	1329	-
Generating synthetic data with LLMs - Part 1	Jeffrey Ip	Apr 06, 2024	793	-
A Gentle Introduction to LLM Evaluation	Jeffrey Ip	Apr 06, 2024	1883	-
A Step-By-Step Guide to Evaluating an LLM Text Summarization…	Jeffrey Ip	Apr 06, 2024	1443	3
Become a Prompt Artist: Understanding the Midjourney LLM	Jeffrey Ip	Apr 06, 2024	1700	-
Why OpenAI Assistants is a Big Win for LLM Evaluation	Jeffrey Ip	Apr 06, 2024	1169	-
