Confident AI Blog

32 blog posts published by month since the start of 2023.

Blog URL

Posts year-to-date

6 (12 posts by this month last year)

Average posts per month since 2023

0.9

Post details (2023 to today)

Title	Author	Date	Word count	HN points
The Comprehensive Guide to LLM Security	Kritin Vongthongsri	Aug 19, 2024	2366	1
Evaluating LLM Systems: Essential Metrics, Benchmarks, and Best Practices	Jeffrey Ip	Jul 17, 2024	3747	-
Why OpenAI Assistants is a Big Win for LLM Evaluation	Jeffrey Ip	Apr 06, 2024	1169	-
Become a Prompt Artist: Understanding the Midjourney LLM	Jeffrey Ip	Apr 06, 2024	1700	-
LLM Testing in 2024: Top Methods and Strategies	Jeffrey Ip	Jun 24, 2024	1958	1
A Step-By-Step Guide to Evaluating an LLM Text Summarization Task	Jeffrey Ip	Apr 06, 2024	1443	3
A Gentle Introduction to LLM Evaluation	Jeffrey Ip	Apr 06, 2024	1883	-
Generating synthetic data with LLMs - Part 1	Jeffrey Ip	Apr 06, 2024	793	-
Building a customer support chatbot using GPT-3.5 and LlamaIndex	Jeffrey Ip	Apr 06, 2024	1329	-
Why we replaced Pinecone with PGVector	Jeffrey Ip	Apr 06, 2024	1016	3
Using LLMs for Synthetic Data Generation: The Definitive Guide	Kritin Vongthongsri	Jun 11, 2024	1744	1
An Introduction to LLM Red Teaming	Kritin Vongthongsri	Jul 30, 2024	2365	-
How to Build an LLM Evaluation Framework, from Scratch	Jeffrey Ip	Jun 24, 2024	2342	2
RAG Evaluation: The Definitive Guide to Unit Testing RAG in CI/CD	Jeffrey Ip	Apr 14, 2024	1722	4
LLM Evaluation Metrics: The Ultimate LLM Evaluation Guide	Jeffrey Ip	Jul 09, 2024	4321	7
An Introduction to LLM Benchmarking	Jeffrey Ip	Jul 17, 2024	2911	-
How to build a PDF QA chatbot using OpenAI and ChromaDB	Jeffrey Ip	Apr 06, 2024	1275	-
The Ultimate Guide to Fine-Tune LLaMA 3, With LLM Evaluations	Jeffrey Ip	Apr 19, 2024	1691	-
What is Retrieval Augmented Generation (RAG)?	Jeffrey Ip	Apr 06, 2024	1200	1
LLM Benchmarks: Everything on MMLU, HellaSwag, BBH, and Beyond	Kritin Vongthongsri	Aug 19, 2024	2266	1
How to Evaluate LLM Applications: The Complete Guide	Jeffrey Ip	Apr 06, 2024	2312	-
Leveraging LLM-as-a-Judge for Automated and Scalable Evaluation	Jeffrey Ip	Sep 24, 2024	2508	-
LLM Chatbot Evaluation Explained: Top Metrics and Testing Techniques	Jeffrey Ip	Oct 05, 2024	2365	3
What is LLM Observability? - The Ultimate LLM Monitoring Guide	Kritin Vongthongsri	Oct 30, 2024	2694	-
The Comprehensive LLM Safety Guide: Navigate AI regulations and Best Practices for LLM Safety	Kritin Vongthongsri	Nov 03, 2024	2342	-
How to Jailbreak LLMs One Step at a Time: Top Techniques and Strategies	Kritin Vongthongsri	Oct 30, 2024	2206	-
OWASP Top 10 2025 for LLM Applications: What’s new? Risks, and Mitigation Techniques	Kritin Vongthongsri	Jan 19, 2025	3590	-
The People's Choice of Top LLM Evaluation Tools in 2025	Jeffrey Ip	Jan 18, 2025	1829	-
LLM Guardrails: The Ultimate Guide to Safeguard LLM Systems	Jeffrey Ip	Jan 26, 2025	3024	-
LLM Agent Evaluation: Assessing Tool Use, Task Completion, Agentic Reasoning, and More	Kritin Vongthongsri	Jan 31, 2025	2702	-
How I Built Deterministic LLM Evaluation Metrics for DeepEval	Jeffrey Ip	Feb 09, 2025	2335	-
How I raised Confident AI's $2.2M seed round in 5 days	Jeffrey Ip	Mar 20, 2025	1962	4
