Company Data Deep Dive
Confident AI
blog content
35 blog posts published by month since the start of 2024.
Posts published per month (month-year: posts):
1-2024: 0
2-2024: 0
3-2024: 0
4-2024: 12
5-2024: 0
6-2024: 3
7-2024: 4
8-2024: 2
9-2024: 1
10-2024: 3
11-2024: 1
12-2024: 0
1-2025: 4
2-2025: 1
3-2025: 1
4-2025: 2
5-2025: 1
Blog URL: www.confident-ai.com/blog
Posts year-to-date: 9 (12 posts by this month last year.)
Average posts per month since 2024: 1.5
Post details (2024 to today)
Showing 1 to 35 of 35 entries
Title | Author | Date | Word count | HN points
The Ultimate LLM Evaluation Playbook: Why It Didn't Work For… | Jeffrey Ip | May 03, 2025 | 3973 | -
The G-Eval Guide to LLM Evaluation: Simply Explained | Kritin Vongthongsri | Apr 30, 2025 | 3925 | -
Top LLM Evaluators for Testing LLM Systems at Scale | Jeffrey Ip | Apr 22, 2025 | 3227 | -
How I raised Confident AI's $2.2M seed round in 5 days | Jeffrey Ip | Mar 20, 2025 | 1962 | 4
How I Built Deterministic LLM Evaluation Metrics for De… | Jeffrey Ip | Feb 09, 2025 | 2335 | -
LLM Agent Evaluation: Assessing Tool Use, Task Completion, A… | Kritin Vongthongsri | Jan 31, 2025 | 2702 | -
LLM Guardrails: The Ultimate Guide to Safeguard LLM Sys… | Jeffrey Ip | Jan 26, 2025 | 3024 | -
OWASP Top 10 2025 for LLM Applications: What’s new? Risks, a… | Kritin Vongthongsri | Jan 19, 2025 | 3590 | -
The People's Choice of Top LLM Evaluation Tools in 2025 | Jeffrey Ip | Jan 18, 2025 | 1829 | -
The Comprehensive LLM Safety Guide: Navigate AI regulations … | Kritin Vongthongsri | Nov 03, 2024 | 2342 | -
How to Jailbreak LLMs One Step at a Time: Top Techniques and… | Kritin Vongthongsri | Oct 30, 2024 | 2206 | -
What is LLM Observability? - The Ultimate LLM Monitoring Gui… | Kritin Vongthongsri | Oct 30, 2024 | 2694 | -
LLM Chatbot Evaluation Explained: Top Metrics and Testing Te… | Jeffrey Ip | Oct 05, 2024 | 2365 | 3
Leveraging LLM-as-a-Judge for Automated and Scalable Evaluat… | Jeffrey Ip | Sep 24, 2024 | 2508 | -
LLM Benchmarks: Everything on MMLU, HellaSwag, BBH, and Beyo… | Kritin Vongthongsri | Aug 19, 2024 | 2266 | 1
The Comprehensive Guide to LLM Security | Kritin Vongthongsri | Aug 19, 2024 | 2366 | 1
An Introduction to LLM Red Teaming | Kritin Vongthongsri | Jul 30, 2024 | 2365 | -
An Introduction to LLM Benchmarking | Jeffrey Ip | Jul 17, 2024 | 2911 | -
Evaluating LLM Systems: Essential Metrics, Benchmarks, and B… | Jeffrey Ip | Jul 17, 2024 | 3747 | -
LLM Evaluation Metrics: The Ultimate LLM Evaluation Guide | Jeffrey Ip | Jul 09, 2024 | 4321 | 7
How to Build an LLM Evaluation Framework, from Scratch | Jeffrey Ip | Jun 24, 2024 | 2342 | 2
LLM Testing in 2024: Top Methods and Strategies | Jeffrey Ip | Jun 24, 2024 | 1958 | 1
Using LLMs for Synthetic Data Generation: The Definitive Gui… | Kritin Vongthongsri | Jun 11, 2024 | 1744 | 1
The Ultimate Guide to Fine-Tune LLaMA 3, With LLM Evaluation… | Jeffrey Ip | Apr 19, 2024 | 1691 | -
RAG Evaluation: The Definitive Guide to Unit Testing RAG in … | Jeffrey Ip | Apr 14, 2024 | 1722 | 4
How to Evaluate LLM Applications: The Complete Guide | Jeffrey Ip | Apr 06, 2024 | 2312 | -
What is Retrieval Augmented Generation (RAG)? | Jeffrey Ip | Apr 06, 2024 | 1200 | 1
How to build a PDF QA chatbot using OpenAI and ChromaDB | Jeffrey Ip | Apr 06, 2024 | 1275 | -
Why we replaced Pinecone with PGVector | Jeffrey Ip | Apr 06, 2024 | 1016 | 3
Building a customer support chatbot using GPT-3.5 and lLamaI… | Jeffrey Ip | Apr 06, 2024 | 1329 | -
Generating synthetic data with LLMs - Part 1 | Jeffrey Ip | Apr 06, 2024 | 793 | -
A Gentle Introduction to LLM Evaluation | Jeffrey Ip | Apr 06, 2024 | 1883 | -
A Step-By-Step Guide to Evaluating an LLM Text Summarization… | Jeffrey Ip | Apr 06, 2024 | 1443 | 3
Become a Prompt Artist: Understanding the Midjourney LLM | Jeffrey Ip | Apr 06, 2024 | 1700 | -
Why OpenAI Assistants is a Big Win for LLM Evaluation | Jeffrey Ip | Apr 06, 2024 | 1169 | -