
RAG Without OpenAI: BentoML, OctoAI and Milvus

Blog post from Zilliz

Post Details
Company: Zilliz
Date Published:
Author: Yujian Tang
Word Count: 2,820
Company Posts That Month: 21
Language: English
Hacker News Points: -
Summary

This tutorial demonstrates how to build retrieval-augmented generation (RAG) applications with large language models (LLMs) without relying on OpenAI. The process has four steps: serving embeddings with BentoML, inserting data into a vector database, setting up an LLM, and giving the LLM its instructions. The key components are BentoML for serving embeddings, OctoAI for accessing open-source models, and Milvus as the vector database. The example uses BentoML's Sentence Transformers Embeddings repository, a local Milvus instance run with Docker Compose, and the Nous Hermes fine-tuned Mixtral model from OctoAI.
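
To make the flow of those steps concrete, here is a minimal, dependency-free sketch of the same pipeline shape: embed, insert, retrieve, then build the instructions for the LLM. It is not the code from the original post. The bag-of-words `embed` function is a toy stand-in for an embedding service such as one served with BentoML, `ToyVectorStore` is a hypothetical in-memory stand-in for Milvus, and `build_prompt` only assembles the text that would be sent to a hosted model; no LLM is called. All names and sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a served model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database such as Milvus."""

    def __init__(self) -> None:
        self.rows: list[tuple[Counter, str]] = []

    def insert(self, text: str) -> None:
        # Store the vector next to the original text so it can be returned later.
        self.rows.append((embed(text), text))

    def search(self, query: str, limit: int = 2) -> list[str]:
        # Rank every stored row by similarity to the query vector.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:limit]]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the instructions and retrieved context for the LLM."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{joined}\n"
        f"Question: {question}"
    )


store = ToyVectorStore()
for doc in [
    "Milvus is a vector database",
    "BentoML serves machine learning models",
    "Paris is the capital of France",
]:
    store.insert(doc)

question = "What is Milvus"
context = store.search(question, limit=1)
prompt = build_prompt(question, context)
print(prompt)
```

In a real deployment, each stand-in would be replaced by the corresponding service: the embedding call by a request to the embedding server, the store by a vector database collection, and the final prompt by a chat completion request to the hosted model.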

Trends Found in this Post
Trend          Post Mentions   Total Month Mentions   Posts   Companies   MoM
Vector Search  51              2,613                  257     91          +44%
RAG            24              1,795                  223     72          +55%
LLM            15              3,398                  379     136         +44%