This tutorial demonstrates how to build a retrieval-augmented generation (RAG) application with large language models (LLMs) without relying on OpenAI. The process involves four steps: serving embeddings with BentoML, inserting data into a vector database, setting up an LLM for RAG, and providing instructions to the LLM. The example uses BentoML's Sentence Transformers Embeddings repository to serve the embedding model, a local Milvus instance started with Docker Compose as the vector database, and the Nous Hermes fine-tuned Mixtral model, accessed through OctoAI, as the LLM.
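As a rough preview of how these pieces fit together, here is a minimal sketch of the end-to-end flow. It assumes a BentoML embedding service running on localhost:3000 that exposes an `encode` endpoint, a 384-dimensional embedding model, a default local Milvus deployment, and OctoAI's OpenAI-compatible API with a hypothetical model id; these specifics are illustrative assumptions, not details from the tutorial itself.

```python
import bentoml
from openai import OpenAI
from pymilvus import MilvusClient

# 1. Serve embeddings: call a locally running BentoML embedding service.
#    The `encode` endpoint name and `sentences` parameter are assumptions
#    about how the Sentence Transformers service is defined.
embed_client = bentoml.SyncHTTPClient("http://localhost:3000")

def embed(texts: list[str]) -> list[list[float]]:
    return embed_client.encode(sentences=texts)

# 2. Insert data into Milvus (local instance started with Docker Compose).
#    dimension=384 assumes a MiniLM-class embedding model.
milvus = MilvusClient(uri="http://localhost:19530")
if milvus.has_collection("rag_docs"):
    milvus.drop_collection("rag_docs")
milvus.create_collection(collection_name="rag_docs", dimension=384)

docs = ["BentoML serves ML models as APIs.", "Milvus is a vector database."]
vectors = embed(docs)
milvus.insert(
    collection_name="rag_docs",
    data=[
        {"id": i, "vector": v, "text": t}
        for i, (v, t) in enumerate(zip(vectors, docs))
    ],
)

# 3. Retrieve context for a question and instruct the LLM through OctoAI's
#    OpenAI-compatible API.
question = "What does Milvus do?"
hits = milvus.search(
    collection_name="rag_docs",
    data=embed([question]),
    limit=2,
    output_fields=["text"],
)
context = "\n".join(hit["entity"]["text"] for hit in hits[0])

llm = OpenAI(base_url="https://text.octoai.run/v1", api_key="YOUR_OCTOAI_TOKEN")
answer = llm.chat.completions.create(
    model="nous-hermes-2-mixtral-8x7b-dpo",  # model id is an assumption
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```

The rest of the tutorial walks through each of these steps in detail.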