Running a local RAG pipeline with Verba and Llama3 via Ollama
Retrieval Augmented Generation (RAG) is a design pattern that lets applications leverage large language models with contextual data that was not part of the model's training set. Weaviate has built Verba, an example RAG implementation that showcases various chunking and retrieval techniques for building RAG applications. This blog post explores how to run Verba locally against Weaviate Cloud or a local Weaviate instance, and how to connect it to Ollama for local inference. It also discusses alternative deployment options for Weaviate, including Weaviate Cloud, Weaviate Enterprise Cloud, Dockerized Weaviate, and Kubernetes environments via Helm charts.
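As a rough illustration of the pattern the post describes, the sketch below retrieves context chunks from a local Weaviate instance and then generates an answer with Llama3 through Ollama. The collection name `VerbaChunk` and its `content` property are hypothetical placeholders, not Verba's actual internal schema, and the snippet assumes the collection has a vectorizer module configured.

```python
# Minimal RAG sketch: retrieve context from a local Weaviate instance,
# then generate an answer with Llama3 via Ollama.
# Assumes `pip install weaviate-client ollama`, Weaviate running on its
# default local ports, and Ollama serving the llama3 model locally.
import weaviate
import ollama

query = "How does Verba chunk documents?"

# Connect to a locally running Weaviate (Docker default ports).
client = weaviate.connect_to_local()
try:
    # Hypothetical collection/property names for illustration only;
    # Verba manages its own schema internally.
    chunks = client.collections.get("VerbaChunk")
    # near_text requires a vectorizer module on the collection.
    results = chunks.query.near_text(query=query, limit=3)
    context = "\n\n".join(obj.properties["content"] for obj in results.objects)
finally:
    client.close()

# Ask Llama3 to answer using only the retrieved context.
response = ollama.chat(
    model="llama3",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ],
)
print(response["message"]["content"])
```

This is only a sketch of the retrieve-then-generate flow; Verba itself wires these steps together behind its UI, so none of this code is needed to run it.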
Company: Weaviate
Date published: July 9, 2024
Author(s): Adam Chan
Word count: 1974
Language: English
Hacker News points: None found.