This article demonstrates how to implement a local Retrieval-Augmented Generation (RAG) chatbot in Python using open-source components: Ollama to run the language and embedding models, and the Weaviate vector database hosted in Docker. The process involves setting up the local LLM and embedding models with Ollama, hosting a local vector database instance with Docker, and building a local RAG pipeline on top of them. Because every component runs on your own machine, your data stays local and the pipeline needs no external services or API keys.
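To make the shape of such a pipeline concrete before any services are installed, here is a minimal sketch of the retrieve-then-generate flow. It is an illustration under stated assumptions, not the article's implementation: `InMemoryStore` is a hypothetical stand-in for Weaviate, `toy_embed` stands in for an Ollama embedding model, and `echo_generate` stands in for an Ollama language model. The names `build_prompt` and `rag_answer` are likewise invented for this example.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

Embedder = Callable[[str], Sequence[float]]  # assumption: wraps an embedding model
Generator = Callable[[str], str]             # assumption: wraps a language model


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Hypothetical stand-in for the vector database."""

    def __init__(self, embed: Embedder) -> None:
        self.embed = embed
        self.items: List[Tuple[str, Sequence[float]]] = []

    def add(self, text: str) -> None:
        # Store each document next to its embedding.
        self.items.append((text, self.embed(text)))

    def search(self, query: str, k: int = 2) -> List[str]:
        # Rank stored documents by similarity to the query embedding.
        q = self.embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question: str, contexts: List[str]) -> str:
    """Place the retrieved passages ahead of the user's question."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {question}"


def rag_answer(question: str, store: InMemoryStore, generate: Generator, k: int = 2) -> str:
    """Retrieve the top-k passages, then hand the augmented prompt to the model."""
    return generate(build_prompt(question, store.search(question, k)))


# Toy stand-ins so the sketch runs without Ollama or Weaviate.
VOCAB = ("ollama", "weaviate", "docker", "model", "vector")


def toy_embed(text: str) -> List[float]:
    # Bag-of-words counts over a tiny fixed vocabulary; not a real embedding.
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(w)) for w in VOCAB]


def echo_generate(prompt: str) -> str:
    # Returns the prompt unchanged so the retrieved context can be inspected.
    return prompt


store = InMemoryStore(toy_embed)
store.add("Ollama serves the language model locally")
store.add("Weaviate is a vector database run with Docker")
print(rag_answer("Which vector store runs in Docker", store, echo_generate, k=1))
```

Passing the embedder and generator in as plain callables keeps the retrieval logic independent of any particular service, so the toy functions could later be swapped for wrappers around locally hosted models and a real vector database without changing `rag_answer`.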