This project lets users upload a PDF and hold Q&A sessions about its contents using open-source LLMs. The application is built on top of Falcon, Chroma, and LangChain, streams answers as they are generated, and supports concurrent users. Two question-answering modes are available: Basic Mode for straightforward, standalone questions and Conversational Mode for follow-up queries that need the chat history as context. The project uses fine-tuned Falcon models, including the 7B variant, which can be hosted on a single 24 GB GPU machine, and exposes repetition-penalty and randomness controls to tune the generated answers. The application is fully open source, requires no OpenAI API key, and can be launched on a local machine or hosted on Lambda Cloud for demonstration purposes.
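
The sketch below shows how such a pipeline can be wired together with LangChain, Chroma, and a Falcon model, covering both answering modes and the generation controls mentioned above. The model checkpoint, chunking parameters, and chain classes are illustrative assumptions and may differ from this project's actual implementation.

```python
# Minimal sketch of a PDF Q&A pipeline with LangChain, Chroma, and a Falcon model.
# Model name, chunk sizes, and generation parameters are illustrative assumptions,
# not this project's exact configuration.
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA, ConversationalRetrievalChain

# 1. Load the uploaded PDF and split it into chunks for retrieval.
docs = PyPDFLoader("uploaded.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# 2. Embed the chunks and index them in a Chroma vector store.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(chunks, embeddings)

# 3. Load a Falcon model (the 7B variant fits on a 24 GB GPU in half precision)
#    with repetition-penalty and randomness (temperature) controls.
model_id = "tiiuae/falcon-7b-instruct"  # assumed checkpoint; the project uses fine-tuned Falcon models
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")
generate = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,          # randomness control
    repetition_penalty=1.15,  # repetition penalty
)
llm = HuggingFacePipeline(pipeline=generate)

# 4a. Basic Mode: answer each question independently from the retrieved chunks.
basic_qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectordb.as_retriever())
print(basic_qa.run("What is the main conclusion of the document?"))

# 4b. Conversational Mode: keep chat history so follow-up questions have context.
conv_qa = ConversationalRetrievalChain.from_llm(llm=llm, retriever=vectordb.as_retriever())
history = []
result = conv_qa({"question": "Summarize the second section.", "chat_history": history})
history.append(("Summarize the second section.", result["answer"]))
```

Streaming and concurrency are handled by the serving layer rather than the chains themselves; the snippet above only illustrates how the retrieval and generation pieces fit together.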