Building A Graph & LLM-Powered RAG Application from PDF Documents

Company

Neo4j

Date Published

Jan. 19, 2024

Author

Fanghua Yu

Word count

1102

Language

English

Hacker News points

None

URL

neo4j.com/blog/developer/graph-llm-rag-application-pdf-documents

Summary

A Neo4j field engineer presents a step-by-step walkthrough of building a Retrieval-Augmented Generation (RAG) application from PDF documents using GenAI-Stack and OpenAI. The project uses LLM Sherpa for PDF parsing and content extraction, Neo4j AuraDB for knowledge storage, and OpenAI models for embedding and text generation. The walkthrough also covers Neo4j AuraDB setup, data ingestion with Python, the Neo4j vector index for semantic search, and GenAI-Stack for fast prototyping. Together these steps form an end-to-end pipeline: parsing and ingesting PDF documents, creating a knowledge graph, and retrieving from that graph in response to natural-language questions.
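
The shape of that pipeline (parse, ingest, embed, retrieve) can be illustrated with a minimal in-memory sketch. This is not the article's code: it assumes the PDF has already been parsed into sections and paragraphs, stands in a plain Python dictionary for Neo4j AuraDB, and replaces OpenAI embeddings with simple word counts. All names here (`Chunk`, `embed`, `ingest`, `retrieve`) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str       # hypothetical id: "<section title>-<paragraph index>"
    section: str        # title of the section this chunk belongs to
    text: str           # raw paragraph text
    embedding: Counter  # toy vector; a real pipeline would store a float list


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(doc_name: str, sections: dict) -> dict:
    """Build a document -> section -> chunk structure (stand-in for graph ingestion)."""
    graph = {"document": doc_name, "sections": [], "chunks": []}
    for title, paragraphs in sections.items():
        graph["sections"].append(title)
        for i, para in enumerate(paragraphs):
            graph["chunks"].append(Chunk(f"{title}-{i}", title, para, embed(para)))
    return graph


def retrieve(graph: dict, question: str, k: int = 2) -> list:
    """Return the k chunks most similar to the question (stand-in for a vector index query)."""
    q = embed(question)
    ranked = sorted(graph["chunks"], key=lambda c: cosine(q, c.embedding), reverse=True)
    return ranked[:k]


# Assumed input: sections and paragraphs already extracted from a PDF.
parsed = {
    "Setup": ["Create a Neo4j AuraDB instance and note the connection URI."],
    "Parsing": [
        "The parser splits each PDF into sections and chunks.",
        "Tables are extracted separately.",
    ],
}
graph = ingest("example.pdf", parsed)
for chunk in retrieve(graph, "How is the PDF split into chunks?", k=1):
    print(chunk.chunk_id, "|", chunk.text)
```

In a full application, the retrieved chunk text would be passed to a text-generation model as context for answering the question; that step is omitted here because it requires an external API.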