From Text to a Knowledge Graph: The Information Extraction Pipeline

Company

Neo4j

Date Published

March 28, 2022

Author

Tomaž Bratanič

Word count

2261

Language

English

Hacker News points

None

URL

neo4j.com/blog/genai/text-to-knowledge-graph-information-extraction-pipeline

Summary

An information extraction pipeline transforms unstructured text into a knowledge graph. It does so by applying Natural Language Processing (NLP) techniques: coreference resolution, named entity recognition, and relationship extraction. The pipeline has four steps: taking text as input, applying the NLP techniques, storing the results in a graph database, and constructing the knowledge graph. The resulting graph can serve applications such as drug repurposing or the analysis of biomedical concepts. Relationship extraction models trained on datasets such as Wiki80 identify relationships between entities, and in a biomedical graph those relationships can be used to predict new uses for existing drugs. Extracting structure from text in this way widens the range of data that can be imported into a knowledge graph.
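
The flow described in the summary can be sketched in a few lines of Python. This is a toy illustration only, not the article's implementation: the gazetteer `ENTITIES`, the keyword table `RELATION_PATTERNS`, the sample sentence, and every function name are hypothetical stand-ins for trained coreference, entity recognition, and relationship extraction models. The last stage only builds a parameterized Cypher statement and does not connect to a database.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteer standing in for a trained entity recognition model.
ENTITIES = {
    "Elon Musk": "Person",
    "Tesla": "Organization",
    "SpaceX": "Organization",
}

# Hypothetical keyword patterns standing in for a trained relation classifier.
RELATION_PATTERNS = [
    (re.compile(r"\bfounded\b"), "FOUNDED"),
    (re.compile(r"\bworks at\b"), "WORKS_AT"),
]


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str


def resolve_coreferences(text):
    """Crude coreference step: replace he/she with the last person seen."""
    sentences = re.split(r"(?<=\.)\s+", text.strip())
    last_person = None
    resolved = []
    for sentence in sentences:
        if last_person is not None:
            sentence = re.sub(r"\b(?:He|She|he|she)\b", last_person, sentence)
        for name, label in ENTITIES.items():
            if label == "Person" and name in sentence:
                last_person = name
        resolved.append(sentence)
    return resolved


def find_entities(sentence):
    """Entity step: return (offset, name, label) tuples sorted by offset."""
    found = []
    for name, label in ENTITIES.items():
        for match in re.finditer(re.escape(name), sentence):
            found.append((match.start(), name, label))
    return sorted(found)


def extract_relations(sentence):
    """Relation step: check the text between each adjacent entity pair."""
    entities = find_entities(sentence)
    triples = []
    for (start, head, _), (next_start, tail, _) in zip(entities, entities[1:]):
        between = sentence[start + len(head):next_start]
        for pattern, relation in RELATION_PATTERNS:
            if pattern.search(between):
                triples.append(Triple(head, relation, tail))
    return triples


def to_cypher(triple):
    """Storage step: build a Cypher MERGE statement plus its parameters.

    Node names are passed as parameters. The relationship type cannot be a
    parameter in Cypher, so it comes only from the fixed table above.
    """
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{triple.relation}]->(t)"
    )
    return query, {"head": triple.head, "tail": triple.tail}


def run_pipeline(text):
    """Run coreference resolution, then relation extraction per sentence."""
    triples = []
    for sentence in resolve_coreferences(text):
        triples.extend(extract_relations(sentence))
    return triples


text = "Elon Musk founded SpaceX. He works at Tesla."
triples = run_pipeline(text)
statements = [to_cypher(t) for t in triples]
```

Resolving the pronoun before extracting relations is what lets the second sentence yield a triple with a named head entity. In a real pipeline each stub would be replaced by a model, and the generated statements would be sent to the graph database through a driver session.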