Company
Date Published
Author
Wolfgang Hoeck
Word count
2024
Language
English
Hacker News points
None

Summary

I built a knowledge graph from scratch without any source code, driven by my interest in oncology and a desire to learn about cancer research. I first considered a relational database but realized it was a poor fit for capturing knowledge, which requires labeling things, giving them context, and connecting the dots, and that is exactly what a graph represents. My initial idea took the form of a competitive intelligence graph showing how companies design therapeutic molecules that interact with molecular targets. That basic model was only a starting point. I wanted answers to both simple and complex questions, from looking up a molecule by name to identifying interactions between molecules and their target families. To capture knowledge, I created input screens for entering company information and potential link-outs to other data sources. I defined relationships with properties, such as "developed by company" or "has location in city." Over time the graph grew to include bioprocess, biological structure, and financial data. I extracted data on companies developing drugs from ClinicalTrials.gov and semi-automated the process of loading it into the knowledge graph. The technology stack consisted of FileMaker Pro Advanced, KNIME for ETL, and Neo4j Desktop for storing the knowledge graph. Cypher, together with APOC procedures, lets me answer these questions, and I visualize the data in Tableau or Neo4j Bloom. I demonstrate navigating the knowledge graph using the example of a molecular target that has resisted being targeted by a drug. Building a knowledge graph opens doors to capturing complex data landscapes, gaining additional insights, and exploring new opportunities.
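
To make the graph model in the summary more concrete, here is a minimal in-memory sketch in Python. It is not the Neo4j and Cypher setup described above; it only illustrates the idea of labeled nodes, typed relationships with properties, and the two kinds of questions (a simple lookup by name and a multi-hop question about target families). All class names, relationship types, properties, and data values ("Company X", "Molecule A", and so on) are hypothetical placeholders, not taken from the original graph.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str  # hypothetical labels, e.g. "Company", "Molecule", "Target"
    name: str


@dataclass
class Graph:
    # Each edge is stored as (source, relationship type, destination, properties).
    edges: list = field(default_factory=list)

    def relate(self, src, rel_type, dst, **props):
        """Connect two nodes with a typed relationship and optional properties."""
        self.edges.append((src, rel_type, dst, props))

    def neighbors(self, src, rel_type):
        """Follow outgoing relationships of one type from a node."""
        return [d for s, r, d, _ in self.edges if s == src and r == rel_type]

    def find(self, label, name):
        """Simple question: look up a node by label and name."""
        nodes = {n for s, _, d, _ in self.edges for n in (s, d)}
        return next((n for n in nodes if n.label == label and n.name == name), None)


def companies_targeting_family(graph, family_name):
    """Complex question: which companies develop molecules that
    interact with a target belonging to the given target family?"""
    result = set()
    for molecule, rel_type, target, _ in graph.edges:
        if rel_type != "INTERACTS_WITH":
            continue
        families = graph.neighbors(target, "MEMBER_OF")
        if any(f.name == family_name for f in families):
            result.update(c.name for c in graph.neighbors(molecule, "DEVELOPED_BY"))
    return sorted(result)


# Placeholder data only; none of these entities come from the real graph.
g = Graph()
company_x, company_y = Node("Company", "Company X"), Node("Company", "Company Y")
molecule_a, molecule_b = Node("Molecule", "Molecule A"), Node("Molecule", "Molecule B")
target_1, target_2 = Node("Target", "Target 1"), Node("Target", "Target 2")
family_1, family_2 = Node("TargetFamily", "Family 1"), Node("TargetFamily", "Family 2")
city = Node("City", "City Z")

g.relate(molecule_a, "DEVELOPED_BY", company_x)
g.relate(molecule_b, "DEVELOPED_BY", company_y)
g.relate(company_x, "HAS_LOCATION_IN", city, site_type="headquarters")  # assumed property
g.relate(molecule_a, "INTERACTS_WITH", target_1, mode="inhibitor")      # assumed property
g.relate(molecule_b, "INTERACTS_WITH", target_2, mode="inhibitor")      # assumed property
g.relate(target_1, "MEMBER_OF", family_1)
g.relate(target_2, "MEMBER_OF", family_2)

print(g.find("Molecule", "Molecule A"))
print(companies_targeting_family(g, "Family 1"))
```

In a graph database, the same multi-hop question would typically be written as a single pattern-matching query rather than nested loops, which is part of why a graph suits this kind of connected knowledge better than a relational schema.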