Company
Date Published
Author
Tom Nijhof
Word count
1575
Language
English
Hacker News points
None

Summary

The author implements a collaborative filtering algorithm in a graph database to predict chemical-cell interactions, specifically GI50 measurements (the concentration of a chemical that inhibits cell growth by 50%). The data come from the NCI60 dataset, which records the growth inhibition of 60 cell lines by various chemicals. The author simplifies the graph to two node types, compounds and cell lines, connected by a single relationship type, GI50. A prediction takes three steps: find chemicals similar to the target based on the cell lines they share, let those similar chemicals vote on the missing link, and remove the chemical being predicted from the results. The author tests the algorithm on the NCI60 dataset by comparing its predictions with actual GI50 measurements. While not perfect, the results show promise, particularly given that the predicted values span a smaller range than the full set of HCT-15 GI50 measurements.
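The three prediction steps can be sketched in plain Python. This is a minimal illustration, not the author's graph-database implementation. It assumes that similarity is the number of cell lines two compounds share and that the vote is a similarity-weighted average; the summary specifies neither. The function name `predict_gi50`, the `top_k` parameter, the compound names, and all GI50 values are hypothetical.

```python
from collections import defaultdict

# Toy measurements: (compound, cell_line) -> GI50 value.
# Compound names and all numbers are made up for illustration only.
MEASUREMENTS = {
    ("A", "MCF7"): -5.0, ("A", "A549"): -6.0,            # A has no HCT-15 value
    ("B", "MCF7"): -5.2, ("B", "A549"): -6.1, ("B", "HCT-15"): -5.5,
    ("C", "MCF7"): -4.0, ("C", "HCT-15"): -4.5,
    ("D", "SF-268"): -7.0, ("D", "HCT-15"): -7.5,        # shares nothing with A
}


def predict_gi50(measurements, compound, cell_line, top_k=2):
    """Predict a missing GI50 link from the top_k most similar compounds.

    Returns None when no similar compound has a measurement on cell_line.
    """
    # Index the bipartite graph as compound -> {cell_line: GI50}.
    by_compound = defaultdict(dict)
    for (comp, cell), value in measurements.items():
        by_compound[comp][cell] = value

    target_cells = set(by_compound.get(compound, {}))

    candidates = []
    for other, cells in by_compound.items():
        # Step 3: the compound being predicted never votes for itself.
        if other == compound:
            continue
        # Only compounds that were measured on the wanted cell line can vote.
        if cell_line not in cells:
            continue
        # Step 1 (assumed metric): similarity = number of shared cell lines.
        shared = len(target_cells & set(cells))
        if shared > 0:
            candidates.append((shared, other))

    if not candidates:
        return None

    # Keep the most similar compounds; ties are broken by name for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    voters = candidates[:top_k]

    # Step 2 (assumed vote): similarity-weighted average of the voters' GI50.
    total_weight = sum(weight for weight, _ in voters)
    weighted_sum = sum(weight * by_compound[name][cell_line] for weight, name in voters)
    return weighted_sum / total_weight


if __name__ == "__main__":
    print(predict_gi50(MEASUREMENTS, "A", "HCT-15"))
```

In this toy graph, compound B shares two cell lines with A and compound C shares one, so B's measurement counts twice as much as C's in the vote. Compound D is measured on HCT-15 but shares no cell line with A, so it is ignored.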