Using Neo4j Graph Data Science in Python to Improve Machine Learning Models

Company

Neo4j

Date Published

July 6, 2022

Author

Tomaž Bratanič

Word count

2929

Language

English

Hacker News points

None

URL

neo4j.com/blog/developer/using-neo4j-graph-data-science-in-python-to-improve-machine-learning-models

Summary

This post gives a simple demonstration of how graph-based features, computed with Neo4j Graph Data Science in Python, can increase the accuracy of a machine learning model in a fraud detection scenario. Working with an anonymized dataset from a P2P payment platform, the author first trains a baseline classification model on non-graph features only. He then explores graph-based features, such as PageRank centrality and community detection results, and incorporates them into the model. The resulting model is more accurate, with higher AUC scores and fewer misclassifications than the baseline. The author concludes that relationships between data points are worth exploring in any dataset, because they can yield predictive graph-based features for downstream machine learning tasks.
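
To make the idea of a graph-based feature concrete, here is a minimal, standard-library-only sketch of the two feature types the summary names. It is an illustration under stated assumptions, not the author's code and not the Neo4j Graph Data Science API: the transfer list is a hypothetical toy graph, `pagerank` is a textbook power iteration (its scores are normalized to sum to 1 and will not match the library's output), and weakly connected components stand in for community detection.

```python
from collections import defaultdict

# Hypothetical toy P2P transfers (sender, receiver); NOT the article's dataset.
TRANSFERS = [
    ("u1", "u2"), ("u2", "u3"), ("u3", "u1"), ("u4", "u1"),
    ("u5", "u6"),
]


def pagerank(edges, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration; scores sum to 1."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def weakly_connected_components(edges):
    """Map each node to a component representative using union-find."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for src, dst in edges:
        parent[find(src)] = find(dst)
    return {node: find(node) for node in list(parent)}


def graph_features(edges):
    """Build one feature row per user: PageRank score and component size."""
    rank = pagerank(edges)
    component = weakly_connected_components(edges)
    sizes = defaultdict(int)
    for root in component.values():
        sizes[root] += 1
    return {
        node: {"pagerank": rank[node], "component_size": sizes[component[node]]}
        for node in rank
    }


FEATURES = graph_features(TRANSFERS)
for user, row in sorted(FEATURES.items()):
    print(user, round(row["pagerank"], 4), row["component_size"])
```

Each row of `FEATURES` could be joined to a user's non-graph features before training a classifier, which is the general pattern the summary describes. On a real dataset these scores would typically be computed inside the graph database rather than in plain Python.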