Company
Date Published
Author
Zach Blumenfeld
Word count
1837
Language
English
Hacker News points
None

Summary

This section explores how to apply graph machine learning to predict user accounts at high risk of fraud using Neo4j and Graph Data Science. The approach has four motivations: detecting fraudulent actors proactively, measuring performance, automating the prediction of fraud-risk accounts, and improving the understanding of fraud patterns. The feature engineering strategy builds on features from previous analysis: community indicators and community size, PageRank on P2P with shared card degree, degree centrality on the shared ID rule, and other useful centrality features. These features are then exported to Python to train and evaluate an ML model. A random forest classifier serves as a starting point, though other classifiers can be explored. The analysis shows promising results, with an accuracy of around 85%, and surfaces unlabeled accounts with high predicted fraud probability; these predictions require subject matter expert review and iteration to improve predictive performance.
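
To make the feature families named above concrete, here is a minimal sketch in plain Python of what a per-user feature table might contain. It is not the Graph Data Science API and not the author's code: the toy users, the edge lists, and the function names are all hypothetical, connected components stand in for whatever community detection the analysis used, and PageRank is a simple power iteration.

```python
from collections import defaultdict

# Hypothetical toy data (not from the article).
users = ["u1", "u2", "u3", "u4", "u5", "u6"]
shared_id_edges = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]          # undirected
p2p_edges = [("u1", "u2"), ("u2", "u3"), ("u3", "u1"), ("u4", "u5")]  # directed payments


def community_sizes(nodes, edges):
    """Size of each node's connected component, via union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    counts = defaultdict(int)
    for n in nodes:
        counts[find(n)] += 1
    return {n: counts[find(n)] for n in nodes}


def degree(nodes, edges):
    """Number of undirected links touching each node."""
    deg = {n: 0 for n in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling mass is spread uniformly."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank


def build_features(nodes, id_edges, pay_edges):
    """Assemble one feature row per user."""
    sizes = community_sizes(nodes, id_edges)
    deg = degree(nodes, id_edges)
    pr = pagerank(nodes, pay_edges)
    return {
        u: {
            "community_size": sizes[u],
            "in_community": int(sizes[u] > 1),  # community indicator
            "shared_id_degree": deg[u],
            "p2p_pagerank": pr[u],
        }
        for u in nodes
    }


features = build_features(users, shared_id_edges, p2p_edges)
```

Rows like these, joined with known fraud labels, are the kind of input a random forest or any other classifier could be trained on once exported to Python.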