Company
Date Published
May 31, 2019
Author
Kevin Hu
Word count
1621
Language
English
Hacker News points
None

Summary

Plaid's API helps developers provide financial services to millions of consumers across North America. One challenge Plaid faced was reconciling pending and posted transactions, which is crucial for notifying customers about new transactions without duplication. To solve this problem, Plaid initially used decision trees and later switched to a random forest model, which combines bagging with feature sampling. However, the random forest model had a high false negative rate because the datasets were imbalanced. Plaid then moved to boosting, which reduced the false negative rate by 96%. The boosting model gives clients and consumers higher-quality transaction data, and fewer support tickets are filed about pending-to-posted transaction matching.
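
The false negative rate mentioned in the summary can be made concrete with a small sketch. It is not Plaid's code: the labels, the two models, and every number below are hypothetical, chosen only to show why accuracy hides missed matches on an imbalanced dataset and how a relative reduction in false negative rate would be computed. The summary does not say whether the 96% figure is relative or absolute; the sketch assumes a relative reduction.

```python
def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP): the share of true matches the model missed."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fn / (fn + tp) if (fn + tp) else 0.0


def accuracy(y_true, y_pred):
    """Fraction of all predictions that agree with the label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced data: 10 true pending-to-posted matches (label 1)
# among 1000 candidate pairs; the other 990 pairs are non-matches (label 0).
y_true = [1] * 10 + [0] * 990

# Hypothetical model A finds 5 of the 10 matches and misses the other 5.
pred_a = [1] * 5 + [0] * 5 + [0] * 990
# Hypothetical model B finds 9 of the 10 matches and misses 1.
pred_b = [1] * 9 + [0] * 1 + [0] * 990

fnr_a = false_negative_rate(y_true, pred_a)
fnr_b = false_negative_rate(y_true, pred_b)

# Relative reduction in FNR when moving from model A to model B
# (assumed interpretation of a "reduced by X%" claim).
relative_reduction = (fnr_a - fnr_b) / fnr_a

print(f"accuracy A: {accuracy(y_true, pred_a):.3f}")
print(f"FNR A: {fnr_a:.2f}, FNR B: {fnr_b:.2f}")
print(f"relative FNR reduction: {relative_reduction:.0%}")
```

In this toy setup, model A is 99.5% accurate while still missing half of the true matches, which is why a metric such as false negative rate is more informative than accuracy when matches are rare.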