Company
Date Published
Author
Warnessa Weaver, Tom Shen, Joshua Johnson
Word count
1323
Language
English
Hacker News points
None

Summary

We've developed a self-improving, AI-powered algorithm that adapts to an organization's unique traffic patterns to reduce false positives in Cloudflare's Data Loss Prevention (DLP) solution. The algorithm is built into the DLP Engine and uses a pretrained language model to convert the text surrounding a pattern match into a high-dimensional vector. The vector captures the meaning of the text, so sentences with similar meaning but different wording map to nearby vectors. The system then runs a nearest neighbor search over previously logged false and true positives, which lets it recognize a similar context even when the exact wording differs. As more matches are logged, the approach handles new pattern matches more reliably and reduces false positives over time. The solution is built on Cloudflare's developer platform, including Workers AI and Vectorize, which simplifies its design and lets the focus stay on the algorithm itself rather than on provisioning underlying resources.
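
The embed-then-search idea described above can be illustrated with a minimal Python sketch. It is not Cloudflare's implementation: `toy_embed` is a hypothetical stand-in that hashes words into a small vector, so it only captures word overlap rather than meaning, and the names `DIM`, `classify`, `reports`, and the label strings are all assumptions made for illustration. In the system the summary describes, the embedding would come from a pretrained language model on Workers AI and the search would run in Vectorize.

```python
import hashlib
import math
from collections import Counter

DIM = 64  # hypothetical vector size; real embedding models use far more dimensions


def toy_embed(text):
    """Stand-in for a pretrained language model: hash each word into a bucket.

    This only reflects shared words, not meaning. It exists so the
    nearest neighbor step below can run without external services.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Normalize to unit length so a dot product equals cosine similarity.
    return [x / norm for x in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def classify(context, logged, k=3):
    """Label a new match by majority vote among its k most similar logged contexts."""
    query = toy_embed(context)
    ranked = sorted(logged, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical previously logged matches, each stored as (vector, label).
reports = [
    ("sample card number in unit test fixture", "false_positive"),
    ("dummy card number for QA test data", "false_positive"),
    ("customer card number sent to external email", "true_positive"),
    ("payment card details pasted into public chat", "true_positive"),
]
logged = [(toy_embed(text), label) for text, label in reports]

print(classify("card number used in a unit test", logged))
```

The brute-force `sorted` call scans every logged vector, which is fine for a handful of entries; a vector database replaces that scan with an index so the lookup stays fast as the log of reported matches grows.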