Company
Date Published
March 19, 2024
Author
Olivier Daneau
Word count
2643
Language
English
Hacker News points
None

Summary

SnowPatrol is a Snowflake anomaly detection and cost management application powered by machine learning and Airflow. It aims to help users proactively identify abnormal usage and to simplify root-cause analysis and remediation. The solution uses an Isolation Forest model to detect anomalies in Snowflake usage, with data exploration, feature engineering, model training, and prediction each managed through a distinct Airflow workflow. Data-aware scheduling and dynamic task mapping are used to optimize resource utilization and adapt to changing requirements. The application also uses Weights & Biases for experiment and model tracking, and is designed to be scalable, flexible, and customizable. By automating anomaly detection and alerting, SnowPatrol helps users avoid overages, reduce their Snowflake costs, and improve their data engineering practices.
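
To make the Isolation Forest idea concrete, here is a minimal, standard-library-only sketch of the technique applied to a single usage metric. It is not SnowPatrol's implementation: the `daily_credits` values are made up, the `0.75` cutoff is a hypothetical threshold, and all function names are illustrative. A real pipeline would more likely rely on a library implementation such as scikit-learn's `IsolationForest`.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Recursively partition values with random split points until they are isolated."""
    lo, hi = (min(values), max(values)) if values else (0, 0)
    if depth >= max_depth or len(values) <= 1 or lo == hi:
        return {"size": len(values)}  # leaf node
    split = rng.uniform(lo, hi)
    return {
        "split": split,
        "left": build_tree([v for v in values if v < split], depth + 1, max_depth, rng),
        "right": build_tree([v for v in values if v >= split], depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Number of splits needed to reach x's leaf, adjusted for unsplit leaf size."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def anomaly_scores(values, n_trees=100, seed=0):
    """Return a score near 1 for easily isolated points and about 0.5 or lower for typical ones.

    Assumes at least two values.
    """
    rng = random.Random(seed)
    sample_size = min(256, len(values))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    return [
        2.0 ** (-(sum(path_length(v, t) for t in trees) / n_trees) / norm)
        for v in values
    ]


# Hypothetical daily credit consumption for one warehouse; index 14 is an injected spike.
daily_credits = [98, 102, 101, 97, 103, 99, 100, 104, 96, 101,
                 100, 99, 102, 98, 410, 101, 97, 103, 100, 99]

scores = anomaly_scores(daily_credits)
THRESHOLD = 0.75  # hypothetical cutoff; a real deployment would tune this
flagged_days = [day for day, score in enumerate(scores) if score > THRESHOLD]
print("Flagged day indexes:", flagged_days)
```

The design point is that anomalous values need very few random splits to be separated from the rest, so their average path length across many trees is short and their score is high. No labeled examples of "bad" usage are required, which suits cost monitoring where anomalies are rare and varied.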