Integrating whylogs into your Kafka ML Pipeline
The open-source whylogs package for Python and Java uses Apache DataSketches to monitor streaming data and detect statistical anomalies in it. It can be integrated into various data pipelines, including Kafka, MLflow, SageMaker, and Spark pipelines. Integrating whylogs with Kafka allows continuous monitoring of the entire data stream: it produces compact statistical profiles of time series data, and comparing those profiles over time helps detect data drift and distribution changes. This makes it easier to ensure data quality in real-time, event-driven machine learning platforms.
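To make the idea of per-window statistical profiles concrete, here is a minimal sketch in plain Python. It is an illustration only: it does not use the whylogs API or a Kafka client, and it keeps exact running statistics rather than the approximate DataSketches structures whylogs relies on. The names `ColumnProfile`, `profile_stream`, and `mean_shift`, and the message layout (a `timestamp` in seconds plus a `features` dict), are assumptions made for this example.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Running summary of one numeric column (hypothetical stand-in for a sketch)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean (Welford's method)
    minimum: float = math.inf
    maximum: float = -math.inf

    def track(self, value: float) -> None:
        # Update the summary in constant memory, one value at a time.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def stddev(self) -> float:
        # Sample standard deviation; 0.0 when fewer than two values were seen.
        return math.sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0


def profile_stream(messages, window_seconds=3600):
    """Group messages into fixed time windows and profile each numeric feature.

    `messages` is any iterable of dicts; in a real pipeline it would be
    fed by a Kafka consumer loop (assumed, not shown here).
    """
    profiles = defaultdict(lambda: defaultdict(ColumnProfile))
    for msg in messages:
        window = int(msg["timestamp"] // window_seconds) * window_seconds
        for name, value in msg["features"].items():
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                profiles[window][name].track(float(value))
    return profiles


def mean_shift(baseline: ColumnProfile, current: ColumnProfile) -> float:
    """Crude drift score: shift of the mean in units of the baseline stddev."""
    if baseline.stddev == 0:
        return 0.0 if current.mean == baseline.mean else math.inf
    return abs(current.mean - baseline.mean) / baseline.stddev


if __name__ == "__main__":
    stream = [
        {"timestamp": 10, "features": {"amount": 10}},
        {"timestamp": 20, "features": {"amount": 12}},
        {"timestamp": 30, "features": {"amount": 14}},
        {"timestamp": 3610, "features": {"amount": 20}},
        {"timestamp": 3620, "features": {"amount": 22}},
        {"timestamp": 3630, "features": {"amount": 24}},
    ]
    hourly = profile_stream(stream)
    print(mean_shift(hourly[0]["amount"], hourly[3600]["amount"]))
```

Each window's profile is small and fixed in size no matter how many messages arrive, which is the property that makes profiling an entire stream practical. A large score between consecutive windows is a signal worth investigating, not proof of drift.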
Company
WhyLabs
Date published
April 7, 2021
Author(s)
Chris Warth, Alessya Visnjic
Word count
1092
Hacker News points
1
Language
English