Data Logging with whylogs

What's this blog post about?

whylogs is an open-source data logging tool that helps users detect data drift, prevent ML model performance degradation, and validate the quality of their data. The v1 release brings a simpler API, new data constraints, new profile visualizations, faster performance, and a usability refresh. With whylogs, users generate statistical summaries (called whylogs profiles) from data as it flows through their data pipelines and into their machine learning models. These profiles let users track changes in their data over time and detect data drift or data quality problems. The tool supports both tabular and complex data and runs natively in Python and JVM environments. It also supports batch processing (e.g., Apache Spark) and streaming (e.g., Apache Kafka). whylogs v1 is built for scale and optimized for massive datasets: it generates profiles for large datasets more than 500x faster than the previous version.
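To make the idea of a profile concrete, here is a minimal sketch in plain Python of what a per-column statistical summary and a simple drift check could look like. It is not the whylogs API and does not reflect how whylogs computes its profiles. The names `ColumnProfile`, `profile_rows`, and `drifted_columns`, the choice of statistics, and the threshold of 3 standard deviations are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Hypothetical running summary of one numeric column (not a whylogs class)."""
    count: int = 0
    nulls: int = 0
    total: float = 0.0
    total_sq: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def track(self, value):
        # Update the summary with a single value; None counts as a null.
        self.count += 1
        if value is None:
            self.nulls += 1
            return
        self.total += value
        self.total_sq += value * value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self):
        n = self.count - self.nulls
        return self.total / n if n else float("nan")

    @property
    def stddev(self):
        # Sample standard deviation from the running sums.
        n = self.count - self.nulls
        if n < 2:
            return 0.0
        variance = (self.total_sq - self.total ** 2 / n) / (n - 1)
        return math.sqrt(max(variance, 0.0))


def profile_rows(rows):
    """Build one ColumnProfile per column from an iterable of dict rows."""
    profiles = {}
    for row in rows:
        for name, value in row.items():
            profiles.setdefault(name, ColumnProfile()).track(value)
    return profiles


def drifted_columns(reference, current, threshold=3.0):
    """Flag columns whose mean moved more than `threshold` reference stddevs."""
    flagged = []
    for name, ref in reference.items():
        cur = current.get(name)
        if cur is None:
            continue
        scale = ref.stddev or 1.0  # avoid dividing by zero for constant columns
        if abs(cur.mean - ref.mean) / scale > threshold:
            flagged.append(name)
    return flagged


# Made-up example data: a baseline batch and a later batch.
baseline = [
    {"age": 30, "income": 50.0},
    {"age": 40, "income": 60.0},
    {"age": 50, "income": None},
]
today = [
    {"age": 31, "income": 500.0},
    {"age": 41, "income": 600.0},
]

reference_profile = profile_rows(baseline)
current_profile = profile_rows(today)
print(drifted_columns(reference_profile, current_profile))
```

The point of the sketch is that only the small summaries need to be stored and compared, not the raw rows, which is what makes profile-based monitoring practical as data flows through a pipeline.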

Company
WhyLabs

Date published
May 31, 2022

Author(s)
WhyLabs Admin

Word count
1659

Language
English

Hacker News points
1


By Matt Makai. 2021-2024.