Visually Inspecting Data Profiles for Data Distribution Shifts
The article discusses the importance of monitoring data distribution shifts in machine learning systems. Because real-world data changes constantly, shifts in the distribution of a model's inputs or outputs can silently degrade its performance over time, a phenomenon known as model decay. Using UCI's Wine Quality Dataset, the tutorial demonstrates how to inspect and detect such shifts with whylogs, an open-source library for ML monitoring, and covers several approaches: comparing distribution metrics, applying statistical tests, and visually inspecting histograms.
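As a rough illustration of the workflow the summary describes, the sketch below profiles two slices of the Wine Quality data with whylogs and compares them visually. It assumes whylogs v1's API (`why.log`, `NotebookProfileVisualizer`); the alcohol-based split used to simulate drift is an illustrative assumption, not the article's exact recipe.

```python
# Minimal sketch, assuming whylogs v1. The split on "alcohol" below is an
# arbitrary way to simulate drift for illustration; it is not taken from
# the article itself.
import pandas as pd
import whylogs as why
from whylogs.viz import NotebookProfileVisualizer

# UCI Wine Quality Dataset (red wine); semicolon-separated CSV.
url = (
    "https://archive.ics.uci.edu/ml/machine-learning-databases/"
    "wine-quality/winequality-red.csv"
)
df = pd.read_csv(url, sep=";")

# Simulate a distribution shift: treat one slice as the "reference"
# (training-time) data and the other as drifted "target" (production) data.
reference_df = df[df["alcohol"] <= 10.5]
target_df = df[df["alcohol"] > 10.5]

# Profile both frames; a whylogs profile captures per-column distribution
# metrics (counts, quantiles, histograms) without storing raw rows.
reference_view = why.log(reference_df).view()
target_view = why.log(target_df).view()

# Compare the profiles visually in a notebook: a per-feature drift summary,
# plus an overlaid histogram for a single feature of interest.
viz = NotebookProfileVisualizer()
viz.set_profiles(
    target_profile_view=target_view,
    reference_profile_view=reference_view,
)
viz.summary_drift_report()               # per-feature drift report
viz.double_histogram(feature_name="pH")  # overlay reference vs. target histograms
```

Run in a Jupyter notebook, the drift report flags columns whose reference and target distributions diverge, and the double histogram makes the shift visible for a single feature, the visual-inspection approach the article highlights.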
Company
WhyLabs
Date published
June 28, 2022
Author(s)
Felipe Adachi
Word count
1476
Language
English
Hacker News points
None found.