Visually Inspecting Data Profiles for Data Distribution Shifts

Company

WhyLabs

Date Published

June 28, 2022

Author

Felipe Adachi

Word count

1476

Language

English

Hacker News points

None

URL

whylabs.ai/blog/posts/visually-inspecting-data-profiles-for-data-distribution-shifts

Summary

The post explains why the data behind machine learning models should be monitored for distribution shifts. Real-world data changes constantly, and if those changes go unaddressed, the model decays. Distribution shifts, whether in a model's input data or in its output data, can degrade its performance over time. The tutorial uses UCI's Wine Quality Dataset to demonstrate how to inspect and detect distribution shifts with whylogs, an open-source tool for ML monitoring. It also covers methods for tackling data distribution shifts, including comparing distribution metrics, applying statistical tests, and visually inspecting histograms.
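
The methods named above (comparing distribution metrics, applying a statistical test, and inspecting histograms) can be illustrated with a minimal sketch. It deliberately does not use the whylogs API or the Wine Quality data: the samples are synthetic, the two-sample Kolmogorov-Smirnov statistic is a swapped-in example of a statistical test chosen here for illustration, and the helper names `ks_statistic` and `shared_histogram` are hypothetical. Only the Python standard library is assumed.

```python
import bisect
import random
import statistics


def ks_statistic(reference, target):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the two empirical CDFs."""
    ref_sorted = sorted(reference)
    tgt_sorted = sorted(target)
    max_gap = 0.0
    # Empirical CDFs only change at observed values, so checking those points is enough.
    for x in ref_sorted + tgt_sorted:
        ref_cdf = bisect.bisect_right(ref_sorted, x) / len(ref_sorted)
        tgt_cdf = bisect.bisect_right(tgt_sorted, x) / len(tgt_sorted)
        max_gap = max(max_gap, abs(ref_cdf - tgt_cdf))
    return max_gap


def shared_histogram(reference, target, bins=10):
    """Count both samples over the same bin edges so the two histograms are comparable."""
    lo = min(min(reference), min(target))
    hi = max(max(reference), max(target))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if every value is identical

    def counts(sample):
        out = [0] * bins
        for x in sample:
            # Clamp the maximum value into the last bin.
            out[min(int((x - lo) / width), bins - 1)] += 1
        return out

    return counts(reference), counts(target)


# Synthetic data (an assumption for illustration, not taken from the tutorial):
# a reference sample, a fresh sample from the same distribution, and a shifted sample.
rng = random.Random(0)
reference = [rng.gauss(10.5, 1.0) for _ in range(1000)]
same = [rng.gauss(10.5, 1.0) for _ in range(1000)]
shifted = [rng.gauss(11.5, 1.0) for _ in range(1000)]

# 1. Compare distribution metrics.
print("mean:", statistics.mean(reference), statistics.mean(shifted))

# 2. Apply a statistical test (here, only the KS statistic; no p-value is computed).
print("KS, no shift:", ks_statistic(reference, same))
print("KS, shifted: ", ks_statistic(reference, shifted))

# 3. Inspect histograms built on shared bin edges, printed as crude text bars.
ref_counts, shifted_counts = shared_histogram(reference, shifted)
for r, s in zip(ref_counts, shifted_counts):
    print(f"{'#' * (r // 10):<40}| {'#' * (s // 10)}")
```

A larger KS statistic means the two empirical distributions differ more; turning it into a drift decision requires a p-value or a threshold, which this sketch leaves out.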