Company
Date Published
Author
Jonathan Gomes Selman
Word count
529
Language
English
Hacker News points
None

Summary

The text discusses why it is important to know when a machine learning model is failing, or is likely to fail, in production. It argues that quantifying how hard a given data sample is for a model to learn helps surface problems in the data that lead to poor results if ignored. It groups the reasons some data is difficult for a model into two main categories: data errors and data limitations. The data error potential (DEP) score is introduced as a way to sort the data so that the samples that are most difficult, and most worthwhile to explore, surface first when digging into model failures. The text describes this score as an 'X-ray view' of the data and claims it lets data scientists identify and fix issues with their models 10x faster than before. It concludes by mentioning the tools available for exploring the DEP score, including Galileo's demo.
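
The idea of scoring each sample by difficulty and sorting so the hardest ones surface first can be illustrated with a minimal sketch. The summary does not give the DEP formula, so the code below does not compute DEP. It uses a simple margin-based proxy as a stand-in, and the names `margin_difficulty` and `rank_hardest`, along with the toy data, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# One sample: (text, predicted class probabilities, labeled class index).
Sample = Tuple[str, Sequence[float], int]


def margin_difficulty(probs: Sequence[float], label: int) -> float:
    """Hypothetical difficulty proxy in [0, 1]; higher means harder.

    This is NOT the DEP formula (the source does not state it). It is an
    assumed stand-in: the margin between the probability of the labeled
    class and the best competing class, rescaled from [-1, 1] to [0, 1].
    """
    true_p = probs[label]
    other_p = max(p for i, p in enumerate(probs) if i != label)
    margin = true_p - other_p
    return (1.0 - margin) / 2.0


def rank_hardest(samples: List[Sample], top_k: int) -> List[Tuple[str, float]]:
    """Return the top_k samples with the highest difficulty score, hardest first."""
    scored = [(text, margin_difficulty(probs, label)) for text, probs, label in samples]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (made up): binary sentiment, class 1 = positive.
samples: List[Sample] = [
    ("great movie", [0.05, 0.95], 1),         # model agrees with the label
    ("not bad at all", [0.55, 0.45], 1),      # model is unsure
    ("terrible, loved it", [0.90, 0.10], 1),  # model disagrees: possible label error
]

hardest = rank_hardest(samples, top_k=2)
for text, score in hardest:
    print(f"{score:.2f}  {text}")
```

Samples where the model confidently disagrees with the label rise to the top of the ranking, which is the kind of data worth inspecting first for labeling errors or for cases the data does not cover well.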