
Don’t Let Your Data Fail You; Continuous Data Validation with whylogs and Github Actions

What's this blog post about?

This article explains why data quality matters in machine learning (ML) pipelines and how whylogs can help enforce it. It introduces constraints: rules that assert that data lies within an expected range. Constraints are applied to the features of a dataset, and the mapping is many-to-many: one feature can have multiple constraints, and one constraint can be applied to multiple features. The article then demonstrates how whylogs can be integrated into Continuous Integration (CI) pipelines using GitHub Actions, giving an overview of GitHub Actions and showing how they can be used to test data as part of a CI pipeline. Finally, it discusses future features being considered for whylogs and highlights the importance of data validation in real-world scenarios.
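
To make the constraint idea concrete, here is a minimal sketch in plain Python. It does not use the whylogs API; the names Constraint, between, validate, and exit_code, and the sample features age and income, are hypothetical and chosen only for illustration. It shows one feature carrying several constraints and one constraint shared across several features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Constraint:
    """A named rule that a single value must satisfy (hypothetical helper)."""
    name: str
    check: Callable[[float], bool]


def between(low: float, high: float) -> Constraint:
    """Build a constraint asserting low <= value <= high."""
    return Constraint(f"between_{low}_and_{high}", lambda v: low <= v <= high)


def validate(rows: List[Dict[str, float]],
             constraints_by_feature: Dict[str, List[Constraint]]) -> List[str]:
    """Return one message per (feature, constraint) pair with failing values."""
    failures = []
    for feature, constraints in constraints_by_feature.items():
        for constraint in constraints:
            bad = sum(1 for row in rows if not constraint.check(row[feature]))
            if bad:
                failures.append(
                    f"{feature}: {constraint.name} failed for {bad} value(s)"
                )
    return failures


def exit_code(failures: List[str]) -> int:
    """Map validation results to a process exit code for a CI step."""
    return 1 if failures else 0


# One constraint reused by two features; "age" also has a second constraint.
non_negative = Constraint("non_negative", lambda v: v >= 0)
rules = {
    "age": [non_negative, between(0, 120)],
    "income": [non_negative],
}

# Made-up sample rows; the second row violates both rules on purpose.
rows = [
    {"age": 34, "income": 52000.0},
    {"age": 150, "income": -10.0},
]

failures = validate(rows, rules)
for message in failures:
    print(message)
```

In a CI job, a script like this could end with sys.exit(exit_code(failures)), so that a non-zero exit code fails the step whenever any constraint is violated. That is the general mechanism by which a CI pipeline can gate on data checks, not necessarily the exact setup used in the article.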

Company
WhyLabs

Date published
July 20, 2021

Author(s)
WhyLabs Team

Word count
2486

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.