In this blog post, Michael Kaminsky discusses a common pitfall in data science when using cloud data warehouses for machine learning tasks. The issue arises with untracked slowly changing dimensions, which can lead to poor real-world predictive accuracy of models. An example is provided where the goal is to predict customer churn in a subscription business. Two main pitfalls are identified: incorrect setup and the problem of slowly changing dimensions. To address these issues, Kaminsky suggests using a log format for data storage that tracks all changes made to different values in the database. Fivetran's "history mode" feature is mentioned as a tool to automatically record and capture changes to slowly changing dimensions, allowing for better model generalization from training to production.