
Shift Left: Bad Data in Event Streams, Part 1

What's this blog post about?

Bad data is any data that does not conform to what is expected, such as malformed or corrupted records. In event streams, bad data can cause serious issues and outages for every downstream data consumer. The main strategies for mitigating and fixing bad data in streams are prevention, event design, and rewind, rebuild, and retry. Prevention is the most effective: schemas, testing, and validation rules stop bad data from entering the system in the first place. Event design lets producers issue corrections that overwrite earlier bad data, while rewind, rebuild, and retry remains the fallback when all else fails. Bad data can creep into data sets in many ways, but good data practices, prevention above all, are what make it manageable.
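The prevention strategy the summary describes can be illustrated with a small sketch. The schema, field names, and `publish` gate below are hypothetical, not taken from the Confluent post; real deployments would typically use a schema registry and a serialization format such as Avro or Protobuf rather than hand-rolled checks.

```python
# Hypothetical schema: field name -> required Python type.
ORDER_SCHEMA = {"order_id": str, "amount_cents": int, "currency": str}

def validate_event(event: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    for field, expected_type in schema.items():
        if field not in event:
            problems.append("missing field: " + field)
        elif not isinstance(event[field], expected_type):
            problems.append("wrong type for " + field)
    return problems

def publish(event: dict) -> bool:
    """Gate publishing on validation so bad data never enters the stream."""
    problems = validate_event(event, ORDER_SCHEMA)
    if problems:
        # In practice the event might go to a dead-letter queue instead.
        print("rejected:", problems)
        return False
    # ... hand the validated event to the real producer here ...
    return True
```

A conforming event passes the gate; an event with a missing field or a string where an integer is expected is rejected before it can reach downstream consumers, which is the essence of the "prevention" strategy.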

Company
Confluent

Date published
Oct. 4, 2024

Author(s)
-

Word count
4397

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.