Using logs to build a solid data infrastructure (or: why dual writes are a bad idea)
Logs are a fundamental concept in computer science: an append-only, totally ordered sequence of records that is read sequentially. This talk walks through four places where logs appear in practice: storage engines (the write-ahead log used by B-trees), database replication, distributed consensus (the Raft algorithm), and Apache Kafka as a message broker. It then shows how to use logs for data integration. Instead of having the application write to each datastore directly (dual writes), every write is appended to a log, and each datastore consumes that log in sequential order. Because every consumer applies the same writes in the same order, this approach avoids the race conditions and partial failures that can leave dual-written systems inconsistent with one another.
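The data-integration idea can be sketched in a few lines of Python. This is a minimal in-memory illustration, not Kafka's API: the names `Log`, `Store`, and `catch_up` are hypothetical and chosen only for this example. Each store remembers the offset of the next record it needs, so it can fall behind or restart and still apply every write in log order.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Log:
    """Hypothetical append-only, totally ordered sequence of write records."""
    records: list = field(default_factory=list)

    def append(self, key: str, value: Any) -> int:
        self.records.append((key, value))
        return len(self.records) - 1  # offset of the record just appended

    def read_from(self, offset: int) -> list:
        return self.records[offset:]


@dataclass
class Store:
    """Hypothetical downstream datastore that applies records in offset order."""
    name: str
    data: dict = field(default_factory=dict)
    next_offset: int = 0  # position in the log this store has consumed up to

    def catch_up(self, log: Log) -> None:
        for key, value in log.read_from(self.next_offset):
            self.data[key] = value
            self.next_offset += 1


log = Log()
database = Store("database")
cache = Store("cache")

# Two clients write the same key; the log decides the order once, for everyone.
log.append("x", "A")
log.append("x", "B")

database.catch_up(log)   # the database consumes promptly
log.append("y", "C")
cache.catch_up(log)      # the cache was lagging, but sees the same order
database.catch_up(log)

print(database.data == cache.data)
```

With dual writes, the two clients' updates to `x` could reach the database and the cache in different orders. Here both stores read the same sequence, so they converge on the same state once each has caught up.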
Company
Confluent
Date published
May 29, 2015
Author(s)
Lucia Cerchie, Martin Kleppmann, Josep Prat
Word count
6354
Language
English
Hacker News points
4