Databases Demystified Chapter 6 – Distributed Databases Part 1

Company

Fivetran

Date Published

Sept. 3, 2020

Author

Michael Kaminsky

Word count

1385

Language

English

Hacker News points

None

URL

www.fivetran.com/blog/databases-demystified-distributed-databases-part-1

Summary

Distributed and single-node databases differ in architecture, functionality, and use cases. A single-node database runs on one computer, while a distributed database stores data across multiple computers. Examples of distributed databases include Google Spanner, Azure Cosmos DB, Redshift, Snowflake, and BigQuery; single-node databases include PostgreSQL, MySQL, and SQLite. Distributed databases were developed to meet three needs: storing volumes of data too large for one machine, speeding up queries by using the computational power of many computers simultaneously, and staying resilient to hardware or network failures. Buying a bigger and better single computer works only up to a point, because a single machine is limited in cost, size, and fault tolerance. A distributed database is organized as a cluster made up of nodes (individual computers). There are two main paradigms: big compute and high availability. Big-compute databases split, or shard, data across nodes so that queries can be processed faster, while high-availability databases duplicate data on each node to ensure fault tolerance.
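
The two paradigms can be illustrated with a small in-memory sketch. This is not how any of the databases named above are implemented; the class names, the `shard_for` helper, and the hash-modulo placement rule are all assumptions made for illustration. Each "node" is just a Python dictionary.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Pick a node index for a key using a stable hash (hypothetical placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class ShardedCluster:
    """Big-compute style: each key lives on exactly one node."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key: str, value) -> None:
        self.nodes[shard_for(key, len(self.nodes))][key] = value

    def get(self, key: str):
        return self.nodes[shard_for(key, len(self.nodes))].get(key)


class ReplicatedCluster:
    """High-availability style: every live node holds a full copy of the data."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes = [dict() for _ in range(num_nodes)]
        self.alive = [True] * num_nodes

    def put(self, key: str, value) -> None:
        # Write the value to every node that is currently up.
        for node, up in zip(self.nodes, self.alive):
            if up:
                node[key] = value

    def fail(self, index: int) -> None:
        # Simulate a hardware or network failure of one node.
        self.alive[index] = False

    def get(self, key: str):
        # Read from the first node that is still up.
        for node, up in zip(self.nodes, self.alive):
            if up:
                return node.get(key)
        raise RuntimeError("no live nodes")
```

In the sharded cluster the total data is divided among the nodes, so each node stores and scans only a fraction of it. In the replicated cluster a read still succeeds after `fail(0)`, because the remaining nodes hold the same data.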
