ClickHouse and The One Trillion Row Challenge

What's this blog post about?

The post describes how ClickHouse was used to complete a challenge set by Gunnar Morling of Decodable, which asked participants to write a Java program that computes the minimum, average, and maximum temperature for each city from a text file containing 1 billion measurements. With ClickHouse, the response time was around 19 seconds on the exact hardware profile stipulated by the rules. Dask later extended the challenge to 1 trillion rows, prompting the authors to query this larger dataset as well. They did so in under 3 minutes for $0.56, using AWS spot instances and a ClickHouse cluster. The post also details how they tuned their settings and queries to achieve these results.
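
To make the task concrete, here is a minimal Python sketch of the aggregation the challenge asks for. It is not the authors' ClickHouse approach or the challenge's Java reference code. The function name `aggregate`, the sample data, and the assumed `city;temperature` line format are illustrative only.

```python
def aggregate(lines):
    """Return {city: (min, mean, max)} from 'city;temperature' lines.

    The 'city;temperature' layout is an assumption made for this sketch.
    """
    stats = {}  # city -> [min, running total, count, max]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        city, _, value = line.partition(";")
        temp = float(value)
        if city in stats:
            entry = stats[city]
            entry[0] = min(entry[0], temp)
            entry[1] += temp
            entry[2] += 1
            entry[3] = max(entry[3], temp)
        else:
            stats[city] = [temp, temp, 1, temp]
    # Turn the running totals into means once all lines are consumed.
    return {
        city: (low, total / count, high)
        for city, (low, total, count, high) in stats.items()
    }


# Hypothetical sample data, not taken from the challenge file.
sample = ["Hamburg;12.0", "Lisbon;20.5", "Hamburg;8.0", "Lisbon;19.5"]
result = aggregate(sample)
```

Keeping only a minimum, running total, count, and maximum per city means the input can be streamed once without holding every measurement in memory, which is the property that matters as the file grows from a billion rows to a trillion.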

Company
ClickHouse

Date published
March 5, 2024

Author(s)
Dale McDiarmid

Word count
3767

Hacker News points
14

Language
English


By Matt Makai. 2021-2024.