ClickHouse and The One Trillion Row Challenge
This post describes how ClickHouse was used to complete a challenge set by Gunnar Morling of Decodable, which asked participants to write a Java program that computes the minimum, average, and maximum temperature for each city from a text file of 1 billion measurements. On the exact hardware profile stipulated by the rules, ClickHouse answered the query in around 19 seconds. Dask later extended the challenge to 1 trillion rows, prompting the authors to query this larger dataset as well. With a ClickHouse cluster running on AWS spot instances, they completed the query in under 3 minutes at a cost of $0.56. The post also details the settings and query optimizations behind these results.
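To make the task concrete, here is a minimal Python sketch of the aggregation the challenge asks for: one pass over the measurements, tracking the minimum, maximum, sum, and count per city. It is illustrative only and is not the ClickHouse approach from the post. The `city;temperature` line format, the `aggregate` function name, and the sample values are assumptions made for this example.

```python
def aggregate(lines):
    """Return {city: (min, mean, max)} from lines of the assumed form 'city;temperature'."""
    stats = {}  # city -> [min, max, sum, count]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        city, _, value = line.partition(";")
        temp = float(value)
        entry = stats.get(city)
        if entry is None:
            # First measurement for this city initialises all four fields.
            stats[city] = [temp, temp, temp, 1]
        else:
            if temp < entry[0]:
                entry[0] = temp
            if temp > entry[1]:
                entry[1] = temp
            entry[2] += temp
            entry[3] += 1
    # Emit cities in alphabetical order, with the mean derived from sum / count.
    return {
        city: (e[0], e[2] / e[3], e[1])
        for city, e in sorted(stats.items())
    }


# Hypothetical sample data, not taken from the challenge file.
sample = [
    "Hamburg;12.0",
    "Bulawayo;8.9",
    "Hamburg;34.2",
    "Palembang;38.8",
    "Bulawayo;-1.1",
]
result = aggregate(sample)
for city, (lo, mean, hi) in result.items():
    print(f"{city}: min={lo:.1f} mean={mean:.1f} max={hi:.1f}")
```

Keeping only four numbers per city means memory grows with the number of distinct cities rather than the number of rows, which is what makes the same computation feasible at a billion or a trillion rows once the scan itself is parallelised.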
Company: ClickHouse
Date published: March 5, 2024
Author(s): Dale McDiarmid
Word count: 3767
Language: English
Hacker News points: 14