How we built a new powerful JSON data type for ClickHouse
The article describes the implementation of a new JSON data type for ClickHouse, an analytical database known for its speed and efficiency. The new type is designed to meet several requirements for handling JSON data at scale: true column-oriented storage, support for dynamically changing data without type unification, prevention of an avalanche of column data files on disk, and dense storage. The Variant and Dynamic types were introduced first, as building blocks for the new JSON data type. The Variant type efficiently stores values of different data types within the same table column without unifying them into a least common type. The Dynamic type stores values of any data type in a single table column without requiring all types to be known and specified in advance, and it supports limiting the number of types that are stored as separate column data files. The new JSON data type stores JSON objects of any structure and lets every JSON value be read by using its JSON path as a subcolumn. It also supports reading nested JSON objects as subcolumns of type JSON using the special syntax JSON_column.^some.path. The article concludes by noting that the new JSON data type is currently released as experimental for testing purposes and that its feature set will be expanded in future updates.
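The idea of reading each JSON path as its own subcolumn can be illustrated with a small toy model in Python. This is only a sketch of the concept, not ClickHouse's implementation: the names `ToyJSONColumn`, `flatten`, `subcolumn`, and `nested` are hypothetical, and the in-memory dictionaries stand in for on-disk column files. Values keep their original types (no unification), and `nested` loosely mimics what the JSON_column.^some.path syntax returns.

```python
import json
from collections import defaultdict


def flatten(obj, prefix=""):
    """Yield (dotted_path, leaf_value) pairs for a parsed JSON object."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


class ToyJSONColumn:
    """Hypothetical toy model: one sparse 'subcolumn' per JSON path."""

    def __init__(self):
        self.num_rows = 0
        # path -> {row_index: value}; rows lacking the path are simply absent
        self.paths = defaultdict(dict)

    def insert(self, text):
        """Parse one JSON object and append it as a new row."""
        for path, value in flatten(json.loads(text)):
            self.paths[path][self.num_rows] = value
        self.num_rows += 1

    def subcolumn(self, path):
        """Read a single path across all rows; None where the path is missing."""
        values = self.paths.get(path, {})
        return [values.get(i) for i in range(self.num_rows)]

    def nested(self, prefix):
        """Collect all paths under `prefix`, keyed by their relative path."""
        rows = [{} for _ in range(self.num_rows)]
        for path, values in self.paths.items():
            if path.startswith(prefix + "."):
                relative = path[len(prefix) + 1:]
                for i, value in values.items():
                    rows[i][relative] = value
        return rows


col = ToyJSONColumn()
col.insert('{"a": {"b": 42, "c": "x"}, "d": 1}')
col.insert('{"a": {"b": "str"}, "d": [1, 2]}')

print(col.subcolumn("a.b"))  # [42, 'str'] : an integer and a string side by side
print(col.nested("a"))       # [{'b': 42, 'c': 'x'}, {'b': 'str'}]
```

In this sketch a query for `a.b` touches only the data stored for that path, which is the property the summary calls true column-oriented storage.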
Company
ClickHouse
Date published
Oct. 22, 2024
Author(s)
Pavel Kruglov
Word count
3934
Language
English
Hacker News points
382