Company
Date Published
Feb. 15, 2012
Author
Jonathan Ellis
Word count
1064
Language
English
Hacker News points
None

Summary

Cassandra, a distributed database, initially followed the "schemaless" data model described in Google's Bigtable paper. As deployments grew and matured, however, the lack of a schema became a pain point. Starting with version 0.7, Cassandra allowed users to define data types for their columns, making it "schema-optional." Because Cassandra's storage engine is sparse, new columns can be added easily, without reallocating space row by row. Users therefore keep that flexibility while also gaining the benefits of a defined schema. In upcoming releases, CQL will support defining column families with compound primary keys, which can be useful for denormalizing data and speeding up queries.
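
To make the "sparse" and "schema-optional" ideas concrete, here is a minimal Python sketch. It is a toy model only, not Cassandra's storage engine or API: the `SparseTable` class, its methods, and the sample rows are all hypothetical names invented for illustration. Each row stores only the columns actually written to it, so declaring a new column changes metadata without touching existing rows, and only columns with a declared type are validated.

```python
from typing import Any, Dict, Optional


class SparseTable:
    """Toy model of a sparse, schema-optional column family (hypothetical)."""

    def __init__(self, schema: Optional[Dict[str, type]] = None) -> None:
        # Optional mapping of column name -> expected Python type.
        self.schema: Dict[str, type] = dict(schema or {})
        # Each row holds only the columns that were actually written.
        self.rows: Dict[str, Dict[str, Any]] = {}

    def add_column(self, name: str, col_type: type) -> None:
        # Declaring a column only updates metadata; existing rows are untouched.
        self.schema[name] = col_type

    def insert(self, key: str, **columns: Any) -> None:
        # Validate declared columns first; undeclared columns pass through.
        for name, value in columns.items():
            expected = self.schema.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError(f"column {name!r} expects {expected.__name__}")
        self.rows.setdefault(key, {}).update(columns)


users = SparseTable(schema={"name": str})
users.insert("jbellis", name="Jonathan")

# Adding a column later does not rewrite the row stored above.
users.add_column("age", int)
users.insert("alice", name="Alice", age=30)
```

In this sketch the row for `jbellis` never gains an `age` entry, which mirrors the point that a sparse layout pays nothing for columns a row does not use.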