Date Published
Sept. 21, 2021
Veronica Zhai


A data catalog is an inventory of all of an organization's data assets, bundled with the tools needed to maintain it. It not only enumerates the data but also describes it: database and table names, column names, column descriptions, and access rights. Unlike a database schema, which applies to data from a single source, a data catalog describes all the data in an organization across every data source and repository. A data catalog can automatically discover data sets, build data dictionaries, maintain relationships between data sets, track data lineage, and give analysts a visual interface with search capabilities. It makes analytics easier for non-experts by helping them see what data is available, where it comes from, and what other data it is related to.
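To make the description above concrete, here is a minimal sketch of what a catalog might store and how search and lineage lookups could work over it. Everything in it is an illustrative assumption rather than the API of any real catalog product: the `Column`, `TableEntry`, and `Catalog` names, the `source.database.table` naming scheme, and the sample tables are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    description: str = ""


@dataclass
class TableEntry:
    """One catalogued table. All field names here are hypothetical."""
    source: str      # which system the table lives in
    database: str
    table: str
    columns: list[Column] = field(default_factory=list)
    allowed_roles: set[str] = field(default_factory=set)  # access rights
    upstream: list[str] = field(default_factory=list)     # lineage: tables this one is derived from

    @property
    def qualified_name(self) -> str:
        return f"{self.source}.{self.database}.{self.table}"


class Catalog:
    """In-memory inventory spanning several data sources."""

    def __init__(self) -> None:
        self._entries: dict[str, TableEntry] = {}

    def register(self, entry: TableEntry) -> None:
        self._entries[entry.qualified_name] = entry

    def search(self, term: str) -> list[str]:
        """Case-insensitive match on table names, column names, and descriptions."""
        term = term.lower()
        hits = []
        for name, entry in self._entries.items():
            texts = [entry.table]
            texts += [c.name for c in entry.columns]
            texts += [c.description for c in entry.columns]
            if any(term in text.lower() for text in texts):
                hits.append(name)
        return sorted(hits)

    def lineage(self, qualified_name: str) -> list[str]:
        """Return every upstream table, nearest first (breadth-first walk)."""
        seen: list[str] = []
        queue = list(self._entries[qualified_name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


# Hypothetical sample data: one operational table and one derived table.
catalog = Catalog()
catalog.register(TableEntry(
    source="postgres", database="shop", table="orders",
    columns=[Column("order_id", "Primary key"),
             Column("customer_email", "Email address of the buyer")],
    allowed_roles={"analyst"},
))
catalog.register(TableEntry(
    source="warehouse", database="analytics", table="daily_revenue",
    columns=[Column("day"), Column("revenue", "Sum of order totals")],
    allowed_roles={"analyst", "finance"},
    upstream=["postgres.shop.orders"],
))

print(catalog.search("email"))
print(catalog.lineage("warehouse.analytics.daily_revenue"))
```

The point of the sketch is that a single index covers tables from different sources, which is what separates a catalog from a per-database schema. A real catalog would populate these entries through automated discovery rather than manual registration.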