Company
Date Published
April 24, 2023
Author
Lipika Ramaswamy
Word count
1796
Language
English
Hacker News points
None

Summary

Gretel Tabular DP is a new model that generates high-quality tabular synthetic data with provable mathematical guarantees of privacy. It is a differentially private, graph-based generative model that creates synthetic versions of sensitive data. The model works well on datasets with primarily categorical variables, relatively low cardinality (<100 unique categories per variable), and fewer than 100 variables. It follows the select-measure-generate paradigm developed by McKenna et al., which has three steps: select a subset of correlated pairs of variables using a differentially private algorithm, measure the distributions of the selected pairs with differential privacy, and estimate a probabilistic graphical model that captures the relationships described by the noisy marginals, from which synthetic records are then generated.
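
To make the select-measure-generate flow concrete, here is a toy sketch in plain Python. It is not Gretel's implementation or API: every function name (`select_pair`, `measure`, `generate`), the example columns, and the epsilon values are hypothetical. It also simplifies heavily. It selects a single pair rather than a set of pairs, assumes a score sensitivity of 1 without proving it, and samples directly from one noisy pair marginal instead of fitting a probabilistic graphical model.

```python
import itertools
import math
import random
from collections import Counter


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace-distributed with this scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def marginal(rows, cols):
    # Exact contingency counts over the given columns.
    return Counter(tuple(r[c] for c in cols) for r in rows)


def select_pair(rows, domains, epsilon, rng):
    # SELECT: exponential mechanism over column pairs.
    # Score = L1 gap between the joint counts and an independence estimate.
    # Assumption: sensitivity 1 is used below, but the independence estimate is
    # built from exact 1-way counts, so this toy bound is not rigorous.
    n = len(rows)
    pairs = list(itertools.combinations(domains, 2))
    scores = []
    for a, b in pairs:
        joint = marginal(rows, (a, b))
        ca, cb = marginal(rows, (a,)), marginal(rows, (b,))
        scores.append(sum(abs(joint[(x, y)] - ca[(x,)] * cb[(y,)] / n)
                          for x in domains[a] for y in domains[b]))
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(epsilon * (s - top) / 2.0) for s in scores]
    return rng.choices(pairs, weights=weights, k=1)[0]


def measure(rows, pair, domains, epsilon, rng):
    # MEASURE: Laplace noise on every cell of the 2-way marginal
    # (adding or removing one record changes one cell by 1).
    joint = marginal(rows, pair)
    cells = itertools.product(domains[pair[0]], domains[pair[1]])
    return {cell: joint[cell] + laplace(1.0 / epsilon, rng) for cell in cells}


def generate(noisy, n, rng):
    # GENERATE (stand-in): clip negative counts and sample cells in proportion.
    # A real select-measure-generate model fits a graphical model to many marginals.
    cells = list(noisy)
    weights = [max(v, 0.0) for v in noisy.values()]
    if sum(weights) <= 0:
        weights = [1.0] * len(cells)
    return rng.choices(cells, weights=weights, k=n)


# Hypothetical data: "plan" is fully determined by "smoker"; "region" is independent.
rng = random.Random(0)
domains = {"smoker": ["no", "yes"], "plan": ["basic", "premium"],
           "region": ["east", "west"]}
rows = []
for _ in range(500):
    smoker = rng.choice(domains["smoker"])
    plan = "premium" if smoker == "yes" else "basic"
    rows.append({"smoker": smoker, "plan": plan,
                 "region": rng.choice(domains["region"])})

# Total budget spent in this sketch is 1.0 + 1.0 = 2.0 by sequential composition.
pair = select_pair(rows, domains, epsilon=1.0, rng=rng)
noisy = measure(rows, pair, domains, epsilon=1.0, rng=rng)
synthetic = generate(noisy, n=len(rows), rng=rng)
print(pair, synthetic[:3])
```

Because the strongly correlated pair has a much larger score than the others, the exponential mechanism picks it with overwhelming probability, and the synthetic rows reproduce that relationship without copying any original record directly.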