Practical Privacy with Synthetic Data
This post walks through the implementation of a practical attack on synthetic data models to measure unintended memorization in neural networks, as described in "Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks" by Nicholas Carlini et al. The attack is used to evaluate how well synthetic data models trained with various neural network and differential privacy parameter settings protect sensitive data and secrets in a dataset. The experiments use a smaller dataset containing sensitive location data, which is considered challenging to anonymize. Canary values are inserted into each model's training data, and each model's propensity to memorize and replay those canaries is then measured. The results show that differential privacy prevented memorization of secrets across all tested configurations, and that gradient clipping alone also prevented any replay of canary values with only a small loss in model accuracy.
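As a rough illustration of the canary technique described above (not the post's or Gretel's actual code), the Python sketch below shows one way to insert canary records into a training set and later measure how often a synthetic data model replays them verbatim. The function names, canary format, and toy location records are all assumptions made for the example.

```python
import random
import string


def make_canary(prefix="canary", length=8, seed=None):
    """Generate a synthetic secret to plant in the training data.
    The format is hypothetical; any out-of-distribution token works."""
    rng = random.Random(seed)
    secret = "".join(rng.choices(string.digits, k=length))
    return f"{prefix}: {secret}"


def insert_canaries(records, num_canaries=3, repetitions=1, seed=0):
    """Return a shuffled copy of the training records with canaries mixed in."""
    rng = random.Random(seed)
    canaries = [make_canary(seed=seed + i) for i in range(num_canaries)]
    augmented = list(records)
    for canary in canaries:
        augmented.extend([canary] * repetitions)
    rng.shuffle(augmented)
    return augmented, canaries


def replay_rate(generated_samples, canaries):
    """Fraction of canaries reproduced verbatim in the model's synthetic output.
    A non-zero rate indicates unintended memorization."""
    hits = sum(1 for c in canaries if any(c in g for g in generated_samples))
    return hits / len(canaries)


if __name__ == "__main__":
    # Toy stand-in for a sensitive location dataset.
    training_data = [f"user_{i} visited grid_cell_{i % 40}" for i in range(1000)]
    augmented, canaries = insert_canaries(training_data, num_canaries=3)

    # Train a synthetic data model on `augmented` here (not shown), then sample
    # from it and check how often the planted canaries come back.
    fake_samples = augmented[:200]  # placeholder for the model's generated output
    print("canary replay rate:", replay_rate(fake_samples, canaries))
```

In the actual experiment, this replay check would be run against each model configuration (with and without differential privacy or gradient clipping) to compare how often the planted secrets resurface.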
Company
Gretel.ai
Date published
April 27, 2021
Author(s)
Alex Watson
Word count
1003
Hacker News points
None found.
Language
English