Synthetic Data Privacy Metrics

Amy Steier; Lipika Ramaswamy; Andre Manoel; Alexa Haushalter

arXiv:2501.03941·cs.LG·January 8, 2025

TL;DR

This paper reviews popular privacy metrics for synthetic data, including those based on simulated adversarial attacks, points out that the field has no standard set of metrics, and surveys techniques such as differential privacy for making generative models more private.

Contribution

The paper reviews existing privacy metrics and current best practices for enhancing privacy in synthetic data generation, in response to the lack of standardized metrics.

Findings

01 Analysis of the pros and cons of popular privacy metrics

02 Discussion of adversarial attack simulations for privacy assessment

03 Overview of techniques such as differential privacy for enhancing the privacy of generative models
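To make the idea of an adversarial attack simulation concrete, here is a minimal sketch of one common style of attack, a distance-based membership inference test. It is an illustration under stated assumptions, not the procedure from the paper: the function names, the Euclidean distance, the threshold, and the toy data are all hypothetical. The simulated attacker guesses that a record was in the training set when it lies close to some synthetic record; accuracy near 0.5 on a balanced train/holdout split means the attacker does no better than chance.

```python
import math


def nearest_distance(record, dataset):
    """Euclidean distance from `record` to its closest row in `dataset`."""
    return min(math.dist(record, row) for row in dataset)


def membership_attack_accuracy(train, holdout, synthetic, threshold):
    """Simulate a distance-based membership inference attack.

    The attacker guesses "member" when a record lies within `threshold`
    of some synthetic record. Returns the fraction of correct guesses over
    `train` (true members) and `holdout` (true non-members).
    """
    correct = 0
    for rec in train:
        if nearest_distance(rec, synthetic) <= threshold:
            correct += 1  # member correctly flagged
    for rec in holdout:
        if nearest_distance(rec, synthetic) > threshold:
            correct += 1  # non-member correctly rejected
    return correct / (len(train) + len(holdout))


# Toy data, made up for illustration only.
train = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
holdout = [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)]
leaky_synth = [(0.0, 0.1), (1.0, 1.1), (2.0, 2.1)]  # near-copies of training rows
safer_synth = [(3.5, 3.5), (3.4, 3.6), (3.6, 3.4)]  # not close to either set

leaky_score = membership_attack_accuracy(train, holdout, leaky_synth, 0.5)
safer_score = membership_attack_accuracy(train, holdout, safer_synth, 0.5)
print(leaky_score, safer_score)
```

On this toy data the attack is always right against the near-copy synthetic set and only at chance level against the other one. In practice the threshold and distance function would need to be chosen for the data at hand, and a single accuracy number is only a rough empirical signal.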

Abstract

Recent advancements in generative AI have made it possible to create synthetic datasets that can be as accurate as real-world data for training AI models, powering statistical insights, and fostering collaboration with sensitive datasets while offering strong privacy guarantees. Effectively measuring the empirical privacy of synthetic data is an important step in the process. However, while there is a multitude of new privacy metrics being published every day, there currently is no standardization. In this paper, we review the pros and cons of popular metrics that include simulations of adversarial attacks. We also review current best practices for amending generative models to enhance the privacy of the data they create (e.g. differential privacy).
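The abstract names differential privacy but not a specific mechanism. One widely used way to train a model with differential privacy is DP-SGD, which clips each example's gradient and adds Gaussian noise before averaging. The sketch below shows only that aggregation step in plain Python; the function names, parameter names, and toy gradients are assumptions made for illustration, and the code does not track the privacy budget (epsilon, delta), which a real implementation would need a privacy accountant for.

```python
import math
import random


def clip(grad, max_norm):
    """Scale `grad` down so its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is proportional to the clipping norm.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(per_example_grads) for x in noisy]


rng = random.Random(0)  # seeded only so the sketch is repeatable
grads = [[3.0, 4.0], [0.3, 0.4]]  # hypothetical gradients with L2 norms 5.0 and 0.5

# With the noise switched off, the result is the mean of the clipped gradients.
noise_free = dp_average_gradient(grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)
# With noise on, each coordinate is perturbed before averaging.
noisy = dp_average_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)
print(noise_free, noisy)
```

Clipping bounds how much any single training record can move the update, and the noise masks what remains. The first gradient is scaled down to norm 1.0 while the second is left unchanged, so the noise-free average is approximately [0.45, 0.6].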

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Digital and Cyber Forensics · Big Data Technologies and Applications