Joint Selection: Adaptively Incorporating Public Information for Private Synthetic Data

Miguel Fuentes; Brett Mullins; Ryan McKenna; Gerome Miklau; Daniel Sheldon

arXiv:2403.07797 · cs.LG · March 13, 2024 · 1 citation

TL;DR

This paper introduces jam-pgm, a mechanism that adaptively chooses between measuring public data and private data within a graphical-model-based approach, improving the quality of differentially private synthetic data even when the public data is biased.

Contribution

The paper develops jam-pgm, which expands the adaptive measurements framework to jointly select between measuring public data and private data for differentially private synthetic data generation.

Findings

01 Outperforms both publicly assisted and non-publicly-assisted synthetic data generation mechanisms.

02 Remains effective even when the public data distribution is biased.

03 Allows public data to be included in a graphical-model-based mechanism.

Abstract

Mechanisms for generating differentially private synthetic data based on marginals and graphical models have been successful in a wide range of settings. However, one limitation of these methods is their inability to incorporate public data. Initializing a data-generating model by pre-training on public data has been shown to improve the quality of synthetic data, but this technique is not applicable when the model structure is not determined a priori. We develop the mechanism jam-pgm, which expands the adaptive measurements framework to jointly select between measuring public data and private data. This technique allows public data to be included in a graphical-model-based mechanism. We show that jam-pgm is able to outperform both publicly assisted and non-publicly-assisted synthetic data generation mechanisms, even when the public data distribution is biased.
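The idea of jointly selecting between public and private measurements can be illustrated with a minimal sketch. It is not the jam-pgm implementation: it assumes selection is done with a standard exponential mechanism over the union of "measure this marginal on private data" and "measure this marginal on public data" options. The function names (`exponential_mechanism`, `joint_select`), the score values, and the sensitivity are all hypothetical placeholders, since the abstract does not specify how candidates are scored.

```python
import math
import random


def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Sample an index with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    logits = [epsilon * s / (2.0 * sensitivity) for s in scores]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


def joint_select(candidates, private_scores, public_scores,
                 epsilon, sensitivity, rng):
    """Pick one (marginal, source) pair from a single candidate pool.

    Assumption: each marginal appears twice in the pool, once per data
    source, and both copies compete in the same private selection step.
    """
    options = ([(c, "private") for c in candidates]
               + [(c, "public") for c in candidates])
    scores = list(private_scores) + list(public_scores)
    return options[exponential_mechanism(scores, epsilon, sensitivity, rng)]


# Hypothetical usage: two candidate marginals with made-up scores.
rng = random.Random(0)
choice = joint_select(
    candidates=["age x income", "age x zipcode"],
    private_scores=[3.0, 1.0],
    public_scores=[2.5, 4.0],
    epsilon=0.5,
    sensitivity=1.0,
    rng=rng,
)
print(choice)
```

Placing both sources in one pool means a single private selection step decides both which marginal to measure and where to measure it, which is one plausible reading of "jointly select" in the abstract.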

Code & Models

Repositories

miguel-fuentes/jam_aistats (Official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Cryptography and Data Security

Methods: Network On Network