PRISM: Differentially Private Synthetic Data with Structure-Aware Budget Allocation for Prediction
Amir Asiaee, Chao Yan, Zachary B. Abrams, Bradley A. Malin

TL;DR
PRISM is a differentially private synthetic data generation method that uses structural knowledge to choose which features to measure and how to allocate noise among them, improving prediction accuracy under privacy constraints.
Contribution
It introduces a prediction-centric, structure-aware approach for DP synthetic data, formalizes the mechanism, and demonstrates improved accuracy over generic methods.
Findings
Task-aware allocation enhances prediction accuracy.
Targeting causal parents improves AUC under distribution shift.
PRISM outperforms generic synthesizers in predictive tasks.
Abstract
Differential privacy (DP) provides a mathematical guarantee limiting what an adversary can learn about any individual from released data. However, achieving this protection typically requires adding noise, and noise can accumulate when many statistics are measured. Existing DP synthetic data methods treat all features symmetrically, spreading noise uniformly even when the data will serve a specific prediction task. We develop a prediction-centric approach operating in three regimes depending on available structural knowledge. In the causal regime, when the causal parents of the prediction target are known and distribution shift is expected, we target the parents for robustness. In the graphical regime, when a Bayesian network structure is available and the distribution is stable, the Markov blanket of the target provides a sufficient feature set for optimal prediction. In the predictive regime, when no…
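The feature sets named in the abstract can be made concrete with a small sketch. The example below is an illustration only, not the PRISM mechanism: it assumes a hypothetical seven-node Bayesian network given as a child-to-parents mapping, with target node "Y", and computes the causal parents and the Markov blanket (parents, children, and the children's other parents). The helper `laplace_scale` is also an assumption for illustration: it shows the per-query Laplace noise scale when a total budget epsilon is split evenly over k sensitivity-1 queries under basic sequential composition, which is one simple way to see why measuring fewer features leaves less noise on each.

```python
from typing import Dict, List, Set

# Hypothetical DAG, stored as child -> list of parents. Node names are made up.
DAG: Dict[str, List[str]] = {
    "A": [], "B": [], "D": [], "F": [],
    "Y": ["A", "B"],   # A and B are the causal parents of the target Y
    "C": ["Y", "D"],   # C is a child of Y; D is a co-parent ("spouse")
    "E": ["C"],        # E is a descendant of Y but outside its Markov blanket
}

def parents(dag: Dict[str, List[str]], node: str) -> Set[str]:
    """Direct parents of `node`."""
    return set(dag.get(node, []))

def children(dag: Dict[str, List[str]], node: str) -> Set[str]:
    """Nodes that list `node` among their parents."""
    return {child for child, ps in dag.items() if node in ps}

def markov_blanket(dag: Dict[str, List[str]], node: str) -> Set[str]:
    """Parents, children, and the children's other parents of `node`."""
    kids = children(dag, node)
    spouses = {p for kid in kids for p in dag[kid]}
    return (parents(dag, node) | kids | spouses) - {node}

def laplace_scale(epsilon_total: float, num_queries: int,
                  sensitivity: float = 1.0) -> float:
    """Per-query Laplace scale when epsilon_total is split evenly across
    num_queries queries (basic sequential composition)."""
    return sensitivity * num_queries / epsilon_total

all_features = set(DAG) - {"Y"}
pa = parents(DAG, "Y")
mb = markov_blanket(DAG, "Y")

for name, feats in [("all features", all_features),
                    ("Markov blanket", mb),
                    ("causal parents", pa)]:
    scale = laplace_scale(1.0, len(feats))
    print(f"{name}: {sorted(feats)} -> Laplace scale per query {scale}")
```

In this toy network the Markov blanket drops E and F, and the causal-parent set keeps only A and B, so an even split of the same budget puts a smaller noise scale on each measured feature. How PRISM actually allocates its budget is not specified in the text above; the even split is used here only as a baseline.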
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Cryptography and Data Security
