Improved Generalization Guarantees in Restricted Data Models
Elbert Du, Cynthia Dwork

TL;DR
This paper shows that in data models where distant attributes are only weakly correlated, the differential privacy budget can be reused across different portions of the data, improving accuracy without weakening generalization guarantees.
Contribution
The paper introduces an approach for reusing privacy budget on different portions of data whose distant attributes are weakly correlated, improving the privacy-utility trade-off of differential privacy in adaptive data analysis.
Findings
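Reusing privacy budget on different portions of the data improves accuracy when distant attributes are only weakly correlated.
Weak correlation between distant attributes enables a better privacy-utility trade-off.
The accuracy gain comes without increasing the risk of overfitting, so generalization guarantees improve under the proposed data model.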
Abstract
Differential privacy is known to protect against threats to validity incurred due to adaptive, or exploratory, data analysis, even when the analyst adversarially searches for a statistical estimate that diverges from the true value of the quantity of interest on the underlying population. The cost of this protection is the accuracy loss incurred by differential privacy. In this work, inspired by standard models in the genomics literature, we consider data models in which individuals are represented by a sequence of attributes with the property that distant attributes are only weakly correlated. We show that, under this assumption, it is possible to "re-use" privacy budget on different portions of the data, significantly improving accuracy without increasing the risk of overfitting.
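To make the accuracy stakes of budget re-use concrete, the following is a minimal sketch of the noise-scale arithmetic only. It is not the paper's mechanism. It assumes attributes bounded in [0, 1], a Laplace-noised mean per block of adjacent attributes, and basic sequential composition for the baseline; the names `laplace_scale`, `laplace_noise`, and `noisy_block_means` are hypothetical. Under basic composition, a total budget split over k blocks leaves each block with budget total/k, so the noise scale grows by a factor of k. If the full budget could instead be spent on each block, which the abstract suggests is possible under the weak-correlation assumption, the per-block noise would stay at the single-query level. The sketch does not check that assumption or any privacy guarantee.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_scale(n, epsilon):
    # A mean over n individuals with values in [0, 1] changes by at most 1/n
    # when one individual changes, so the Laplace mechanism uses (1/n) / epsilon.
    return 1.0 / (n * epsilon)

def noisy_block_means(rows, block_size, eps_per_block, rng):
    # Hypothetical helper: one noisy mean per block of adjacent attributes.
    n, d = len(rows), len(rows[0])
    scale = laplace_scale(n, eps_per_block)
    out = []
    for start in range(0, d, block_size):
        vals = [x for row in rows for x in row[start:start + block_size]]
        out.append(sum(vals) / len(vals) + laplace_noise(scale, rng))
    return out

rng = random.Random(0)
n, d, block_size = 1000, 8, 2
rows = [[rng.random() for _ in range(d)] for _ in range(n)]
k = d // block_size
total_eps = 1.0

# Baseline: basic composition splits the budget across the k blocks.
split = noisy_block_means(rows, block_size, total_eps / k, rng)
# Assumed re-use: each block is charged the full budget (justified only
# under the paper's weak-correlation model, which is NOT verified here).
reused = noisy_block_means(rows, block_size, total_eps, rng)

scale_split = laplace_scale(n, total_eps / k)
scale_reused = laplace_scale(n, total_eps)
```

In this toy setup `scale_split` is k times `scale_reused`, which is the kind of accuracy gap that re-using budget is meant to close.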
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Statistical Methods and Inference · Advanced Causal Inference Techniques
