Generalization in the Face of Adaptivity: A Bayesian Perspective
Moshe Shenfeld, Katrina Ligett

TL;DR
This paper shows that simple noise-addition algorithms prevent overfitting in adaptive data analysis with variance-dependent guarantees, and it introduces a data-dependent stability notion that bounds the covariance between a new query and a Bayes factor-based measure of the information leaked by past responses.
Contribution
It introduces a novel characterization of the harm caused by adaptivity, shows that straightforward noise-addition algorithms suffice to obtain variance-dependent guarantees, and develops a data-dependent stability notion that bounds the resulting information leakage.
Findings
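Noise-addition algorithms provide variance-dependent guarantees that also extend to unbounded queries.
The harm of adaptivity is characterized by the covariance between a new query and a Bayes factor-based measure of how much information about the sample past responses encode.
A new data-dependent stability notion bounds this covariance, and with it the information leakage in adaptive analysis.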
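The link between adaptive overfitting and a covariance with a Bayes factor can be illustrated with a short worked identity. The notation here is an assumption made for illustration and may differ from the paper's: $D$ is the data distribution over single elements $x$, $r$ is a response (or transcript of responses) released by the mechanism, $D(\cdot \mid r)$ is the posterior over $x$ after observing $r$, and $K(x, r)$ is the Bayes factor. The identity follows from elementary probability and is not quoted from the paper.

```latex
% Bayes factor of element x with respect to response r (assumed notation)
K(x, r) \;=\; \frac{D(x \mid r)}{D(x)} \;=\; \frac{D(r \mid x)}{D(r)},
\qquad \mathbb{E}_{x \sim D}\big[K(x, r)\big] = 1 .

% Shift in the expected value of a query q once r has been observed
\mathbb{E}_{x \sim D(\cdot \mid r)}\big[q(x)\big] - \mathbb{E}_{x \sim D}\big[q(x)\big]
  \;=\; \mathbb{E}_{x \sim D}\big[q(x)\,K(x, r)\big]
        - \mathbb{E}_{x \sim D}\big[q(x)\big]\,\mathbb{E}_{x \sim D}\big[K(x, r)\big]
  \;=\; \operatorname{Cov}_{x \sim D}\big(q(x),\, K(x, r)\big).
```

Read this way, a query is only distorted by past responses to the extent that it co-varies with the Bayes factor, which is why a stability notion that bounds this covariance also bounds the effect of adaptivity.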
Abstract
Repeated use of a data sample via adaptively chosen queries can rapidly lead to overfitting, wherein the empirical evaluation of queries on the sample significantly deviates from their mean with respect to the underlying data distribution. It turns out that simple noise addition algorithms suffice to prevent this issue, and differential privacy-based analysis of these algorithms shows that they can handle an asymptotically optimal number of queries. However, differential privacy's worst-case nature entails scaling such noise to the range of the queries even for highly-concentrated queries, or introducing more complex algorithms. In this paper, we prove that straightforward noise-addition algorithms already provide variance-dependent guarantees that also extend to unbounded queries. This improvement stems from a novel characterization that illuminates the core problem of adaptive data…
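To make the setting in the abstract concrete, the following is a minimal sketch of a noise-addition mechanism answering adaptively chosen queries. It is an illustration under stated assumptions, not the paper's algorithm: the class name `NoisyQueryMechanism`, the toy analyst, and the value of `noise_std` are all hypothetical, and the sketch makes no attempt to reproduce the paper's noise calibration or guarantees.

```python
import random
from statistics import fmean


class NoisyQueryMechanism:
    """Answers each query with its empirical mean on the sample plus Gaussian noise."""

    def __init__(self, sample, noise_std, seed=0):
        self.sample = list(sample)
        # Hypothetical noise scale; how to calibrate it is not taken from the source.
        self.noise_std = noise_std
        self.rng = random.Random(seed)

    def answer(self, query):
        # Empirical evaluation of the query on the held sample.
        empirical = fmean([query(x) for x in self.sample])
        # The analyst only ever sees the noisy response.
        return empirical + self.rng.gauss(0.0, self.noise_std)


def run_adaptive_analyst(mechanism, num_features):
    """Toy analyst: the sign of each query depends on the previous noisy response."""
    responses = []
    for j in range(num_features):
        # Adaptivity: the next query is chosen as a function of past responses.
        sign = 1.0 if not responses or responses[-1] >= 0 else -1.0
        responses.append(mechanism.answer(lambda x, j=j, s=sign: s * x[j]))
    return responses


# Synthetic sample: 500 points with 3 independent standard-normal features.
data_rng = random.Random(1)
sample = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]

mechanism = NoisyQueryMechanism(sample, noise_std=0.05)
responses = run_adaptive_analyst(mechanism, num_features=3)
```

Because the analyst never sees the exact empirical means, later queries can depend on the sample only through the noisy responses, which is the channel the paper's analysis controls.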
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Distributed Sensor Networks and Detection Algorithms
