Calibrating Noise to Variance in Adaptive Data Analysis
Vitaly Feldman, Thomas Steinke

TL;DR
This paper introduces a relaxed stability notion for adaptive data analysis. Under this notion, calibrating noise to each query's variance yields more accurate answers to statistical queries, especially low-variance ones.
Contribution
It proposes a new stability framework that allows for adaptive composition and improves accuracy guarantees for low-variance queries using simple noise calibration.
Findings
Enhanced accuracy for low-variance queries
Simple noise scaling algorithm based on query standard deviation
Improves on previous median-of-means approaches
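
The noise-scaling idea in the findings above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `answer_query`, the parameter `noise_multiplier`, and the use of the empirical standard deviation as a stand-in for the query's standard deviation are all assumptions made for illustration, and the paper's actual calibration (constants, dependence on sample size and number of queries) is not reproduced here.

```python
import numpy as np

def answer_query(data, query, noise_multiplier=0.1, rng=None):
    """Answer a statistical query with Gaussian noise scaled to its spread.

    Hypothetical sketch: `noise_multiplier` is an illustrative knob, not a
    constant taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Evaluate the query on every data point.
    values = np.array([query(x) for x in data], dtype=float)
    empirical_mean = values.mean()
    # Empirical standard deviation stands in for the query's true one.
    empirical_std = values.std()
    # Low-variance queries receive proportionally less noise.
    noise_scale = noise_multiplier * empirical_std
    return empirical_mean + rng.normal(0.0, noise_scale)
```

Under these assumptions, a query whose values barely vary across the dataset is answered with almost no distortion, whereas a fixed noise scale would perturb every query equally regardless of its variance.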
Abstract
Datasets are often used multiple times and each successive analysis may depend on the outcome of previous analyses. Standard techniques for ensuring generalization and statistical validity do not account for this adaptive dependence. A recent line of work studies the challenges that arise from such adaptive data reuse by considering the problem of answering a sequence of "queries" about the data distribution where each query may depend arbitrarily on answers to previous queries. The strongest results obtained for this problem rely on differential privacy -- a strong notion of algorithmic stability with the important property that it "composes" well when data is reused. However, the notion is rather strict, as it requires stability under replacement of an arbitrary data element. The simplest algorithm is to add Gaussian (or Laplace) noise to distort the empirical answers. However,…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Cryptography and Data Security
