Algorithmic Stability for Adaptive Data Analysis

Raef Bassily; Kobbi Nissim; Adam Smith; Thomas Steinke; Uri Stemmer; Jonathan Ullman

arXiv:1511.02513·cs.LG·November 10, 2015

TL;DR

This paper gives improved bounds on the sample complexity of answering adaptively chosen queries of several types, using stability notions such as differential privacy.

Contribution

It offers new, simplified upper bounds on the sample size needed to answer adaptively chosen queries, extends them to low-sensitivity and optimization queries, and explores stability notions beyond differential privacy.

Findings

01. Improved bounds on sample complexity for statistical queries.

02. First bounds for low-sensitivity and optimization queries.

03. Extended stability analysis beyond differential privacy.

Abstract

Adaptivity is an important feature of data analysis: the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model, where all questions are specified before the dataset is drawn. Recent work by Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014) initiated the formal study of this problem, and gave the first upper and lower bounds on the achievable generalization error for adaptive data analysis. Specifically, suppose there is an unknown distribution $P$ and a set of $n$ independent samples $x$ is drawn from $P$. We seek an algorithm that, given $x$ as input, accurately answers a sequence of adaptively chosen queries about the unknown distribution $P$. How many samples $n$ must we draw from the distribution, as a…
