Tight Bounds for Answering Adaptively Chosen Concentrated Queries
Emma Rapoport, Edith Cohen, Uri Stemmer

TL;DR
This paper establishes fundamental limits on answering adaptively chosen concentrated queries over correlated data: it shows that the known utility gap relative to the independent setting is unavoidable, and it provides simplified algorithms that match these bounds.
Contribution
The paper proves that the utility gap in the concentrated queries framework is inherent under natural conditions, and offers simplified algorithms that achieve these bounds.
Findings
The utility gap between the concentrated queries framework and the independent setting is unavoidable under the current formulation.
Existing algorithms are essentially optimal under the current framework.
Simplified algorithms match the proven impossibility bounds.
Abstract
Most work on adaptive data analysis assumes that samples in the dataset are independent. When correlations are allowed, even the non-adaptive setting can become intractable, unless some structural constraints are imposed. To address this, Bassily and Freund [2016] introduced the elegant framework of concentrated queries, which requires the analyst to restrict itself to queries that are concentrated around their expected value. While this assumption makes the problem trivial in the non-adaptive setting, in the adaptive setting it remains quite challenging. In fact, all known algorithms in this framework support significantly fewer queries than in the independent case: at most O(n) queries for a sample of size n, compared to O(n^2) in the independent setting. In this work, we prove that this utility gap is inherent under the current formulation of the concentrated queries…
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Database Systems and Queries · Privacy-Preserving Technologies in Data
