Sampling Without Compromising Accuracy in Adaptive Data Analysis
Benjamin Fish, Lev Reyzin, Benjamin I. P. Rubinstein

TL;DR
This paper introduces sampling techniques that speed up adaptive data analysis mechanisms without reducing their accuracy, so that large numbers of adaptive queries over large datasets can be answered efficiently without requiring additional data.
Contribution
The authors present novel sampling-based mechanisms that accelerate adaptive query answering with provable accuracy guarantees, applicable to arbitrary statistical queries and convex optimization.
Findings
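Polynomial speed-up per query over previous mechanisms, with no increase in the total data needed for the same generalization error
Statistically meaningful responses when the mechanism sees only a constant number of samples per query
Simple, fast, and unified approach for adaptively optimizing convex and strongly convex functions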
Abstract
In this work, we study how to use sampling to speed up mechanisms for answering adaptive queries into datasets without reducing the accuracy of those mechanisms. This is important to do when both the datasets and the number of queries asked are very large. In particular, we describe a mechanism that provides a polynomial speed-up per query over previous mechanisms, without needing to increase the total amount of data required to maintain the same generalization error as before. We prove that this speed-up holds for arbitrary statistical queries. We also provide an even faster method for achieving statistically-meaningful responses wherein the mechanism is only allowed to see a constant number of samples from the data per query. Finally, we show that our general results yield a simple, fast, and unified approach for adaptively optimizing convex and strongly convex functions over a…
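As a rough illustration of the idea in the abstract, the sketch below answers a statistical query from a small random subsample instead of the full dataset, then perturbs the result with Laplace noise. It is a generic sketch, not the authors' mechanism: the function name `subsampled_query_answer`, the parameters `k` and `noise_scale`, and the choice of Laplace noise are all assumptions made for illustration, and no accuracy guarantee from the paper is claimed for these settings.

```python
import random


def subsampled_query_answer(data, query, k, noise_scale, rng):
    """Estimate the mean of `query` over `data` using only k sampled points.

    Hypothetical helper for illustration; not taken from the paper.
    `query` maps a data point to a value in [0, 1] (a statistical query).
    """
    # Draw k points uniformly at random, with replacement.
    sample = [rng.choice(data) for _ in range(k)]
    # Per-query cost is k evaluations of `query`, independent of len(data).
    estimate = sum(query(x) for x in sample) / k
    if noise_scale > 0:
        # Laplace(0, b) noise as the difference of two exponentials with mean b.
        rate = 1.0 / noise_scale
        estimate += rng.expovariate(rate) - rng.expovariate(rate)
    return estimate


if __name__ == "__main__":
    rng = random.Random(0)
    data = list(range(100_000))
    # Fraction of points below 30,000; the full-data answer is 0.3.
    answer = subsampled_query_answer(
        data, lambda x: 1.0 if x < 30_000 else 0.0, k=2_000, noise_scale=0.01, rng=rng
    )
    print(f"noisy subsampled answer: {answer:.3f}")
```

The point of the sketch is the cost model: each query touches `k` points rather than the whole dataset, which is where a per-query speed-up would come from when the dataset is large.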
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Distributed Sensor Networks and Detection Algorithms · Mobile Crowdsensing and Crowdsourcing
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
