
TL;DR
This paper studies how many samples stochastic convex optimization needs when every gradient estimate along the trajectory must be accurate, and shows that this requirement demands more samples than the classical bounds, which tolerate inaccurate gradients.
Contribution
It proves lower bounds on the sample complexity of gradient-based optimization when gradient estimates must be accurate, including against potentially malicious queries, suggesting that inaccurate gradient estimates may be necessary to achieve the optimal rate.
Findings
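A general analyst, whose queries may be chosen maliciously, requires Ω(1/ε^3) samples
Under certain assumptions on the oracle, Ω̃(1/ε^{2.5}) samples are needed (up to logarithmic factors)
Classical sample-complexity bounds, which place no accuracy requirement on the gradients, are more optimistic than these lower bounds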
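To get a feel for the gap between these rates, the sketch below evaluates the three scalings for a few accuracy levels ε. It is only an illustration: constants and logarithmic factors are ignored, the classical rate is assumed to be the standard 1/ε^2 for stochastic convex optimization, and the function name `sample_scalings` and its dictionary keys are hypothetical.

```python
def sample_scalings(eps: float) -> dict:
    """Return the three sample-count scalings in 1/eps.

    Constants and log factors are dropped, so these values show
    growth rates only, not actual sample sizes.
    """
    return {
        "classical": eps ** -2,             # assumed classical rate 1/eps^2
        "oracle_assumptions": eps ** -2.5,  # lower bound under oracle assumptions
        "general_analyst": eps ** -3,       # lower bound for a general analyst
    }

for eps in (0.1, 0.01, 0.001):
    scalings = sample_scalings(eps)
    row = ", ".join(f"{name}={value:.3g}" for name, value in scalings.items())
    print(f"eps={eps}: {row}")
```

For any ε below 1 the three values are strictly increasing in the order listed, and the ratio between the general-analyst scaling and the classical one grows like 1/ε.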
Abstract
The study of adaptive data analysis examines how many statistical queries can be answered accurately using a fixed dataset while avoiding false discoveries (statistically inaccurate answers). In this paper, we tackle a question that precedes the field of study: Is data only valuable when it provides accurate answers to statistical queries? To answer this question, we use Stochastic Convex Optimization as a case study. In this model, algorithms are considered as analysts who query an estimate of the gradient of a noisy function at each iteration and move towards its minimizer. It is known that O(1/ε^2) examples can be used to minimize the objective function, but none of the existing methods depend on the accuracy of the estimated gradients along the trajectory. Therefore, we ask: How many samples are needed to minimize a noisy convex function if we require ε-accurate…
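The interaction model described in the abstract, an analyst that repeatedly queries a gradient estimate and steps towards the minimizer, can be sketched as a simple loop. The example below is a toy under stated assumptions, not the paper's construction: the one-dimensional objective F(w) = E[0.5 (w - z)^2], the Gaussian data, the empirical-average oracle, and the names `make_oracle` and `gradient_descent` are all hypothetical choices made for illustration, and the toy does not reproduce the lower-bound arguments.

```python
import random

def make_oracle(data):
    """Build a gradient oracle for F(w) = E[0.5 * (w - z)^2].

    Every query is answered with the empirical gradient computed
    from the same fixed dataset (a hypothetical choice of oracle).
    """
    sample_mean = sum(data) / len(data)

    def oracle(w):
        # d/dw of 0.5 * (w - z)^2 is (w - z); averaging over the data
        # gives w minus the sample mean.
        return w - sample_mean

    return oracle

def gradient_descent(oracle, w0=0.0, step=0.5, iters=50):
    """Analyst: query the oracle at the current point, then step against it."""
    w = w0
    for _ in range(iters):
        w -= step * oracle(w)
    return w

rng = random.Random(0)
data = [rng.gauss(1.0, 1.0) for _ in range(1000)]  # hypothetical samples z ~ N(1, 1)
w_hat = gradient_descent(make_oracle(data))
print(w_hat, sum(data) / len(data))
```

In this toy the iterate converges to the sample mean, so the quality of the final answer is governed by how well the fixed dataset estimates the true gradient at the queried points.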
