On efficient robust regression with subquadratic samples
Deeksha Adil, Jarosław Błasiok, Hongjie Chen, Deepak Narayanan Sridharan

TL;DR
This paper presents a near-linear-time algorithm for robust linear regression with Gaussian covariates, achieving improved prediction error bounds and analyzing fundamental trade-offs among sample complexity, condition number, and efficiency.
Contribution
It introduces a new efficient algorithm with near-linear runtime that improves prediction error bounds and provides lower bounds illustrating the limits of efficient algorithms.
Findings
The algorithm uses $\tilde{O}(d/\epsilon^4)$ samples and achieves $O(\sqrt{\epsilon\kappa})$ prediction error.
SQ lower bounds show that achieving error below $O(\sqrt{\epsilon\kappa})$ requires $\Omega(d^2)$ samples.
Polynomial lower bounds suggest that without certain assumptions, algorithms need significantly more samples to outperform trivial estimators.
Abstract
We revisit the problem of robust linear regression under Gaussian covariates with an unknown covariance matrix of condition number $\kappa$. For this fundamental problem, significant gaps remain in our understanding of the trade-offs among sample complexity, condition number, runtime, and prediction error for efficient algorithms. Our first result is a near-linear-time algorithm that uses $\tilde{O}(d/\epsilon^4)$ samples, where $d$ is the dimension and $\epsilon$ is the corruption rate, and achieves prediction error $O(\sqrt{\epsilon\kappa})$ under the condition , improving over all prior works. We complement this result with a Statistical Query (SQ) lower bound showing that efficient SQ algorithms achieving error when require queries that take $\Omega(d^2)$ samples to simulate. Finally, we prove a…
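To make the problem setup concrete, here is a minimal Python sketch of a corrupted regression instance. It is not the paper's algorithm: it only draws Gaussian covariates whose covariance has a chosen condition number (`kappa`), corrupts a fraction `eps` of the labels with a simple toy adversary, and measures how far ordinary least squares lands from the truth. The function names, the unit-variance label noise, the specific corruption rule, and the covariance-weighted definition of prediction error are all assumptions made for illustration.

```python
import numpy as np

def make_corrupted_regression(n, d, kappa, eps, rng):
    """Toy instance: Gaussian covariates, an eps-fraction of labels corrupted."""
    # Covariance with eigenvalues spread over [1, kappa], so cond(sigma) = kappa (d >= 2).
    eigvals = np.linspace(1.0, kappa, d)
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    sigma = (q * eigvals) @ q.T
    sigma_half = (q * np.sqrt(eigvals)) @ q.T  # symmetric square root of sigma

    X = rng.standard_normal((n, d)) @ sigma_half  # rows are N(0, sigma)
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)
    y_clean = X @ beta + rng.standard_normal(n)  # assumed unit-variance noise

    # Hypothetical adversary: flip and inflate the labels of the first eps*n samples.
    y_bad = y_clean.copy()
    m = int(eps * n)
    y_bad[:m] = -10.0 * (X[:m] @ beta)
    return X, y_clean, y_bad, beta, sigma

def ols(X, y):
    """Ordinary least squares, a non-robust baseline."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def prediction_error(beta_hat, beta, sigma):
    """Covariance-weighted distance sqrt((b - b*)^T sigma (b - b*)); assumed error metric."""
    diff = beta_hat - beta
    return float(np.sqrt(diff @ sigma @ diff))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y_clean, y_bad, beta, sigma = make_corrupted_regression(2000, 5, 10.0, 0.1, rng)
    print("OLS error, clean labels:    ", prediction_error(ols(X, y_clean), beta, sigma))
    print("OLS error, corrupted labels:", prediction_error(ols(X, y_bad), beta, sigma))
```

Comparing the two printed errors shows why a robust estimator is needed: the same least-squares fit that works on clean labels degrades once a small fraction of them is adversarially replaced.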
