On efficient robust regression with subquadratic samples

Deeksha Adil; Jarosław Błasiok; Hongjie Chen; Deepak Narayanan Sridharan

arXiv:2605.18042·cs.DS·May 19, 2026

TL;DR

This paper presents a near-linear-time algorithm for robust linear regression with Gaussian covariates, achieving improved prediction error bounds and analyzing fundamental trade-offs among sample complexity, condition number, and efficiency.

Contribution

The paper introduces an efficient algorithm with near-linear runtime and improved prediction error bounds, and proves lower bounds that delineate what efficient algorithms can achieve.

Findings

01

The algorithm uses $\tilde{O}(d/\epsilon^4)$ samples and achieves $O(\sqrt{\epsilon \kappa})$ error.

02

SQ lower bounds show that achieving error below $O(\sqrt{\epsilon \kappa})$ requires $\Omega(d^2)$ samples.

03

Polynomial lower bounds suggest that without certain assumptions, algorithms need significantly more samples to outperform trivial estimators.

Abstract

We revisit the problem of robust linear regression under Gaussian covariates with an unknown covariance matrix of condition number $κ$. For this fundamental problem, significant gaps remain in our understanding of the trade-offs among sample complexity, condition number, runtime, and prediction error for efficient algorithms. Our first result is a near-linear-time algorithm that uses $O(d / ϵ^{4})$ samples, where $d$ is the dimension and $ϵ$ is the corruption rate, and achieves prediction error $O(ϵ κ)$ under the condition $ϵ κ ≲ 1$, improving over all prior works. We complement this result with a Statistical Query (SQ) lower bound showing that efficient SQ algorithms achieving error $o(ϵ κ)$ when $ϵ κ ≲ 1$ require queries that take $Ω(d^{2})$ samples to simulate. Finally, we prove a…
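
To make the problem setting concrete, here is a minimal Python sketch of the data model the abstract describes: Gaussian covariates whose covariance has condition number $κ$, with an $ϵ$ fraction of responses corrupted. It is not the paper's algorithm. The function names, the eigenvalue spread, the simple (non-optimal) adversary, and the definition of prediction error as $\|Σ^{1/2}(\hat{β} - β)\|_2$ are all assumptions made for illustration. Ordinary least squares serves only as a non-robust baseline whose error grows under corruption.

```python
import numpy as np


def make_corrupted_regression(n, d, eps, kappa, noise_std=1.0, seed=0):
    """Sample (X, y) with Gaussian covariates and an eps fraction of corrupted responses.

    All modelling choices here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Covariance with eigenvalues spread over [1, kappa] in a random orthonormal basis,
    # so its condition number is kappa.
    eigvals = np.linspace(1.0, kappa, d)
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    sigma = (q * eigvals) @ q.T
    # X = Z A^T with A = Q diag(sqrt(eigvals)), hence Cov(x) = A A^T = sigma.
    x = rng.standard_normal((n, d)) @ (q * np.sqrt(eigvals)).T
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)
    y = x @ beta + noise_std * rng.standard_normal(n)
    # Hypothetical adversary: on an eps fraction of samples, replace the response
    # with one generated by the wrong regressor -5 * beta.
    m = int(eps * n)
    idx = rng.choice(n, size=m, replace=False)
    y[idx] = x[idx] @ (-5.0 * beta)
    return x, y, beta, sigma


def prediction_error(beta_hat, beta, sigma):
    """Assumed error metric: ||sigma^{1/2} (beta_hat - beta)||_2."""
    diff = beta_hat - beta
    return float(np.sqrt(diff @ sigma @ diff))


def ols(x, y):
    """Ordinary least squares, used here as a non-robust baseline."""
    return np.linalg.lstsq(x, y, rcond=None)[0]


if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1):
        x, y, beta, sigma = make_corrupted_regression(n=5000, d=10, eps=eps, kappa=4.0)
        err = prediction_error(ols(x, y), beta, sigma)
        print(f"eps={eps:.2f}  OLS prediction error={err:.3f}")
```

Running the script prints the baseline's error at several corruption rates. A robust estimator of the kind the abstract describes would aim to keep this error bounded in terms of $ϵ$ and $κ$ rather than letting it scale with the size of the corruptions.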
