Active Linear Regression for $\ell_p$ Norms and Beyond
Cameron Musco, Christopher Musco, David P. Woodruff, Taisuke Yasuda

TL;DR
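This paper develops active sampling algorithms for $\ell_p$ norm linear regression for every $0<p<\infty$, with query complexity that is optimal up to log factors for $0<p<2$ and optimal in its dependence on the dimension for $2<p<\infty$. It improves on prior sample complexity bounds and extends the approach to robust loss functions and losses of polynomial growth, with applications to subspace approximation and dimension reduction.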
Contribution
It introduces the first near-optimal active sampling bounds for $\ell_p$ regression for all $p$, and establishes new sensitivity bounds for loss functions of polynomial growth, enabling efficient algorithms for robust regression.
Findings
Query complexity bounds for $\ell_p$ regression that are optimal up to log factors for $0<p<2$ and optimal in the dimension for $2<p<\infty$.
First total sensitivity bound for polynomial growth loss functions.
Sublinear time algorithms for Kronecker product regression under all $\ell_p$ norms.
Abstract
We study active sampling algorithms for linear regression, which aim to query only a few entries of a target vector $b\in\mathbb{R}^n$ and output a near minimizer to $\min_{x\in\mathbb{R}^d}\|Ax-b\|$, for a design matrix $A\in\mathbb{R}^{n\times d}$ and loss $\|\cdot\|$. For $\ell_p$ norm regression for any $0<p<\infty$, we give an algorithm based on Lewis weight sampling outputting a $(1+\epsilon)$-approximate solution using just $\tilde O(d/\epsilon^2)$ queries to $b$ for $p\in(0,1)$, $\tilde O(d/\epsilon)$ queries for $1<p<2$, and $\tilde O(d^{p/2}/\epsilon^p)$ queries for $2<p<\infty$. For $0<p<2$, our bounds are optimal up to log factors, settling the query complexity for this range. For $2<p<\infty$, our dependence on $d$ is optimal, while our dependence on $\epsilon$ is off by at most $\epsilon$, up to log factors. Our result resolves an open question of [CD21], who gave near…
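To make the sampling idea concrete, here is a minimal Python sketch of one-shot Lewis weight sampling. It is an illustration under stated assumptions, not the paper's full algorithm. The weights are approximated with the standard fixed-point iteration, which is known to converge for $p<4$. The reduced problem is solved only for $p=2$, where Lewis weights coincide with leverage scores and ordinary least squares applies. The function names and the `query_b` callback are hypothetical, chosen for illustration.

```python
import numpy as np

def lewis_weights(A, p, n_iter=30):
    """Approximate l_p Lewis weights via the fixed-point iteration
    w_i <- (a_i^T (A^T W^{1-2/p} A)^{-1} a_i)^{p/2}  (assumes p < 4, full column rank)."""
    n, d = A.shape
    w = np.full(n, d / n)  # uniform start; the weights sum to d at the fixed point
    for _ in range(n_iter):
        M = A.T @ (A * (w ** (1.0 - 2.0 / p))[:, None])
        # tau_i = a_i^T M^{-1} a_i for every row i
        tau = np.einsum("ij,ij->i", A @ np.linalg.inv(M), A)
        w = tau ** (p / 2.0)
    return w

def sample_rows(w, m, p, rng):
    """Draw m row indices with probability proportional to w and return
    the l_p rescaling factors (1 / (m * prob_i))^(1/p)."""
    probs = w / w.sum()
    idx = rng.choice(len(w), size=m, replace=True, p=probs)
    scale = (1.0 / (m * probs[idx])) ** (1.0 / p)
    return idx, scale

def active_l2_regression(A, query_b, m, rng):
    """Illustrative p = 2 case: read b only at the sampled rows, then
    solve the rescaled least-squares problem on those rows."""
    w = lewis_weights(A, p=2)  # equals the leverage scores when p = 2
    idx, scale = sample_rows(w, m, p=2, rng=rng)
    b_sampled = query_b(idx)   # the only access to the target vector
    x, *_ = np.linalg.lstsq(scale[:, None] * A[idx], scale * b_sampled, rcond=None)
    return x, np.unique(idx).size  # solution and number of distinct queries
```

A caller would pass something like `query_b=lambda idx: b[idx]`, so that the number of distinct indices requested is the query count. For $p\neq 2$ the same sampling step applies, but the reduced problem needs an $\ell_p$ solver, and the $(1+\epsilon)$ guarantees quoted in the abstract rely on analysis beyond this one-shot sketch.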
Taxonomy
Topics: Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods · Complexity and Algorithms in Graphs
Methods: Huber loss
