Coresets for Multiple $\ell_p$ Regression
David P. Woodruff, Taisuke Yasuda

TL;DR
This paper introduces nearly size-independent coresets for multiple $\ell_p$ regression, enabling efficient approximation of regression objectives and subspace problems for various $p$ values, with tight bounds and practical applications.
Contribution
It constructs nearly dimension-free coresets for multiple $\ell_p$ regression, improving prior bounds and providing tight theoretical guarantees for various $p$ and applications.
Findings
Coresets of size $\tilde O(\text{poly}(1/\varepsilon, d))$ for both $p<2$ and $p>2$ are shown to be nearly tight.
Tight sample complexity bounds are established for $\ell_p$ Euclidean power means across different $p$.
For $1<p<2$, small subsets of rows are proven to exist that span nearly optimal $\ell_p$ subspaces.
Abstract
A coreset of a dataset with $n$ examples and $d$ features is a weighted subset of examples that is sufficient for solving downstream data analytic tasks. Nearly optimal constructions of coresets for least squares and $\ell_p$ linear regression with a single response are known in prior work. However, for multiple $\ell_p$ regression where there can be $m$ responses, there are no known constructions with size sublinear in $m$. In this work, we construct coresets of size $\tilde O(\varepsilon^{-2}d)$ for $p<2$ and $\tilde O(\varepsilon^{-p}d^{p/2})$ for $p>2$ independently of $m$ (i.e., dimension-free) that approximate the multiple $\ell_p$ regression objective at every point in the domain up to $(1\pm\varepsilon)$ relative error. If we only need to preserve the minimizer subject to a subspace constraint, we improve these bounds by an $\varepsilon$ factor for all $p$. All of our bounds…
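To make the "weighted subset that approximates the objective at every point" idea concrete, here is a minimal sketch in plain Python. It assumes the multiple $\ell_p$ regression objective is the entrywise sum $\sum_i \sum_j |(AX - B)_{ij}|^p$, which is an assumption of this example rather than something stated on this page. The helper names `lp_objective` and `relative_error` are hypothetical, and the rows and weights are picked by hand; this is not the authors' sampling procedure.

```python
def lp_objective(A, B, X, p, weights=None):
    """Weighted entrywise l_p objective: sum_i w_i * sum_j |(A X - B)[i][j]|^p.

    A: n x d list of rows, B: n x m list of rows, X: d x m candidate solution.
    (Assumed form of the objective; see the lead-in.)
    """
    if weights is None:
        weights = [1.0] * len(A)
    total = 0.0
    for a_i, b_i, w_i in zip(A, B, weights):
        for j in range(len(b_i)):
            pred = sum(a_i[k] * X[k][j] for k in range(len(a_i)))
            total += w_i * abs(pred - b_i[j]) ** p
    return total


def relative_error(A, B, rows, weights, X, p):
    """Relative error of a weighted row subset against the full objective at X."""
    full = lp_objective(A, B, X, p)
    A_sub = [A[i] for i in rows]
    B_sub = [B[i] for i in rows]
    approx = lp_objective(A_sub, B_sub, X, p, weights)
    return abs(approx - full) / full


# Toy data: rows 0 and 1 are identical, so keeping row 0 with weight 2
# and row 2 with weight 1 reproduces the full objective at any X.
A = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
B = [[1.0, 2.0], [1.0, 2.0], [0.0, 1.0]]
X = [[0.5, 1.0], [1.0, -1.0]]

print(relative_error(A, B, rows=[0, 2], weights=[2.0, 1.0], X=X, p=1.5))
```

A coreset in the sense of the abstract would need this relative error to stay within $\varepsilon$ for every $X$ simultaneously, not only at the single test point evaluated here.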
Taxonomy
Topics: Face and Expression Recognition
Methods: Coresets · Linear Regression
