Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation
Alexander Munteanu, Simon Omlor

TL;DR
This paper introduces an improved sensitivity sampling method for $\ell_p$ subspace embeddings by augmenting $\ell_p$ sensitivities with $\ell_2$ sensitivities, achieving optimal linear sampling complexity for all $p \in [1,2]$.
Contribution
It demonstrates that augmenting $\ell_p$ sensitivities with $\ell_2$ sensitivities yields optimal sampling bounds, resolving an open question and improving previous bounds for $\ell_p$ subspace embeddings.
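The augmentation idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the additive combination of scores, the i.i.d. sampling with replacement, and the reweighting by $(1/(m q_i))^{1/p}$ are all assumptions made for illustration, following the standard importance-sampling recipe. The $\ell_p$ scores are taken as caller-supplied input, since computing them is a separate problem; the $\ell_2$ sensitivities are computed as leverage scores from a thin QR decomposition, assuming the matrix has full column rank.

```python
import numpy as np


def l2_leverage_scores(A):
    """l2 sensitivities (leverage scores) of the rows of A.

    Assumes A has full column rank, so the squared row norms of the
    thin-QR factor Q are the leverage scores and sum to A.shape[1].
    """
    Q, _ = np.linalg.qr(A)  # reduced QR: Q has shape (n, d)
    return np.sum(Q ** 2, axis=1)


def augmented_sample(A, lp_scores, m, p, rng):
    """Sample m rows with probability proportional to lp_scores + leverage.

    Hypothetical helper: lp_scores are caller-supplied upper bounds on the
    l_p sensitivities. Each sampled row is scaled by (1 / (m * q_i))^(1/p),
    so the p-th power of the sampled l_p norm is unbiased for ||Ax||_p^p.
    """
    lev = l2_leverage_scores(A)
    scores = np.asarray(lp_scores, dtype=float) + lev  # the l2 augmentation
    q = scores / scores.sum()                          # sampling distribution
    idx = rng.choice(A.shape[0], size=m, replace=True, p=q)
    weights = (1.0 / (m * q[idx])) ** (1.0 / p)
    return weights[:, None] * A[idx], idx, weights
```

As a usage example, a crude stand-in for the $\ell_p$ scores when $p \le 2$ is `l2_leverage_scores(A) ** (p / 2)`, which upper-bounds the $\ell_p$ sensitivities because $\|y\|_p \ge \|y\|_2$ for $p \le 2$; tighter estimates would give smaller samples.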
Findings
Achieved $\tilde O(\varepsilon^{-2}(S+d))$ sampling complexity for all $p \in [1,2]$.
Resolved an open question by Woodruff & Yasuda (2023c).
Provided improved bounds for logistic regression sensitivity sampling.
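For reference, the quantities in the sampling bound can be written out as follows. This uses the standard definitions from the sensitivity sampling literature and assumes the usual notation, which the summary itself does not spell out: a data matrix $A \in \mathbb{R}^{n \times d}$ with rows $a_i$, approximation parameter $\varepsilon$, and total sensitivity $S$.

$$
s_i^{(p)}(A) = \sup_{x \in \mathbb{R}^d,\ Ax \neq 0} \frac{|a_i^\top x|^p}{\|Ax\|_p^p},
\qquad
S = \sum_{i=1}^{n} s_i^{(p)}(A).
$$

For $p = 2$ the sensitivities coincide with the statistical leverage scores, which sum to the rank of $A$, so that term is at most $d$.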
Abstract
Data subsampling is one of the most natural methods to approximate a massively large data set by a small representative proxy. In particular, sensitivity sampling, which samples points proportional to an individual importance measure called sensitivity, has received a lot of attention. In very general settings, this framework reduces the size of the data to roughly the VC dimension times the total sensitivity while providing strong guarantees on the quality of approximation. The recent work of Woodruff & Yasuda (2023c) improved substantially over the general bound for the important problem of $\ell_p$ subspace embeddings for $p \in [1,2]$. Their result was subsumed by an earlier bound which was implicitly given in the work…
Taxonomy
Topics: Sparse and Compressive Sensing Techniques
Methods: Sparse Evolutionary Training
