Performance of $\ell_1$ Regularization for Sparse Convex Optimization
Kyriakos Axiotis, Taisuke Yasuda

TL;DR
This paper provides the first recovery guarantees for the Group LASSO in sparse convex optimization with vector-valued features, explaining its empirical success and connecting it to feature selection algorithms like Orthogonal Matching Pursuit.
Contribution
It establishes recovery guarantees for Group LASSO under restricted strong convexity and smoothness, extending theoretical understanding beyond statistical settings.
Findings
Recovery guarantees for Group LASSO with vector features.
Repeatedly applying Group LASSO with a sufficiently large regularization selects the same features as Orthogonal Matching Pursuit.
New results for column subset selection with general loss functions.
Abstract
Despite widespread adoption in practice, guarantees for the LASSO and Group LASSO are strikingly lacking in settings beyond statistical problems, and these algorithms are usually considered to be a heuristic in the context of sparse convex optimization on deterministic inputs. We give the first recovery guarantees for the Group LASSO for sparse convex optimization with vector-valued features. We show that if a sufficiently large Group LASSO regularization is applied when minimizing a strictly convex function, then the minimizer is a sparse vector supported on vector-valued features with the largest norm of the gradient. Thus, repeating this procedure selects the same set of features as the Orthogonal Matching Pursuit algorithm, which admits recovery guarantees for any function with restricted strong convexity and smoothness via weak submodularity arguments. This answers…
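The selection rule described in the abstract (pick the vector-valued feature whose gradient block has the largest norm, then repeat) can be illustrated with a minimal sketch of a group version of Orthogonal Matching Pursuit. This is not the paper's code: the least-squares loss, the function name `group_omp`, the `groups` layout, and the synthetic data are all assumptions chosen for illustration, and the sketch shows only the greedy side of the correspondence, not a Group LASSO solver.

```python
import numpy as np

def group_omp(X, y, groups, k):
    """Greedy group selection for the loss 0.5 * ||X w - y||^2 (assumed loss).

    groups: list of index arrays, one per vector-valued feature (hypothetical layout).
    k: number of groups to select.
    """
    n, d = X.shape
    w = np.zeros(d)
    selected = []
    for _ in range(k):
        # Gradient of the least-squares loss at the current iterate.
        grad = X.T @ (X @ w - y)
        # Score each unselected group by the l2 norm of its gradient block.
        scores = [
            -np.inf if g in selected else np.linalg.norm(grad[idx])
            for g, idx in enumerate(groups)
        ]
        selected.append(int(np.argmax(scores)))
        # Refit on the union of the selected groups (the "orthogonal" step).
        cols = np.concatenate([groups[g] for g in selected])
        w = np.zeros(d)
        w[cols] = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
    return selected, w

# Synthetic, noiseless example: only group 1 carries signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 12))
groups = [np.arange(3 * g, 3 * g + 3) for g in range(4)]
w_true = np.zeros(12)
w_true[groups[1]] = [2.0, -1.0, 1.5]
y = X @ w_true

selected, w_hat = group_omp(X, y, groups, k=1)
```

Scoring by the gradient block norm mirrors the abstract's statement that the heavily regularized minimizer is supported on the features with the largest gradient norm; the refit step is what distinguishes the orthogonal variant from plain matching pursuit.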
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
Methods: Feature Selection
