Orthant Based Proximal Stochastic Gradient Method for $\ell_1$-Regularized Optimization

Tianyi Chen; Tianyu Ding; Bo Ji; Guanyi Wang; Jing Tian; Yixin Shi; Sheng Yi; Xiao Tu; Zhihui Zhu

arXiv:2004.03639·math.OC·July 24, 2020·5 cites

Open Access · 1 Repo

TL;DR

This paper introduces OBProx-SG, a stochastic optimization method for $\ell_1$-regularized problems that produces sparser solutions than existing methods while retaining convergence guarantees, on both convex and non-convex machine learning tasks.

Contribution

The paper proposes a new orthant-based stochastic gradient method that improves sparsity promotion and convergence guarantees for $\ell_1$-regularized optimization problems.

Findings

01. OBProx-SG converges to global optima in the convex case and to stationary points in the non-convex case.

02. It yields substantially sparser solutions than existing methods such as Prox-SG, RDA and Prox-SVRG.

03. It achieves higher sparsity in deep neural networks without accuracy loss.

Abstract

Sparsity-inducing regularization problems are ubiquitous in machine learning applications, ranging from feature selection to model compression. In this paper, we present a novel stochastic method -- Orthant Based Proximal Stochastic Gradient Method (OBProx-SG) -- to solve perhaps the most popular instance, i.e., the $\ell_1$-regularized problem. The OBProx-SG method contains two steps: (i) a proximal stochastic gradient step to predict a support cover of the solution; and (ii) an orthant step to aggressively enhance the sparsity level via orthant face projection. Compared to state-of-the-art methods, e.g., Prox-SG, RDA and Prox-SVRG, OBProx-SG not only converges to the global optimal solutions (in the convex scenario) or the stationary points (in the non-convex scenario), but also promotes the sparsity of the solutions substantially. Particularly, on a large number of convex problems,…
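
The two steps named in the abstract can be illustrated with a minimal sketch. This is one reading of "proximal stochastic gradient step" and "orthant face projection", not the authors' implementation: the function names, the fixed step size `lr`, the regularization weight `lam`, and the use of plain Python lists are all assumptions made for illustration, and `grad` stands for a stochastic gradient of the smooth part of the objective.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def soft_threshold(v, tau):
    """Proximal operator of tau * |.| applied to a scalar."""
    return sign(v) * max(abs(v) - tau, 0.0)


def prox_sg_step(x, grad, lr, lam):
    """Proximal stochastic gradient step (illustrative).

    Takes a gradient step on the smooth part, then soft-thresholds
    each coordinate by lr * lam.
    """
    return [soft_threshold(xi - lr * gi, lr * lam) for xi, gi in zip(x, grad)]


def orthant_step(x, grad, lr, lam):
    """Orthant step (illustrative, assumed form).

    Coordinates that are zero stay zero. For nonzero coordinates the
    l1 term is differentiable within the current orthant, so a plain
    gradient step is taken; any coordinate whose sign flips is
    projected back onto the orthant face, i.e. set to zero.
    """
    out = []
    for xi, gi in zip(x, grad):
        if xi == 0.0:
            out.append(0.0)
            continue
        s = sign(xi)
        trial = xi - lr * (gi + lam * s)
        out.append(trial if sign(trial) == s else 0.0)
    return out


# Hypothetical iterate and stochastic gradient, chosen by hand.
x = [1.0, -0.05, 0.0]
grad = [0.5, -1.0, 2.0]
after_prox = prox_sg_step(x, grad, lr=0.1, lam=1.0)
after_orthant = orthant_step(x, grad, lr=0.1, lam=1.0)
```

In this sketch the orthant step can only keep or shrink the support of the iterate, while the proximal step is free to move zero coordinates away from zero. That contrast is consistent with the abstract's description of the first step as predicting a support cover and the second as enhancing sparsity.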

Code & Models

Repositories

tianyic/obproxsg (PyTorch, official)

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and ELM

Methods: Feature Selection · Depthwise Convolution · Pointwise Convolution · Average Pooling · Global Average Pooling · Depthwise Separable Convolution · 1x1 Convolution · Batch Normalization · Dense Connections