Effective Proximal Methods for Non-convex Non-smooth Regularized Learning
Guannan Liang, Qianqian Tong, Jiahao Ding, Miao Pan, Jinbo Bi

TL;DR
This paper introduces stochastic proximal gradient methods with arbitrary sampling for non-convex, non-smooth regularized learning, providing convergence analysis and empirical evidence of improved speed over existing methods.
Contribution
It develops a unified analysis framework for stochastic proximal methods with arbitrary sampling, highlighting the benefits of independent sampling schemes.
Findings
Independent sampling tends to improve performance over the commonly used uniform sampling.
Tighter convergence bounds are derived for uniform sampling.
Proposed algorithms outperform state-of-the-art methods in convergence speed.
Abstract
Sparse learning is a very important tool for mining useful information and patterns from high-dimensional data. Non-convex, non-smooth regularized learning problems play an essential role in sparse learning and have drawn extensive attention recently. We design a family of stochastic proximal gradient methods by applying arbitrary sampling to solve the empirical risk minimization problem with a non-convex and non-smooth regularizer. These methods draw mini-batches of training examples according to an arbitrary probability distribution when computing stochastic gradients. A unified analytic approach is developed to examine the convergence and computational complexity of these methods, allowing us to compare the different sampling schemes. We show that the independent sampling scheme tends to improve performance over the commonly used uniform sampling scheme. Our new analysis also derives…
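To make the idea in the abstract concrete, here is a minimal sketch of one stochastic proximal gradient loop with independent sampling. It is not the authors' algorithm: the least-squares loss, the L0 penalty (whose proximal operator is hard thresholding), the fixed step size, and the helper names `prox_l0`, `importance_probs`, `independent_sample`, `stochastic_grad`, and `prox_sgd` are all assumptions chosen for illustration. Each example i enters the mini-batch independently with probability p_i, and its gradient is reweighted by 1/p_i so that the estimate stays unbiased.

```python
import numpy as np

def prox_l0(v, step, lam):
    """Proximal operator of lam * ||x||_0 with step size `step` (hard thresholding)."""
    out = v.copy()
    out[np.abs(v) <= np.sqrt(2.0 * step * lam)] = 0.0
    return out

def importance_probs(A, batch_size):
    """Hypothetical choice: p_i proportional to ||a_i||^2, scaled toward an
    expected batch size of `batch_size`, then clipped to at most 1."""
    row_norms = np.sum(A ** 2, axis=1)
    probs = batch_size * row_norms / row_norms.sum()
    return np.clip(probs, 1e-12, 1.0)

def independent_sample(probs, rng):
    """Include example i in the mini-batch independently with probability probs[i]."""
    return np.flatnonzero(rng.random(probs.shape[0]) < probs)

def stochastic_grad(A, b, x, idx, probs):
    """Unbiased estimate of the gradient of (1 / 2n) * ||A x - b||^2."""
    n = A.shape[0]
    if idx.size == 0:
        return np.zeros_like(x)  # empty batch contributes a zero estimate
    residual = A[idx] @ x - b[idx]
    return A[idx].T @ (residual / probs[idx]) / n

def prox_sgd(A, b, probs, lam=0.01, step=0.05, iters=500, seed=0):
    """Stochastic proximal gradient with an assumed fixed step size."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        idx = independent_sample(probs, rng)
        g = stochastic_grad(A, b, x, idx, probs)
        x = prox_l0(x - step * g, step, lam)  # gradient step, then prox
    return x
```

Setting every p_i to the same value gives independent sampling with uniform probabilities, while `importance_probs` favors examples with larger row norms; swapping between the two is one way to compare sampling schemes on a toy problem.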
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
