Stochastic Primal-Dual Method for Empirical Risk Minimization with $\mathcal{O}(1)$ Per-Iteration Complexity

Conghui Tan; Tong Zhang; Shiqian Ma; Ji Liu

arXiv:1811.01182 · math.OC · November 6, 2018 · 1 citation

TL;DR

This paper introduces a stochastic primal-dual algorithm for regularized empirical risk minimization with linear predictors that needs only $\mathcal{O}(1)$ operations per iteration, plus a variance-reduced variant that converges linearly. In the authors' numerical experiments, the methods run faster than proximal SGD, SVRG, and SAGA on high-dimensional problems.

Contribution

The paper presents a new stochastic primal-dual method for regularized empirical risk minimization whose per-iteration cost is $\mathcal{O}(1)$, together with a variance-reduced version that converges linearly.
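For orientation, the problem class and the saddle-point form that primal-dual methods typically work with can be written as below. This is the standard Fenchel-conjugate reformulation, not a formula quoted from this page; the symbols $a_i$ (data vectors), $\phi_i$ (convex losses), $\phi_i^*$ (their convex conjugates), and $g$ (regularizer) are notation assumed here for illustration.

```latex
% Regularized ERM with a linear predictor (assumed notation):
\min_{x \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \phi_i\!\left(a_i^\top x\right) + g(x)

% Equivalent saddle-point form when each \phi_i is closed and convex,
% using \phi_i(z) = \max_{y_i} \{ y_i z - \phi_i^*(y_i) \}:
\min_{x \in \mathbb{R}^d} \; \max_{y \in \mathbb{R}^n} \;
\frac{1}{n} \sum_{i=1}^{n} \left( y_i \, a_i^\top x - \phi_i^*(y_i) \right) + g(x)
```

In this form the coupling between $x$ and $y$ is a double sum over the entries $a_{ij}$, which is what makes sampling a single entry per iteration conceivable.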

Findings

01. Numerical experiments suggest the proposed methods are faster than proximal SGD, SVRG, and SAGA on high-dimensional problems.

02. The proposed algorithms require only $\mathcal{O}(1)$ operations per iteration.

03. The variance-reduction variant achieves linear convergence.

Abstract

The regularized empirical risk minimization problem with a linear predictor appears frequently in machine learning. In this paper, we propose a new stochastic primal-dual method to solve this class of problems. Unlike existing methods, our proposed methods require only $\mathcal{O}(1)$ operations in each iteration. We also develop a variance-reduction variant of the algorithm that converges linearly. Numerical experiments suggest that our methods are faster than existing ones such as proximal SGD, SVRG, and SAGA on high-dimensional problems.
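To make the $\mathcal{O}(1)$ per-iteration idea concrete, here is a minimal sketch for ridge regression. It is not the authors' algorithm: it is a plain stochastic gradient descent-ascent step on the saddle-point form of ridge regression, chosen only because each iteration reads a single matrix entry and updates one primal and one dual coordinate. The function names, step sizes, and synthetic data are all assumptions made for this illustration.

```python
import numpy as np

def ridge_objective(A, b, x, lam):
    """Primal ridge objective: (1/2n)||Ax - b||^2 + (lam/2)||x||^2."""
    r = A @ x - b
    return 0.5 * np.mean(r ** 2) + 0.5 * lam * np.dot(x, x)

def sgda_single_entry(A, b, lam, eta, sigma, num_iters, seed=0):
    """Illustrative stochastic descent-ascent on the ridge saddle point

        min_x max_y (1/n) sum_i [y_i a_i^T x - y_i^2/2 - b_i y_i] + (lam/2)||x||^2.

    Each iteration samples one entry a_ij and touches only x[j] and y[i],
    so the per-iteration cost does not depend on n or d.
    (Hypothetical helper, not the update rule from the paper.)
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    y = np.zeros(n)
    rows = rng.integers(0, n, size=num_iters)
    cols = rng.integers(0, d, size=num_iters)
    for i, j in zip(rows, cols):
        a_ij = A[i, j]
        # d * a_ij * x_j is an unbiased estimate of a_i^T x (j uniform),
        # so gy estimates n times the partial derivative w.r.t. y_i.
        gy = d * a_ij * x[j] - y[i] - b[i]
        # a_ij * y_i is an unbiased estimate of (1/n) sum_i a_ij y_i (i uniform),
        # so gx estimates the partial derivative w.r.t. x_j.
        gx = a_ij * y[i] + lam * x[j]
        y[i] += sigma * gy   # ascent step on one dual coordinate
        x[j] -= eta * gx     # descent step on one primal coordinate
    return x, y

# Small synthetic problem (assumed data, for demonstration only).
data_rng = np.random.default_rng(1)
n, d, lam = 30, 5, 0.05
A = data_rng.normal(size=(n, d))
x_true = np.array([1.0, -2.0, 0.5, 1.5, -1.0])
b = A @ x_true + 0.1 * data_rng.normal(size=n)

x_hat, y_hat = sgda_single_entry(A, b, lam, eta=0.002, sigma=0.002,
                                 num_iters=400_000)
initial_obj = ridge_objective(A, b, np.zeros(d), lam)
final_obj = ridge_objective(A, b, x_hat, lam)
print("objective at x = 0:", initial_obj)
print("objective after sketch:", final_obj)
```

Because this sketch has no variance reduction and uses constant step sizes, it should only be expected to settle in a noisy neighborhood of the solution; the linear convergence claimed in the abstract belongs to the paper's variance-reduced variant, which is not reproduced here.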

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs

Methods: SAGA · Stochastic Gradient Descent