Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling

Zeyuan Allen-Zhu; Zheng Qu; Peter Richtárik; Yang Yuan

arXiv:1512.09103·math.OC·May 30, 2016·22 cites

Open Access

TL;DR

This paper introduces a novel non-uniform sampling method for accelerated coordinate descent, improving its best known running time by a factor of up to $\sqrt{n}$ on large-scale optimization tasks such as empirical risk minimization and solving linear systems.

Contribution

It presents a new non-uniform sampling scheme that selects each coordinate with probability proportional to the square root of its smoothness parameter, achieving faster convergence rates for accelerated coordinate descent algorithms.

Findings

01 Achieves up to a $\sqrt{n}$ speed-up in running time over the best previously known accelerated coordinate descent.

02 Effective in empirical risk minimization and linear system solving.

03 Validates improvements through theoretical analysis and practical experiments.

Abstract

Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems. Up to a primal-dual transformation, it is also the same as accelerated stochastic gradient descent, which is one of the central methods used in machine learning. In this paper, we improve the best known running time of accelerated coordinate descent by a factor up to $\sqrt{n}$. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.
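The sampling rule described in the abstract can be illustrated with a minimal sketch. It covers only the sampling step, not the accelerated method itself. The function names, the example smoothness values, and the fixed seed are assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def sqrt_smoothness_probabilities(smoothness):
    """Return p with p[i] proportional to sqrt(smoothness[i]).

    `smoothness` holds hypothetical coordinate-wise smoothness
    parameters L_1, ..., L_n (non-negative, not all zero).
    """
    L = np.asarray(smoothness, dtype=float)
    if L.ndim != 1 or L.size == 0:
        raise ValueError("expected a non-empty 1-D array of smoothness values")
    if np.any(L < 0) or not np.any(L > 0):
        raise ValueError("smoothness values must be non-negative and not all zero")
    weights = np.sqrt(L)
    return weights / weights.sum()


def sample_coordinates(smoothness, size, seed=0):
    """Draw `size` coordinate indices i.i.d. from the distribution above."""
    rng = np.random.default_rng(seed)
    p = sqrt_smoothness_probabilities(smoothness)
    return rng.choice(p.size, size=size, p=p)


# Illustrative values only: sqrt(L) = [1, 2, 3, 4], which sums to 10.
example_L = [1.0, 4.0, 9.0, 16.0]
probs = sqrt_smoothness_probabilities(example_L)
indices = sample_coordinates(example_L, size=1000)
```

For the illustrative values above, the probabilities are 0.1, 0.2, 0.3, and 0.4. Sampling proportionally to the smoothness parameters themselves would instead give 1/30, 4/30, 9/30, and 16/30, so the square-root rule is less skewed toward the least smooth coordinates while still favoring them over uniform sampling.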

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Face and Expression Recognition