Markov Chain Block Coordinate Descent

Tao Sun; Yuejiao Sun; Yangyang Xu; Wotao Yin

arXiv:1811.08990 · math.OC · November 26, 2018 · 1 citation

Open Access

TL;DR

This paper introduces a block coordinate descent method whose blocks are selected by a Markov chain. It proves convergence for Lipschitz differentiable functions, which may be nonconvex, establishes sublinear and linear rates for convex and strongly convex functions, respectively, and discusses applications in distributed optimization.

Contribution

It proposes a Markov chain block coordinate descent algorithm with a convergence analysis, extending BCD methods to block selection that is neither i.i.d. random nor cyclic.

Findings

01 Proves convergence of Markov chain BCD for Lipschitz differentiable functions, which may be nonconvex.

02 Establishes sublinear and linear convergence rates for convex and strongly convex functions, respectively.

03 Introduces Markov chain inertial BCD and discusses potential applications.

Abstract

The method of block coordinate gradient descent (BCD) has been a powerful method for large-scale optimization. This paper considers the BCD method that successively updates a series of blocks selected according to a Markov chain. This kind of block selection is neither i.i.d. random nor cyclic. On the other hand, it is a natural choice for some applications in distributed optimization and Markov decision processes, where i.i.d. random and cyclic selections are either infeasible or very expensive. By applying mixing-time properties of a Markov chain, we prove convergence of Markov chain BCD for minimizing Lipschitz differentiable functions, which can be nonconvex. For convex and strongly convex functions, we establish sublinear and linear convergence rates, respectively. We also present a method of Markov chain inertial BCD. Finally, we discuss potential applications.
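To make the block selection rule in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It updates one coordinate at a time, and the index of that coordinate follows a Markov chain rather than an i.i.d. or cyclic schedule. Everything specific is an assumption chosen for illustration: the function name `markov_chain_bcd`, the small quadratic objective, the lazy random walk on a five-node ring, and the step size 0.2 (picked below the reciprocal of this objective's coordinate Lipschitz constant, which is 3).

```python
import random


def markov_chain_bcd(grad_i, x0, transition, start, step, iters, seed=0):
    """Coordinate gradient descent with the coordinate index driven by a Markov chain.

    grad_i(x, i) returns the partial derivative of the objective in coordinate i.
    transition[i] is a list of (next_state, probability) pairs for state i.
    """
    rng = random.Random(seed)
    x = list(x0)
    i = start
    for _ in range(iters):
        # Gradient step on the single coordinate the chain currently visits.
        x[i] -= step * grad_i(x, i)
        # Draw the next coordinate from row i of the transition matrix.
        states, probs = zip(*transition[i])
        i = rng.choices(states, weights=probs, k=1)[0]
    return x


# Illustrative (assumed) problem: a strongly convex quadratic on a ring of 5 nodes.
n = 5
c = [1.0, -2.0, 0.5, 3.0, -1.0]


def f(x):
    fit = sum((x[i] - c[i]) ** 2 for i in range(n))
    smooth = sum((x[i] - x[(i + 1) % n]) ** 2 for i in range(n))
    return 0.5 * fit + 0.5 * smooth


def grad_i(x, i):
    # Partial derivative of f with respect to x[i].
    return 3.0 * x[i] - c[i] - x[(i - 1) % n] - x[(i + 1) % n]


# Assumed chain: stay, step left, or step right on the ring, each with probability 1/3.
transition = {
    i: [(i, 1 / 3), ((i - 1) % n, 1 / 3), ((i + 1) % n, 1 / 3)] for i in range(n)
}

x0 = [0.0] * n
x_final = markov_chain_bcd(grad_i, x0, transition, start=0, step=0.2, iters=5000)
```

The ring walk mimics a token passed between neighboring nodes in a network, which is the kind of distributed setting the abstract mentions. Only the transition table needs to change to try a different chain.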

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods