Convergence of Random Reshuffling Under The Kurdyka-Łojasiewicz Inequality

Xiao Li; Andre Milzarek; and Junwen Qiu

arXiv:2110.04926·math.OC·January 26, 2023·1 cites

Open Access

TL;DR

This paper proves that, for smooth nonconvex finite-sum problems, the iterates of the random reshuffling method with suitable diminishing step sizes converge almost surely to a single stationary point under the Kurdyka-Łojasiewicz (KL) inequality, and it derives convergence rates that depend on the KL exponent.

Contribution

It introduces a novel convergence analysis framework for the non-descent RR method with diminishing step sizes under the KL inequality, extending existing theories.

Findings

01

Almost sure convergence of the whole iterate sequence to a single stationary point is guaranteed under the KL inequality.

02

Convergence rates depend on the KL exponent and the chosen diminishing step sizes; for exponents in $[0, \frac{1}{2}]$ the rate is $O(t^{-1})$.

03

The analysis framework applies to reshuffled proximal point methods.
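
For orientation, the KL exponent referred to in the findings can be read through the Łojasiewicz form of the inequality. The statement below is the standard textbook version, not a quotation from the paper; the symbols $C$, $\theta$, $U$, and $x^*$ are notation assumed here, and the paper's exact formulation may differ. A differentiable function $f$ satisfies the inequality with exponent $\theta \in [0, 1)$ at a stationary point $x^*$ if there exist a constant $C > 0$ and a neighborhood $U$ of $x^*$ such that

$$
|f(x) - f(x^*)|^{\theta} \le C \, \|\nabla f(x)\| \quad \text{for all } x \in U .
$$

Under this convention a smaller $\theta$ describes a sharper function near $x^*$, which is consistent with the faster rate reported for exponents in $[0, \frac{1}{2}]$.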

Abstract

We study the random reshuffling (RR) method for smooth nonconvex optimization problems with a finite-sum structure. Though this method is widely utilized in practice, such as in the training of neural networks, its convergence behavior is only understood in several limited settings. In this paper, under the well-known Kurdyka-Łojasiewicz (KL) inequality, we establish strong limit-point convergence results for RR with appropriate diminishing step sizes, namely, the whole sequence of iterates generated by RR is convergent and converges to a single stationary point in an almost sure sense. In addition, we derive the corresponding rate of convergence, depending on the KL exponent and the suitably selected diminishing step sizes. When the KL exponent lies in $[0, \frac{1}{2}]$, the convergence is at a rate of $O(t^{-1})$ with $t$ counting the iteration number. When the KL exponent belongs…
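
The RR scheme described in the abstract can be illustrated with a minimal sketch. It assumes the objective is $f(x) = \frac{1}{n} \sum_i f_i(x)$ and that each epoch visits every component gradient exactly once in a freshly shuffled order with a single step size per epoch. The function name `random_reshuffling`, the toy components $f_i(x) = \frac{1}{2}(x - b_i)^2$, and the schedule $\alpha_t = 0.5 / t$ are illustrative assumptions, not the paper's setup or its prescribed step sizes.

```python
import random


def random_reshuffling(grads, x0, epochs, step_size, seed=0):
    """Sketch of random reshuffling (RR) for f(x) = (1/n) * sum_i f_i(x).

    grads     : list of callables; grads[i](x) returns the gradient of f_i at x
                as a list of floats (hypothetical interface for this sketch)
    x0        : starting point, a list of floats
    step_size : callable t -> alpha_t, used for every inner step of epoch t
    """
    rng = random.Random(seed)
    n = len(grads)
    x = list(x0)
    for t in range(1, epochs + 1):
        alpha = step_size(t)
        order = list(range(n))
        rng.shuffle(order)  # new permutation each epoch: sampling without replacement
        for i in order:
            g = grads[i](x)
            x = [xj - alpha * gj for xj, gj in zip(x, g)]
    return x


# Toy problem (assumed for illustration): f_i(x) = 0.5 * (x - b_i)^2,
# whose average is minimized at the mean of the targets, here 3.0.
targets = [1.0, 2.0, 3.0, 6.0]
grads = [lambda x, b=b: [x[0] - b] for b in targets]

# Diminishing step sizes alpha_t = 0.5 / t (an illustrative choice).
x_final = random_reshuffling(grads, [0.0], epochs=2000, step_size=lambda t: 0.5 / t)
print(x_final)  # a one-element list whose entry is close to 3.0
```

Because the step size shrinks across epochs, the bias introduced by the visiting order within an epoch shrinks with it, which is why the final iterate settles near the minimizer rather than oscillating around it.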

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research