Distributed Random Reshuffling over Networks
Kun Huang, Xiao Li, Andre Milzarek, Shi Pu, and Junwen Qiu

TL;DR
This paper introduces a distributed random reshuffling algorithm for networked optimization, achieving fast convergence rates for both convex and nonconvex problems, matching centralized methods and outperforming distributed SGD.
Contribution
The paper proposes the D-RR algorithm, extending random reshuffling to distributed settings with proven convergence rates for convex and nonconvex functions.
Findings
D-RR achieves an $\mathcal{O}(1/T^2)$ convergence rate for strongly convex functions, where $T$ is the number of epochs.
D-RR attains an $\mathcal{O}(1/T^{2/3})$ rate for nonconvex functions.
Numerical experiments confirm the efficiency of D-RR over distributed SGD.
Abstract
In this paper, we consider distributed optimization problems where agents, each possessing a local cost function, collaboratively minimize the average of the local cost functions over a connected network. To solve the problem, we propose a distributed random reshuffling (D-RR) algorithm that invokes the random reshuffling (RR) update in each agent. We show that D-RR inherits favorable characteristics of RR for both smooth strongly convex and smooth nonconvex objective functions. In particular, for smooth strongly convex objective functions, D-RR achieves an $\mathcal{O}(1/T^2)$ rate of convergence (where $T$ counts the epoch number) in terms of the squared distance between the iterate and the global minimizer. When the objective function is assumed to be smooth nonconvex, we show that D-RR drives the squared norm of the gradient to $0$ at a rate of $\mathcal{O}(1/T^{2/3})$. These convergence…
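The idea described in the abstract, each agent running a random reshuffling pass over its own samples while averaging with its neighbors, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the least-squares local costs, the ring network with uniform 1/3 mixing weights, the constant stepsize, the order of the gradient step and the averaging step, and the names `ring_mixing_matrix` and `d_rr`.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 agents.

    Assumption for illustration: each agent puts weight 1/3 on itself
    and on each of its two ring neighbors.
    """
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W


def d_rr(A, b, W, epochs, step, seed=0):
    """Sketch of a distributed random reshuffling loop.

    A: (n, m, d) array, m feature vectors of dimension d per agent.
    b: (n, m) array of targets.
    W: (n, n) doubly stochastic mixing matrix of the network.
    Local sample cost (assumed): 0.5 * (a @ x - b) ** 2.
    Returns an (n, d) array holding one iterate per agent.
    """
    rng = np.random.default_rng(seed)
    n, m, d = A.shape
    agents = np.arange(n)
    X = np.zeros((n, d))  # row i is agent i's current iterate
    for _ in range(epochs):
        # Each agent draws its own permutation of its m local samples.
        perms = np.stack([rng.permutation(m) for _ in range(n)])
        for l in range(m):
            idx = perms[:, l]                # sample used by each agent
            a = A[agents, idx]               # (n, d) selected features
            resid = np.einsum("nd,nd->n", a, X) - b[agents, idx]
            G = resid[:, None] * a           # per-agent sample gradients
            # Local gradient step, then averaging with neighbors.
            X = W @ (X - step * G)
    return X
```

As a usage example, one could generate `A` with `np.random.default_rng(1).normal(size=(5, 20, 3))`, set `b = A @ x_star` for some vector `x_star`, and call `d_rr(A, b, ring_mixing_matrix(5), epochs=100, step=0.05)`; in this noiseless setting every agent's row should end up near `x_star`. Because each epoch visits every local sample exactly once, the sketch differs from distributed SGD only in how sample indices are drawn.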
Taxonomy
Topics: Distributed Control Multi-Agent Systems · Stochastic Gradient Optimization Techniques · Random Matrices and Applications
