Distributed Normal Map-based Stochastic Proximal Gradient Methods over Networks
Kun Huang, Shi Pu, Angelia Nedić

TL;DR
This paper develops distributed stochastic proximal gradient algorithms for networked agents to optimize composite functions, achieving near-centralized convergence rates with state-of-the-art transient times.
Contribution
It introduces two novel algorithms, norM-DSGT and norM-ED, with proven convergence rates matching centralized methods under general variance conditions.
Findings
Both algorithms asymptotically match centralized stochastic proximal gradient rates.
Transient time for norM-ED is O(n^3/(1-λ)^2), matching non-proximal ED.
Transient time for norM-DSGT is O(max(n^3/(1-λ)^2, n/(1-λ)^3)), matching non-proximal DSGT.
Abstract
Consider agents connected over a network collaborating to minimize the average of their local cost functions combined with a common nonsmooth function. This paper introduces a unified algorithmic framework for solving such a problem through distributed stochastic proximal gradient methods, leveraging the normal map update scheme. Within this framework, we propose two new algorithms, termed Normal Map-based Distributed Stochastic Gradient Tracking (norM-DSGT) and Normal Map-based Exact Diffusion (norM-ED). We demonstrate that both methods can asymptotically achieve comparable convergence rates to the centralized stochastic proximal gradient descent method under a general variance condition on the stochastic gradients. Additionally, the number of iterations required for norM-ED to achieve such a rate (i.e., the transient time) behaves as O(n^3/(1-λ)^2) for…
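The abstract refers to a "normal map update scheme" without spelling it out. As a rough illustration only, the sketch below evaluates the normal map as it is commonly defined in the optimization literature, F(z) = ∇f(prox(z)) + (z - prox(z))/γ, on a toy single-agent problem. The toy objective f(x) = 0.5‖x - b‖², the ℓ1 regularizer, and every name in the code (`soft_threshold`, `grad_f`, `normal_map`, `b`, `mu`, `gamma`) are assumptions made for this example; this is not the paper's norM-DSGT or norM-ED update, which the text above does not specify.

```python
# Sketch only: the commonly used normal map for phi(x) = mu * ||x||_1 with a
# toy smooth part f(x) = 0.5 * ||x - b||^2. All names are illustrative and
# are not taken from the paper.

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1, applied coordinate-wise."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0) for v in z]

def grad_f(x, b):
    """Gradient of the toy smooth term 0.5 * ||x - b||^2."""
    return [xi - bi for xi, bi in zip(x, b)]

def normal_map(z, b, mu, gamma):
    """F(z) = grad f(prox(z)) + (z - prox(z)) / gamma."""
    x = soft_threshold(z, gamma * mu)
    g = grad_f(x, b)
    return [gi + (zi - xi) / gamma for gi, zi, xi in zip(g, z, x)]

b, mu, gamma = [3.0, -0.5, 1.0], 1.0, 0.5

# For this toy problem the minimizer has a closed form: soft-threshold b by mu.
x_star = soft_threshold(b, mu)

# The matching auxiliary point is z* = x* - gamma * grad f(x*).
z_star = [xi - gamma * gi for xi, gi in zip(x_star, grad_f(x_star, b))]

# The normal map vanishes at z*, which is what makes it usable as a
# stationarity measure for the composite problem.
residual = normal_map(z_star, b, mu, gamma)
```

In this toy setting the root of the normal map corresponds to the minimizer of the composite objective, which is the property that normal map-based methods exploit when they iterate on the auxiliary variable z rather than on x directly.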
