Optimizing Stochastic Gradient Push under Broadcast Communications

Tuan Nguyen; Ting He

arXiv:2604.15549·cs.LG·April 20, 2026


TL;DR

This paper develops an optimized mixing matrix design for stochastic gradient push in decentralized federated learning over wireless networks, reducing convergence time by leveraging directed communication graphs.

Contribution

The paper introduces a mixing matrix design algorithm for stochastic gradient push (SGP) that exploits asymmetric mixing matrices and directed communication graphs to reduce convergence time in decentralized federated learning (DFL).

Findings

01. The proposed design reduces convergence time significantly.

02. It allows asymmetric mixing matrices for directed graphs.

03. It achieves better performance without sacrificing model quality.

Abstract

We consider the problem of minimizing the convergence time for decentralized federated learning (DFL) in wireless networks under broadcast communications, with focus on mixing matrix design. The mixing matrix is a critical hyperparameter for DFL that simultaneously controls the convergence rate across iterations and the communication demand per iteration, both strongly influencing the convergence time. Although the problem has been studied previously, existing solutions are mostly designed for decentralized parallel stochastic gradient descent (D-PSGD), which requires the mixing matrix to be symmetric and doubly stochastic. These constraints confine the activated communication graph to undirected (i.e., bidirected) graphs, which limits design flexibility. In contrast, we consider mixing matrix design for stochastic gradient push (SGP), which allows asymmetric mixing matrices and hence…
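To make the contrast with D-PSGD concrete, the sketch below shows the push-sum averaging step that SGP builds on, run over a directed graph with a mixing matrix that is column-stochastic but neither symmetric nor doubly stochastic. It is an illustration under stated assumptions, not the paper's design algorithm: the helper names, the four-node topology, and the uniform weight split are all hypothetical, and the local gradient step that SGP interleaves with mixing is omitted.

```python
import numpy as np


def column_stochastic_from_digraph(adj):
    """Build a column-stochastic mixing matrix from a directed graph.

    Assumed convention: adj[i][j] = 1 means node j sends to node i.
    Self-loops are added, and each sender splits its mass uniformly over
    itself and its out-neighbors (a hypothetical weighting, chosen only
    for simplicity).
    """
    a = np.array(adj, dtype=float)
    np.fill_diagonal(a, 1.0)
    return a / a.sum(axis=0, keepdims=True)


def push_sum_average(p, x0, iters=200):
    """Run push-sum: mix values and weights with the same matrix.

    The ratio x / w corrects the bias that a non-doubly-stochastic
    matrix would otherwise introduce, so every node's estimate
    approaches the average of x0 on a strongly connected graph.
    """
    x = np.array(x0, dtype=float)
    w = np.ones_like(x)
    for _ in range(iters):
        x = p @ x  # each node sums the value shares sent to it
        w = p @ w  # the push-sum weights are mixed the same way
    return x / w


# Directed ring 0 -> 1 -> 2 -> 3 -> 0 plus one extra one-way link 0 -> 2.
adj = [
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
p = column_stochastic_from_digraph(adj)
estimates = push_sum_average(p, [1.0, 2.0, 3.0, 4.0])
```

Here the columns of `p` sum to one while its rows do not, so the matrix would not satisfy the D-PSGD constraints, yet the de-biased ratio `x / w` still drives all nodes toward the common average. In SGP proper, each node would also apply a local stochastic gradient step before mixing.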
