Optimization and Analysis of Distributed Averaging with Short Node Memory
Boris N. Oreshkin, Mark J. Coates, Michael G. Rabbat

TL;DR
This paper shows that adding a local prediction component to the update rule of distributed averaging algorithms significantly improves their convergence rate, as demonstrated through theoretical analysis and numerical examples.
Contribution
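It proposes an acceleration technique in which each node's predictor is a linear combination of its two previous values (two memory taps), derives the optimal mixing parameter for combining the predictor with the neighbors' values, and analyzes the resulting convergence-rate improvement for chain and grid topologies.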
Findings
Achieves a factor of n improvement over the one-step algorithm on a chain topology of n nodes.
Attains a factor of n^(1/2) improvement on a two-dimensional grid.
Provides theoretical and numerical validation of the acceleration method.
Abstract
In this paper, we demonstrate, both theoretically and by numerical examples, that adding a local prediction component to the update rule can significantly improve the convergence rate of distributed averaging algorithms. We focus on the case where the local predictor is a linear combination of the node's two previous values (i.e., two memory taps), and our update rule computes a combination of the predictor and the usual weighted linear combination of values received from neighboring nodes. We derive the optimal mixing parameter for combining the predictor with the neighbors' values, and carry out a theoretical analysis of the improvement in convergence rate that can be obtained using this acceleration methodology. For a chain topology on n nodes, this leads to a factor of n improvement over the one-step algorithm, and for a two-dimensional grid, our approach achieves a factor of n^(1/2)…
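The idea in the abstract can be illustrated with a minimal sketch comparing memoryless averaging with a two-tap update on a small chain. Everything specific here is an assumption made for illustration, not the authors' parameterization: the edge weight of 1/3, the momentum-style form `W x(t) + beta * (x(t) - x(t-1))` (one member of the family of updates that mix neighbor averaging with a node's two previous values), the choice of `beta` from the second-largest eigenvalue of the weight matrix, and all function names.

```python
import math

def chain_step(x, w=1.0 / 3.0):
    """One memoryless averaging step on a chain: x <- W x with W = I - w * L."""
    n = len(x)
    out = []
    for i in range(n):
        acc = x[i]
        if i > 0:
            acc += w * (x[i - 1] - x[i])   # pull toward left neighbor
        if i < n - 1:
            acc += w * (x[i + 1] - x[i])   # pull toward right neighbor
        out.append(acc)
    return out

def run_plain(x0, steps):
    """Standard one-step distributed averaging (no node memory)."""
    x = list(x0)
    for _ in range(steps):
        x = chain_step(x)
    return x

def run_two_tap(x0, steps, beta):
    """Averaging plus a local term built from each node's two previous values.

    Assumed illustrative form: x(t+1) = W x(t) + beta * (x(t) - x(t-1)).
    """
    prev = list(x0)  # x(t-1); initialised to x(0) so the network average is preserved
    cur = list(x0)   # x(t)
    for _ in range(steps):
        wx = chain_step(cur)
        nxt = [wx_i + beta * (c - p) for wx_i, c, p in zip(wx, cur, prev)]
        prev, cur = cur, nxt
    return cur

def max_error(x, target):
    """Largest deviation of any node from the target value."""
    return max(abs(v - target) for v in x)

n = 20
steps = 200
x0 = [float(i) for i in range(n)]
average = sum(x0) / n

# Second-largest eigenvalue of W = I - L/3 for a path graph on n nodes
# (path-graph Laplacian eigenvalues are 2 - 2*cos(pi*k/n), k = 0..n-1).
lambda2 = 1.0 - (2.0 - 2.0 * math.cos(math.pi / n)) / 3.0

# Assumed choice of beta: makes the characteristic roots for lambda2 coincide.
beta = (1.0 - math.sqrt(1.0 - lambda2)) ** 2

x_plain = run_plain(x0, steps)
x_two_tap = run_two_tap(x0, steps, beta)
err_plain = max_error(x_plain, average)
err_two_tap = max_error(x_two_tap, average)

print("memoryless max error:", err_plain)
print("two-tap max error:   ", err_two_tap)
```

Each node only needs to store one extra past value, and both variants use the same per-step communication with neighbors; the difference lies entirely in the local combination of stored values.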
