An Empirical Study on Compressed Decentralized Stochastic Gradient Algorithms with Overparameterized Models

Arjun Ashok Rao; Hoi-To Wai

arXiv:2110.04523·math.OC·October 12, 2021

TL;DR

This paper empirically investigates how compressed decentralized stochastic gradient algorithms perform with overparameterized neural networks, revealing robustness in convergence rates and highlighting gaps between theory and practice.

Contribution

The paper presents an empirical analysis of compressed decentralized stochastic gradient (DSG) algorithms with overparameterized NNs, showing that the convergence rates of popular compressed DSG algorithms are robust to NN size.

Findings

01

Convergence rates are robust to neural network size.

02

The results suggest a gap between theory and practice for compressed DSG algorithms in the existing literature.

03

These observations come from simulations of popular compressed DSG algorithms in an MPI network environment.

Abstract

This paper considers decentralized optimization with application to machine learning on graphs. The growing size of neural network (NN) models has motivated prior works on decentralized stochastic gradient algorithms to incorporate communication compression. On the other hand, recent works have demonstrated the favorable convergence and generalization properties of overparameterized NNs. In this work, we present an empirical analysis of the performance of compressed decentralized stochastic gradient (DSG) algorithms with overparameterized NNs. Through simulations in an MPI network environment, we observe that the convergence rates of popular compressed DSG algorithms are robust to the size of NNs. Our findings suggest a gap between theory and practice for the compressed DSG algorithms in the existing literature.
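
To make "compressed DSG" concrete, here is a minimal sketch of one communication round, loosely modeled on the difference-compression idea used in algorithms such as CHOCO-SGD. The abstract does not say which algorithms or compression operators the paper evaluates, so everything here is an illustrative assumption: the top-k compressor, the function names `top_k` and `compressed_dsg_step`, and the parameters `step_size`, `gossip_size`, and `k`. Each agent takes a local gradient step, sends only a compressed difference between its model and the copy its neighbors hold, and then averages over those shared copies.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    if k <= 0:
        return [0.0] * len(vec)
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def compressed_dsg_step(params, public, grads, mixing, step_size, gossip_size, k):
    """One round of a difference-compressed decentralized SGD update (sketch).

    params[i]  : local model of agent i (list of floats)
    public[i]  : copy of agent i's model that its neighbors currently hold
    grads[i]   : stochastic gradient computed by agent i
    mixing[i][j]: weight agent i puts on agent j (0 if they are not neighbors)
    All names and hyperparameters are assumptions made for illustration.
    """
    n = len(params)

    # 1. Local stochastic gradient step at every agent.
    half = [
        [x - step_size * g for x, g in zip(params[i], grads[i])]
        for i in range(n)
    ]

    # 2. Each agent transmits only a compressed difference; every holder of
    #    its public copy applies the same sparse update.
    new_public = []
    for i in range(n):
        diff = [h - p for h, p in zip(half[i], public[i])]
        sparse = top_k(diff, k)
        new_public.append([p + d for p, d in zip(public[i], sparse)])

    # 3. Gossip correction computed from the public copies.
    new_params = []
    for i in range(n):
        dim = len(half[i])
        corr = [
            sum(mixing[i][j] * (new_public[j][t] - new_public[i][t]) for j in range(n))
            for t in range(dim)
        ]
        new_params.append([h + gossip_size * c for h, c in zip(half[i], corr)])

    return new_params, new_public
```

With `k` equal to the model dimension the compressor is the identity and the step reduces to ordinary gossip averaging after a gradient step. Smaller `k` reduces the number of entries sent per round, which is the kind of communication saving that motivates compression for large NNs.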

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Distributed Control Multi-Agent Systems