Data Dependent Convergence for Distributed Stochastic Optimization
Avleen S. Bijral

TL;DR
This dissertation analyzes how the convergence rate of distributed stochastic gradient descent (SGD) depends on the spectral properties of the data covariance, so that questions about speedup can be related to the data distribution rather than to regularity properties of the objective.
Contribution
It introduces a data-dependent convergence analysis for distributed SGD based on spectral properties, improving understanding of when distributed methods are most effective.
Findings
The convergence rate of distributed SGD depends on the spectral norm of the sample covariance matrix.
Sparse datasets with low spectral norm benefit more from distributed SGD.
Adding more machines can improve convergence for certain loss functions.
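The findings above turn on one quantity, the spectral norm of the sample covariance matrix, which is cheap to estimate before committing to a distributed run. The following is a minimal sketch, not code from the dissertation. It assumes the data is an n-by-d NumPy array `X` with one example per row and uses the uncentered matrix (1/n) XᵀX. The function name, the iteration count, and the choice of power iteration are illustrative assumptions.

```python
import numpy as np


def sample_covariance_spectral_norm(X, num_iters=200, seed=0):
    """Estimate the largest eigenvalue of (1/n) X^T X by power iteration.

    Assumption: X has shape (n, d), one example per row, and the
    uncentered second-moment matrix is the quantity of interest.
    The d-by-d matrix is never formed explicitly.
    """
    n, d = X.shape
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        # Apply (1/n) X^T X to v using two matrix-vector products.
        w = X.T @ (X @ v) / n
        norm_w = np.linalg.norm(w)
        if norm_w == 0.0:
            # X maps v to zero (e.g. X is all zeros); the norm is 0.
            return 0.0
        v = w / norm_w
    # Rayleigh quotient of the final unit vector.
    return float(v @ (X.T @ (X @ v)) / n)
```

For small d the result can be checked against `np.linalg.norm(X.T @ X / X.shape[0], 2)`. The matrix-free form is used here because it only needs products with `X`, which stays practical when `X` is large.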
Abstract
In this dissertation we propose an alternative analysis of distributed stochastic gradient descent (SGD) algorithms that rely on spectral properties of the data covariance. As a consequence, we can relate questions pertaining to speedups and convergence rates for distributed SGD to the data distribution instead of the regularity properties of the objective functions. More precisely, we show that this rate depends on the spectral norm of the sample covariance matrix. An estimate of this norm can give practitioners guidance on the potential gain in algorithm performance. For example, many sparse datasets with low spectral norm prove to be amenable to gains in distributed settings. Towards establishing this data dependence, we first study a distributed consensus-based SGD algorithm and show that the rate of convergence involves the spectral norm of the sample covariance matrix when the…
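The abstract mentions a distributed consensus-based SGD algorithm but, as excerpted here, does not give its update rule. The sketch below shows one common generic form of consensus SGD for a least-squares objective, offered only to make the idea concrete. It should not be read as the dissertation's algorithm. The mixing matrix `P`, the 1/sqrt(t) step size, the squared loss, and all function and parameter names are assumptions.

```python
import numpy as np


def consensus_sgd(X_parts, y_parts, P, steps=500, lr=0.05, seed=0):
    """Generic consensus SGD sketch for least squares (illustrative only).

    X_parts[i], y_parts[i]: local data held by node i.
    P: (m, m) doubly stochastic mixing matrix (assumed), where
       P[i, k] > 0 only if nodes i and k communicate.
    Returns an (m, d) array whose row i is node i's final iterate.
    """
    rng = np.random.default_rng(seed)
    m = len(X_parts)
    d = X_parts[0].shape[1]
    W = np.zeros((m, d))
    for t in range(1, steps + 1):
        # Consensus step: each node averages its neighbors' iterates.
        W = P @ W
        eta = lr / np.sqrt(t)  # assumed decaying step size
        for i in range(m):
            # Local step: one stochastic gradient of 0.5 * (x.w - y)^2.
            j = rng.integers(X_parts[i].shape[0])
            x, y = X_parts[i][j], y_parts[i][j]
            W[i] -= eta * (x @ W[i] - y) * x
    return W
```

With `P = np.full((m, m), 1.0 / m)` every node averages with every other node each round, which corresponds to a fully connected network. A sparser doubly stochastic `P` would model a restricted communication graph.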
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems · Sparse and Compressive Sensing Techniques
Methods: Stochastic Gradient Descent
