On Communication Cost of Distributed Statistical Estimation and Dimensionality
Ankit Garg, Tengyu Ma, Huy L. Nguyen

TL;DR
This paper investigates the relationship between dimensionality and communication cost in distributed Gaussian mean estimation, establishing lower bounds and proposing protocols that optimize communication efficiency.
Contribution
It provides new lower bounds on communication costs for distributed estimation and introduces protocols that approach these bounds, especially for structured sparse parameters.
Findings
Communication cost scales linearly with the number of dimensions d.
New lower bounds of Ω(md/log(m)) and Ω(md) bits for the interactive and simultaneous settings, respectively.
An interactive protocol achieving minimax squared loss with O(md) bits.
Abstract
We explore the connection between dimensionality and communication cost in distributed learning problems. Specifically, we study the problem of estimating the mean of an unknown d-dimensional Gaussian distribution in the distributed setting. In this problem, the samples from the unknown distribution are distributed among m different machines. The goal is to estimate the mean at the optimal minimax rate while communicating as few bits as possible. We show that in this setting, the communication cost scales linearly in the number of dimensions, i.e., one needs to deal with different dimensions individually. Applying this result to previous lower bounds for one dimension in the interactive setting \cite{ZDJW13} and to our improved bounds for the simultaneous setting, we prove new lower bounds of Ω(md/log(m)) and Ω(md) for the bits of…
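To make the setting in the abstract concrete, here is a minimal simulation sketch. It is a naive simultaneous baseline written for illustration, not the protocol from the paper: each of m machines draws n samples from a d-dimensional Gaussian, quantizes its local mean coordinate by coordinate, and sends it to a coordinator that averages the messages. The function name `simulate_naive_protocol` and the parameters `n`, `sigma`, `bits_per_coord`, and `clip` are assumptions introduced here, not taken from the source.

```python
import random


def simulate_naive_protocol(theta, m, n, sigma=1.0, bits_per_coord=16,
                            clip=8.0, seed=0):
    """Naive baseline (hypothetical, not the paper's protocol).

    Each of m machines holds n samples from N(theta, sigma^2 * I_d),
    sends its local mean quantized to bits_per_coord bits per coordinate,
    and the coordinator averages the m quantized means.
    """
    rng = random.Random(seed)
    d = len(theta)
    levels = 2 ** bits_per_coord - 1      # largest quantization index
    step = 2 * clip / levels              # grid spacing on [-clip, clip]
    total = [0.0] * d
    for _ in range(m):
        for j in range(d):
            # Local sample mean of coordinate j on this machine.
            local_mean = sum(rng.gauss(theta[j], sigma) for _ in range(n)) / n
            # Clip to the quantizer range, then round to the nearest grid point.
            clipped = max(-clip, min(clip, local_mean))
            index = round((clipped + clip) / step)   # integer in [0, levels]
            total[j] += index * step - clip          # value the coordinator decodes
    estimate = [t / m for t in total]
    bits_sent = m * d * bits_per_coord    # every machine sends d quantized numbers
    squared_loss = sum((e - t) ** 2 for e, t in zip(estimate, theta))
    return estimate, bits_sent, squared_loss


if __name__ == "__main__":
    est, bits, loss = simulate_naive_protocol([0.5, -1.0, 2.0], m=10, n=100)
    print("estimate:", est)
    print("bits sent:", bits)
    print("squared loss:", loss)
```

In this baseline the number of bits sent is m · d · bits_per_coord, so it grows linearly in both the number of machines and the dimension. The lower bounds summarized above say that, up to the stated factors, no protocol achieving the minimax squared loss can avoid this linear dependence on md.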
Taxonomy
Topics: Distributed Sensor Networks and Detection Algorithms · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques
