On Communication Cost of Distributed Statistical Estimation and Dimensionality

Ankit Garg; Tengyu Ma; Huy L. Nguyen

arXiv:1405.1665·cs.LG·November 11, 2014·50 cites

TL;DR

This paper investigates the relationship between dimensionality and communication cost in distributed Gaussian mean estimation, establishing lower bounds and proposing protocols that optimize communication efficiency.

Contribution

It provides new lower bounds on communication costs for distributed estimation and introduces protocols that approach these bounds, especially for structured sparse parameters.

Findings

01

Communication cost scales linearly with the number of dimensions.

02

New lower bounds of Ω(md/log(m)) and Ω(md) bits for the interactive and simultaneous settings, respectively.

03

An interactive protocol achieving minimax squared loss with O(md) bits.

Abstract

We explore the connection between dimensionality and communication cost in distributed learning problems. Specifically, we study the problem of estimating the mean $\theta$ of an unknown $d$-dimensional Gaussian distribution in the distributed setting. In this problem, the samples from the unknown distribution are distributed among $m$ different machines. The goal is to estimate the mean $\theta$ at the optimal minimax rate while communicating as few bits as possible. We show that in this setting, the communication cost scales linearly in the number of dimensions, i.e., one needs to deal with different dimensions individually. Applying this result to previous lower bounds for one dimension in the interactive setting \cite{ZDJW13} and to our improved bounds for the simultaneous setting, we prove new lower bounds of $\Omega(md/\log(m))$ and $\Omega(md)$ for the bits of…
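To make the setting concrete, here is a minimal sketch of a naive simultaneous protocol, not the protocol from the paper. It assumes identity covariance, an equal number of samples per machine, and a mean known to lie in a bounded range `[lo, hi]`. The function names and the fixed-width uniform quantizer are hypothetical choices made for illustration. Each machine sends its quantized local mean in a single message, so the total cost is `m * d * bits`, which shows where the linear dependence on both `m` and `d` comes from.

```python
import random


def quantize(value, lo, hi, bits):
    """Map a real value in [lo, hi] to an integer code using `bits` bits."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize(code, lo, hi, bits):
    """Invert quantize() up to rounding error."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)


def local_message(samples, lo, hi, bits):
    """One machine: compute the local sample mean and quantize each coordinate."""
    n = len(samples)
    d = len(samples[0])
    means = [sum(x[j] for x in samples) / n for j in range(d)]
    return [quantize(v, lo, hi, bits) for v in means]


def simultaneous_estimate(machines, lo=-1.0, hi=1.0, bits=16):
    """Center: average the decoded local means (assumes equal sample counts).

    Returns the estimate and the total number of bits communicated.
    """
    messages = [local_message(s, lo, hi, bits) for s in machines]
    m = len(messages)
    d = len(messages[0])
    total_bits = m * d * bits  # one fixed-width code per machine per coordinate
    estimate = [
        sum(dequantize(msg[j], lo, hi, bits) for msg in messages) / m
        for j in range(d)
    ]
    return estimate, total_bits


def make_machines(theta, m, n, seed=0):
    """Hypothetical data generator: m machines, n samples each from N(theta, I)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(t, 1.0) for t in theta] for _ in range(n)]
        for _ in range(m)
    ]


if __name__ == "__main__":
    theta = [0.3, -0.2, 0.0, 0.5]
    machines = make_machines(theta, m=8, n=200, seed=1)
    estimate, total_bits = simultaneous_estimate(machines)
    print("estimate:", [round(v, 3) for v in estimate])
    print("bits communicated:", total_bits)
```

With a fixed word size per coordinate, this baseline spends a number of bits proportional to `m * d`. The lower bounds quoted in the abstract indicate that, up to the stated factors, no protocol achieving the minimax rate can avoid this linear dependence on the dimension.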

Taxonomy

Topics: Distributed Sensor Networks and Detection Algorithms · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques