Distributed Estimation, Information Loss and Exponential Families

Qiang Liu; Alexander Ihler

arXiv:1410.2653 · stat.ML · October 13, 2014 · NeurIPS · 30 cites

TL;DR

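This paper analyzes a communication-efficient distributed learning framework in which local maximum likelihood estimates (MLEs) are combined to approximate the global MLE. It shows that the efficiency loss depends on how far the model family deviates from a full exponential family, and it compares a KL-divergence-based combination method with a linear one.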

Contribution

It shows that the deviation from a full exponential family sets a lower bound on the error rate of arbitrary combinations of local MLEs, and it compares KL-divergence-based and linear combination methods.

Findings

01

The KL-divergence-based combination attains the lower bound on the error rate of combinations of local MLEs.

02

Efficiency loss relative to the global MLE depends on how far the model family deviates from a full exponential family.

03

KL method outperforms linear combination in practical scenarios.

Abstract

Distributed learning of probabilistic models from multiple data repositories with minimum communication is increasingly important. We study a simple communication-efficient learning framework that first calculates the local maximum likelihood estimates (MLE) based on the data subsets, and then combines the local MLEs to achieve the best possible approximation to the global MLE given the whole dataset. We study this framework's statistical properties, showing that the efficiency loss compared to the global setting relates to how much the underlying distribution families deviate from full exponential families, drawing connection to the theory of information loss by Fisher, Rao and Efron. We show that the "full-exponential-family-ness" represents the lower bound of the error rate of arbitrary combinations of local MLEs, and is achieved by a KL-divergence-based combination method but not by…
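
The two-step framework in the abstract (fit local MLEs, then combine them) can be illustrated with a small sketch. This is not the authors' code: it assumes a one-dimensional Gaussian with unknown mean and variance as the model, equal-sized data subsets, and hypothetical helper names (`local_mle`, `linear_combine`, `kl_combine`). For a Gaussian, which is a full exponential family, minimizing the summed KL divergence from the local models reduces to averaging the expected sufficient statistics, so the KL-based combination should reproduce the global MLE here, while plain parameter averaging should not.

```python
import numpy as np

def local_mle(x):
    """Gaussian MLE (mean, variance) computed on one data subset."""
    return x.mean(), x.var()  # np.var uses ddof=0, i.e. the MLE

def linear_combine(estimates):
    """Average the local parameter estimates directly."""
    mus, variances = np.array(estimates).T
    return mus.mean(), variances.mean()

def kl_combine(estimates):
    """Minimize sum_k KL(p(.|theta_k) || p(.|theta)) over Gaussians.

    For a full exponential family this is moment matching: average
    the local values of E[x] and E[x^2], then convert back.
    """
    mus, variances = np.array(estimates).T
    m1 = mus.mean()                    # averaged E[x]
    m2 = (variances + mus**2).mean()   # averaged E[x^2]
    return m1, m2 - m1**2

# Illustrative data: 1000 samples split into 10 equal "repositories".
rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=1000)
subsets = np.split(data, 10)

local = [local_mle(s) for s in subsets]
global_est = local_mle(data)          # MLE on the pooled data
linear_est = linear_combine(local)
kl_est = kl_combine(local)

print("global:", global_est)
print("linear:", linear_est)
print("KL    :", kl_est)
```

In this setting the linear average underestimates the pooled variance, because it discards the spread of the local means, whereas the KL-based combination keeps it. With unequal subset sizes the averages would need to be weighted by subset size, which this sketch omits.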

Taxonomy

Topics: Distributed Sensor Networks and Detection Algorithms · Bayesian Modeling and Causal Inference · Statistical Methods and Inference