Communication-Efficient Algorithms for Statistical Optimization

Yuchen Zhang; John C. Duchi; Martin Wainwright

arXiv:1209.4129 · stat.ML · October 14, 2013 · 27 cites

TL;DR

This paper introduces and analyzes communication-efficient algorithms for distributed statistical optimization, achieving near-optimal error rates with minimal communication rounds, and demonstrates their effectiveness on large-scale real-world data.

Contribution

It presents a sharp analysis of a standard averaging method and introduces a novel bootstrap-based method for distributed optimization, both with improved error decay rates.

Findings

01

Averaging method achieves near-optimal error rate when m ≤ √N.

02

Bootstrap subsampling method requires only one communication round.

03

Stochastic gradient method offers a trade-off with slower convergence but easier computation.
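The $\sqrt{N}$ threshold in finding 01 can be checked directly from the averaging method's mean-squared error bound of order $N^{-1} + (N/m)^{-2}$. The short derivation below is a sketch that ignores the constants hidden in the order notation:

```latex
\begin{align*}
\frac{1}{N} + \left(\frac{N}{m}\right)^{-2}
  &= \frac{1}{N} + \frac{m^{2}}{N^{2}}, \\
\frac{m^{2}}{N^{2}} \le \frac{1}{N}
  &\iff m^{2} \le N
   \iff m \le \sqrt{N}.
\end{align*}
```

So for $m \le \sqrt{N}$ the second term is no larger than the first, and the bound is at most $2/N$, which is the centralized $N^{-1}$ rate up to a constant factor.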

Abstract

We analyze two communication-efficient algorithms for distributed statistical optimization on large-scale data sets. The first algorithm is a standard averaging method that distributes the $N$ data samples evenly to $m$ machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error that decays as $\mathcal{O}(N^{-1} + (N/m)^{-2})$. Whenever $m \leq \sqrt{N}$, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all $N$ samples. The second algorithm is a novel method, based on an appropriate form of bootstrap subsampling. Requiring only a single round of communication, it has mean-squared error that decays as $\mathcal{O}(N^{-1} + (N/m)^{-3})$, and…
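The averaging method described in the abstract can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it simulates the $m$ machines in a single process, uses ordinary least squares as the local minimization, and generates synthetic Gaussian data. The function names `local_least_squares` and `average_mixture`, the problem sizes, and the noise model are all assumptions made for illustration.

```python
import numpy as np


def local_least_squares(X, y):
    """Minimize the empirical squared loss on one machine's shard."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def average_mixture(X, y, m):
    """Split the N samples evenly across m simulated machines,
    solve each local problem separately, then average the estimates."""
    shards_X = np.array_split(X, m)
    shards_y = np.array_split(y, m)
    local = [local_least_squares(Xi, yi) for Xi, yi in zip(shards_X, shards_y)]
    return np.mean(local, axis=0)


# Synthetic linear-regression data (hypothetical setup for illustration).
rng = np.random.default_rng(0)
N, d, m = 10_000, 5, 20  # m = 20 is below sqrt(N) = 100
theta_star = rng.normal(size=d)
X = rng.normal(size=(N, d))
y = X @ theta_star + rng.normal(size=N)

theta_avg = average_mixture(X, y, m)      # one round of communication
theta_central = local_least_squares(X, y)  # centralized baseline on all N samples

err_avg = float(np.sum((theta_avg - theta_star) ** 2))
err_central = float(np.sum((theta_central - theta_star) ** 2))
print(f"averaged squared error:    {err_avg:.2e}")
print(f"centralized squared error: {err_central:.2e}")
```

Each machine sends only its $d$-dimensional estimate, so the averaging step needs a single round of communication. Increasing `m` well past $\sqrt{N}$ in this sketch is one way to explore how the $(N/m)^{-2}$ term starts to matter.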

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Advanced Bandit Algorithms Research