On the Optimality of Averaging in Distributed Statistical Learning

Jonathan Rosenblatt; Boaz Nadler

arXiv:1407.2724·stat.ML·June 14, 2016

TL;DR

This paper analyzes the statistical error of averaging in distributed empirical risk minimization. It shows that averaging is asymptotically as accurate as the centralized solution when the dimension is fixed, but that in the high-dimensional regime its accuracy loss grows linearly with the number of machines.

Contribution

It provides asymptotically exact error expressions for distributed averaging in both fixed and high-dimensional settings, revealing when averaging is optimal or incurs a loss.

Findings

01 When the number of parameters is fixed, averaging is asymptotically as accurate as the centralized solution.

02 In the high-dimensional regime, the accuracy loss from averaging grows linearly with the number of machines.

03 The results inform the choice of the number of machines in distributed learning.

Abstract

A common approach to statistical learning with big-data is to randomly split it among $m$ machines and learn the parameter of interest by averaging the $m$ individual estimates. In this paper, focusing on empirical risk minimization, or equivalently M-estimation, we study the statistical error incurred by this strategy. We consider two large-sample settings: First, a classical setting where the number of parameters $p$ is fixed, and the number of samples per machine $n \to \infty$. Second, a high-dimensional regime where both $p, n \to \infty$ with $p / n \to \kappa \in (0, 1)$. For both regimes and under suitable assumptions, we present asymptotically exact expressions for this estimation error. In the fixed-$p$ setting, under suitable assumptions, we prove that to leading order averaging is as accurate as the centralized solution. We also derive the second order error terms, and show that…
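The split-and-average strategy described in the abstract can be illustrated with a minimal sketch. It assumes ordinary least squares as the empirical risk minimizer and uses synthetic Gaussian data. The helper names (`ols`, `averaged_estimate`) and the sizes `m`, `n`, `p` are illustrative choices, not taken from the paper or its repository.

```python
import numpy as np

def ols(X, y):
    """Least-squares fit: the ERM solution under squared loss."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def averaged_estimate(X, y, m):
    """Split the rows among m machines, fit each part, and average the m fits."""
    parts = zip(np.array_split(X, m), np.array_split(y, m))
    return np.mean([ols(Xk, yk) for Xk, yk in parts], axis=0)

rng = np.random.default_rng(0)
m, n, p = 10, 2000, 5            # assumed sizes: m machines, n samples each, p parameters
theta = rng.normal(size=p)       # assumed true parameter for the simulation
X = rng.normal(size=(m * n, p))  # rows are i.i.d., so a contiguous split acts as a random split
y = X @ theta + rng.normal(size=m * n)

theta_central = ols(X, y)                # centralized solution on all m * n samples
theta_avg = averaged_estimate(X, y, m)   # average of the m per-machine solutions

err_central = np.sum((theta_central - theta) ** 2)
err_avg = np.sum((theta_avg - theta) ** 2)
print(f"centralized squared error: {err_central:.3e}")
print(f"averaged squared error:    {err_avg:.3e}")
```

With `p` small relative to `n`, the two squared errors should be of similar magnitude, in line with the fixed-$p$ claim. Raising `p` toward `n` is one way to explore the high-dimensional regime, though this sketch makes no attempt to reproduce the paper's exact error expressions.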

Code & Models

Repositories

johnros/ParalSimulate (official)