Introducing Noise in Decentralized Training of Neural Networks

Linara Adilova, Nathalie Paul, and Peter Schlicht

arXiv:1809.10678 · cs.LG · October 1, 2018 · 1 citation

TL;DR

This paper studies the injection of noise into network weights during decentralized neural network training. Empirically, it improves the generalization of non-linear models, but it yields no benefit in expectation for linear models.

Contribution

The paper analyzes noise injection in decentralized training both theoretically and empirically for linear models, and shows empirically that it benefits non-linear neural networks.

Findings

01. Noise injection does not improve linear models in expectation.

02. Noise injection substantially improves the generalization of non-linear neural networks in the reported experiments.

03. Empirical results show improved model quality when noise is injected during decentralized training.
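
The first finding can be illustrated numerically in the simplest case. The sketch below is not taken from the paper: it assumes linear regression with squared loss and additive zero-mean Gaussian weight noise, and the data, `sigma`, `num_draws`, and all function names are illustrative. Because the gradient of this loss is affine in the weights, the gradient averaged over many noisy copies of the weights should match the noise-free gradient up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear regression data (illustrative, not from the paper).
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)


def grad_squared_loss(w, X, y):
    """Gradient of 0.5 * mean((X w - y)^2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)


w = rng.normal(size=d)   # current weights of the linear model
sigma = 0.1              # assumed noise standard deviation
num_draws = 5000         # number of independent noise samples

clean_grad = grad_squared_loss(w, X, y)

# Zero-mean Gaussian perturbations of the weights, one row per draw.
noise = rng.normal(0.0, sigma, size=(num_draws, d))

# Gradient evaluated at every perturbed weight vector, then averaged.
noisy_grads = np.array([grad_squared_loss(w + e, X, y) for e in noise])
mean_noisy_grad = noisy_grads.mean(axis=0)

# Largest coordinate-wise gap between the averaged noisy gradient
# and the noise-free gradient; it shrinks as num_draws grows.
max_gap = np.max(np.abs(mean_noisy_grad - clean_grad))
print("max gap:", max_gap)
```

The remaining gap equals `X.T @ X @ noise.mean(axis=0) / n`, so it vanishes as the sample mean of the noise approaches zero. This only shows that the expected update direction is unchanged for a linear model; it does not reproduce the paper's analysis.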

Abstract

It has been shown that injecting noise into the neural network weights during the training process leads to a better generalization of the resulting model. Noise injection in the distributed setup is a straightforward technique, and it represents a promising approach to improve the locally trained models. We investigate the effects of noise injection into the neural networks during a decentralized training process. We show both theoretically and empirically that noise injection has no positive effect in expectation on linear models. For non-linear neural networks, however, we empirically show that noise injection substantially improves model quality, helping a local model reach a generalization ability close to the serial baseline.
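
As a rough illustration of the setting the abstract describes, here is a minimal sketch of weight-noise injection in a decentralized training loop. The abstract does not specify the protocol, so everything concrete here is an assumption: the noise is additive zero-mean Gaussian and is applied to the weights only for the gradient evaluation, the workers synchronize by periodic parameter averaging, and the model is a small one-hidden-layer tanh network. All names (`train_decentralized`, `noisy_local_step`, `sigma`, and so on) are hypothetical.

```python
import numpy as np


def init_params(rng, d_in, d_hidden):
    # One-hidden-layer tanh network; layer sizes are illustrative.
    return {
        "W1": rng.normal(0.0, 0.5, size=(d_in, d_hidden)),
        "b1": np.zeros(d_hidden),
        "W2": rng.normal(0.0, 0.5, size=(d_hidden, 1)),
        "b2": np.zeros(1),
    }


def loss_and_grads(params, X, y):
    # Forward pass for the loss 0.5 * mean((pred - y)^2).
    h = np.tanh(X @ params["W1"] + params["b1"])
    pred = (h @ params["W2"] + params["b2"]).ravel()
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)
    # Backward pass (manual backpropagation).
    d_pred = (err / X.shape[0])[:, None]
    d_z = (d_pred @ params["W2"].T) * (1.0 - h ** 2)
    grads = {
        "W2": h.T @ d_pred,
        "b2": d_pred.sum(axis=0),
        "W1": X.T @ d_z,
        "b1": d_z.sum(axis=0),
    }
    return loss, grads


def noisy_local_step(params, X, y, lr, sigma, rng):
    # Assumed variant: evaluate the gradient at noise-perturbed weights,
    # then apply the update to the unperturbed weights.
    noisy = {k: v + rng.normal(0.0, sigma, size=v.shape) for k, v in params.items()}
    _, grads = loss_and_grads(noisy, X, y)
    return {k: params[k] - lr * grads[k] for k in params}


def average_params(all_params):
    # Assumed synchronization: plain averaging of all worker parameters.
    return {k: np.mean([p[k] for p in all_params], axis=0) for k in all_params[0]}


def train_decentralized(shards, rounds=20, local_steps=5, lr=0.1, sigma=0.01, seed=0):
    rng = np.random.default_rng(seed)
    init = init_params(rng, shards[0][0].shape[1], 8)
    workers = [{k: v.copy() for k, v in init.items()} for _ in shards]
    for _ in range(rounds):
        # Each worker trains only on its own data shard.
        for i, (X, y) in enumerate(shards):
            for _ in range(local_steps):
                workers[i] = noisy_local_step(workers[i], X, y, lr, sigma, rng)
        # All workers restart the next round from the averaged model.
        avg = average_params(workers)
        workers = [{k: v.copy() for k, v in avg.items()} for _ in shards]
    return workers[0]


# Toy usage: a non-linear regression target split across four workers.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.1 * data_rng.normal(size=200)
shards = [(X[i::4], y[i::4]) for i in range(4)]
model = train_decentralized(shards)
final_loss, _ = loss_and_grads(model, X, y)
print("training loss of averaged model:", final_loss)
```

Setting `sigma=0.0` gives the noise-free variant of the same loop, which makes a side-by-side comparison easy. This toy setup is not expected to reproduce the paper's results, and the noise scale and synchronization schedule are arbitrary choices here.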

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Gaussian Processes and Bayesian Inference