Understanding Synthetic Gradients and Decoupled Neural Interfaces

Wojciech Marian Czarnecki; Grzegorz Świrszcz; Max Jaderberg; Simon Osindero; Oriol Vinyals; Koray Kavukcuoglu

arXiv:1703.00522 · cs.LG · March 3, 2017 · 28 citations

TL;DR

This paper investigates how synthetic gradients and decoupled neural interfaces affect neural network training, focusing on their influence on optimization, representations, and convergence, and compares them to other error approximation methods.

Contribution

It provides a detailed analysis of the functional and representational impacts of synthetic gradients, including convergence proofs and a unifying framework for error approximation techniques.

Findings

01

Synthetic gradients (SGs) do not impair the representational capacity of neural networks.

02

The learning system converges for linear and deep linear models with SGs.

03

Synthetic gradients can lead to different layer-wise representations compared to true gradients.

Abstract

When training neural networks, the use of Synthetic Gradients (SG) allows layers or modules to be trained without update locking - without waiting for a true error gradient to be backpropagated - resulting in Decoupled Neural Interfaces (DNIs). This unlocked ability of being able to update parts of a neural network asynchronously and with only local information was demonstrated to work empirically in Jaderberg et al. (2016). However, there has been very little demonstration of what changes DNIs and SGs impose from a functional, representational, and learning dynamics point of view. In this paper, we study DNIs through the use of synthetic gradients on feed-forward networks to better understand their behaviour and elucidate their effect on optimisation. We show that the incorporation of SGs does not affect the representational strength of the learning system for a neural network, and…
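The update-unlocking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy two-layer scalar linear model `p = w2 * (w1 * x)` with a squared-error loss, and a linear SG module that predicts the gradient with respect to the hidden activation from that activation and the label. The dataset, the names `synthetic_grad`, `mean_loss`, and `train`, and all hyperparameters are hypothetical choices made for illustration.

```python
# Toy deep linear model p = w2 * (w1 * x) trained with a synthetic gradient (SG).
# All names, data, and hyperparameters are illustrative assumptions.

def synthetic_grad(h, y, theta):
    """Linear SG module: predicts dL/dh from the activation h and the label y."""
    a, b, c = theta
    return a * h + b * y + c


def mean_loss(w1, w2, data):
    """Mean of the squared-error loss 0.5 * (p - y)^2 over the dataset."""
    return sum(0.5 * (w2 * w1 * x - y) ** 2 for x, y in data) / len(data)


def train(steps=5000, lr=0.02):
    # Hypothetical dataset: targets follow y = 2 * x.
    data = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]
    w1, w2 = 0.5, 0.5
    theta = [0.0, 0.0, 0.0]  # the SG module starts by predicting a zero gradient
    initial = mean_loss(w1, w2, data)

    for step in range(steps):
        x, y = data[step % len(data)]
        h = w1 * x  # layer 1 forward pass

        # Layer 1 computes its update from the SG alone, so it does not
        # have to wait for an error signal backpropagated from layer 2.
        s = synthetic_grad(h, y, theta)
        w1_next = w1 - lr * s * x  # dL/dw1 approximated by s * dh/dw1

        # Layer 2 sees the loss directly and uses the true gradient.
        p = w2 * h
        err = p - y               # dL/dp
        true_grad_h = err * w2    # true dL/dh, the regression target for the SG
        w2 -= lr * err * h

        # The SG module is regressed onto the true gradient (squared error).
        e = s - true_grad_h
        theta[0] -= lr * e * h
        theta[1] -= lr * e * y
        theta[2] -= lr * e

        w1 = w1_next

    return initial, mean_loss(w1, w2, data)


if __name__ == "__main__":
    initial, final = train()
    print(f"initial loss: {initial:.4f}, final loss: {final:.6f}")
```

In this sketch the true gradient is available within the same step, so nothing is actually asynchronous. The point is only that the first layer's update depends on the SG module's prediction and not on the backpropagated error, which is the decoupling the abstract describes.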

Code & Models

Repositories

quangvu0702/Synthetic-Gradients
pytorch

Taxonomy

Topics: Neural Networks and Applications · Advanced Memory and Neural Computing · Quantum Computing Algorithms and Architecture