Decoupled Parallel Backpropagation with Convergence Guarantee

Zhouyuan Huo; Bin Gu; Qian Yang; Heng Huang

arXiv:1804.10574 · cs.LG · July 24, 2018 · 32 citations

TL;DR

This paper introduces a decoupled parallel backpropagation algorithm that removes backward locking in deep neural network training, comes with a convergence guarantee, and speeds up training while keeping accuracy comparable to standard backpropagation.

Contribution

The paper proposes a decoupled parallel backpropagation method, based on delayed gradients, that lets the modules of a deep network be updated in parallel and comes with a convergence guarantee.

Findings

01. Achieves significant training speedup

02. Maintains accuracy comparable to standard backpropagation

03. Provably converges to critical points in non-convex optimization

Abstract

The backpropagation algorithm is indispensable for training feedforward neural networks. It requires propagating error gradients sequentially from the output layer all the way back to the input layer. Backward locking in the backpropagation algorithm prevents us from updating network layers in parallel and fully leveraging computing resources. Recently, several algorithms have been proposed for breaking the backward locking. However, their performance degrades severely when networks are deep. In this paper, we propose a decoupled parallel backpropagation algorithm for deep learning optimization with a convergence guarantee. First, we decouple the backpropagation algorithm using delayed gradients, and show that the backward locking is removed when we split the network into multiple modules. Then, we utilize decoupled parallel backpropagation in two stochastic methods and prove…
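To make the delayed-gradient idea in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a toy network split into two scalar linear modules, a delay of exactly one iteration for the earlier module, and a sequential simulation rather than real parallel execution. The function name `train_delayed` and all of its parameters are hypothetical.

```python
import numpy as np


def train_delayed(steps=500, lr=0.1, seed=0):
    """Fit y = 2 * x with a two-module linear chain y = w2 * (w1 * x).

    Assumption: module 1 is updated with the error signal that module 2
    produced one iteration earlier, so it never waits for the current
    backward pass of module 2.
    """
    rng = np.random.default_rng(seed)
    w1, w2 = 0.5, 0.5
    pending = None  # (input, dL/dh) saved from the previous iteration
    losses = []
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0, size=32)
        target = 2.0 * x

        # Forward pass through both modules.
        h = w1 * x
        y = w2 * h
        err = y - target
        losses.append(0.5 * np.mean(err ** 2))

        # Module 2 (next to the output) uses an up-to-date gradient.
        grad_w2 = np.mean(err * h)
        grad_h = err * w2  # error signal to hand back to module 1

        # Module 1 uses the stale error signal and the matching stale input.
        if pending is not None:
            x_old, grad_h_old = pending
            w1 -= lr * np.mean(grad_h_old * x_old)

        w2 -= lr * grad_w2
        pending = (x, grad_h)
    return w1, w2, losses


w1, w2, losses = train_delayed()
print("w1 * w2 =", w1 * w2)
print("first loss:", losses[0], "last loss:", losses[-1])
```

In this sketch the update of `w1` depends only on quantities saved in `pending`, so in a real multi-device setting it could run at the same time as module 2's backward computation. That independence is what removing backward locking refers to.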

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Advanced Neural Network Applications