Decoupled Parallel Backpropagation with Convergence Guarantee
Zhouyuan Huo, Bin Gu, Qian Yang, Heng Huang

TL;DR
This paper introduces a decoupled parallel backpropagation algorithm that removes backward locking in deep neural network training, guarantees convergence, and achieves significant speedup while keeping accuracy comparable to standard backpropagation.
Contribution
It proposes a novel decoupled parallel backpropagation method with convergence guarantees, enabling efficient training of deep networks in parallel.
Findings
Achieves significant training speedup
Maintains comparable accuracy to standard backpropagation
Provably converges to critical points in non-convex optimization
Abstract
The backpropagation algorithm is indispensable for training feedforward neural networks. It requires propagating error gradients sequentially from the output layer all the way back to the input layer. This backward locking prevents us from updating network layers in parallel and from fully leveraging the available computing resources. Recently, several algorithms have been proposed to break the backward locking. However, their performance degrades seriously when networks are deep. In this paper, we propose a decoupled parallel backpropagation algorithm for deep learning optimization with a convergence guarantee. First, we decouple the backpropagation algorithm using delayed gradients, and show that the backward locking is removed when we split the network into multiple modules. Then, we utilize decoupled parallel backpropagation in two stochastic methods and prove…
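The delayed-gradient idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm or code: the two scalar "modules", the one-step delay, the learning rate, and the function name `train_delayed` are all assumptions chosen for illustration. The last module updates with a fresh gradient, while the first module applies the gradient that was computed one iteration earlier, so it never has to wait for the current backward pass to finish.

```python
def train_delayed(steps=200, lr=0.05):
    """Toy chain y = w2 * (w1 * x) where module 1 uses a one-step-delayed gradient.

    Hypothetical illustration only; the delay schedule is an assumption.
    """
    w1, w2 = 0.5, 0.5          # one scalar weight per module
    x, target = 1.0, 2.0       # single fixed training example
    pending = None             # gradient for module 1 produced in the previous step
    losses = []

    for _ in range(steps):
        # Forward pass through both modules.
        h = w1 * x
        y = w2 * h
        err = y - target
        losses.append(0.5 * err * err)

        # Module 2 (output side) computes its gradient from the current pass.
        grad_w2 = err * h
        # Gradient for module 1 is computed now but only applied next step.
        grad_w1_now = err * w2 * x

        # Module 1 applies the stale gradient from the previous iteration,
        # so its update does not depend on the current backward pass.
        if pending is not None:
            w1 -= lr * pending
        w2 -= lr * grad_w2
        pending = grad_w1_now

    return w1, w2, losses


if __name__ == "__main__":
    w1, w2, losses = train_delayed()
    print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.2e}")
```

In this small setting the stale gradient still drives the loss down, which is the behavior a convergence guarantee is meant to formalize; the sketch says nothing about the speedups or accuracy reported for real networks.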
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Advanced Neural Network Applications
