Faster Convergence & Generalization in DNNs
Gaurav Singh, John Shawe-Taylor

TL;DR
This paper introduces an optimization algorithm for deep neural networks that accelerates convergence and improves generalization. On two benchmark datasets, it achieves a speedup of two orders of magnitude over traditional back-propagation and is more robust to noise and overfitting.
Contribution
The authors develop a generalized-optimal update method for minibatch training that outperforms traditional back-propagation in speed and robustness.
Findings
Achieves a speedup of two orders of magnitude over traditional back-propagation on two benchmark datasets.
Is more robust to noise and overfitting than traditional back-propagation.
Improves convergence speed and generalization in DNNs.
Abstract
Deep neural networks have gained tremendous popularity in the last few years. They have been applied to classification tasks in almost every domain. Despite this success, deep networks can be incredibly slow to train, even for moderately sized models on sufficiently large datasets. Additionally, these networks require large amounts of data to be able to generalize. The importance of speeding up convergence and improving generalization in deep networks cannot be overstated. In this work, we develop an optimization algorithm based on generalized-optimal updates derived from minibatches that lead to faster convergence. Finally, we demonstrate on two benchmark datasets that the proposed method achieves a speedup of two orders of magnitude over traditional back-propagation and is more robust to noise and overfitting.
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Neural Networks and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
