Faster Convergence & Generalization in DNNs

Gaurav Singh; John Shawe-Taylor

arXiv:1807.11414·cs.LG·October 11, 2018

TL;DR

This paper introduces an optimization algorithm for deep neural networks, based on generalized-optimal updates derived from minibatches, that accelerates convergence and improves generalization. On two benchmark datasets, the authors report a speedup of two orders of magnitude over traditional back-propagation and greater robustness to noise and over-fitting.

Contribution

The authors develop a generalized-optimal update method for minibatch training that outperforms traditional back-propagation in speed and robustness.

Findings

01 Trains two orders of magnitude faster than traditional back-propagation on two benchmark datasets.

02 Is more robust to noise and over-fitting than traditional back-propagation.

03 Improves both convergence speed and generalization in DNNs.

Abstract

Deep neural networks have gained tremendous popularity in the last few years. They have been applied to classification tasks in almost every domain. Despite this success, deep networks can be incredibly slow to train, even for moderately sized models on sufficiently large datasets. Additionally, these networks require large amounts of data to generalize. The importance of speeding up convergence and improving generalization in deep networks cannot be overstated. In this work, we develop an optimization algorithm based on generalized-optimal updates derived from minibatches that lead to faster convergence. Finally, we demonstrate on two benchmark datasets that the proposed method achieves a speedup of two orders of magnitude over traditional back-propagation and is more robust to noise and over-fitting.

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Neural Networks and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings