Natural Neural Networks

Guillaume Desjardins; Karen Simonyan; Razvan Pascanu; Koray Kavukcuoglu

arXiv:1507.00210·stat.ML·July 2, 2015·60 cites

TL;DR

Natural Neural Networks introduce a reparametrization technique that speeds up convergence by improving the conditioning of the Fisher matrix, with benefits shown on unsupervised and supervised tasks and scalability demonstrated on the ImageNet Challenge dataset.

Contribution

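The paper presents Natural Neural Networks, a family of algorithms built on a reparametrization of the network weights that implicitly whitens each layer's representation, together with Projected Natural Gradient Descent (PRONG), a training algorithm that amortizes the cost of the reparametrization over many parameter updates.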

Findings

1. Faster convergence in training neural networks.
2. Effective on both supervised and unsupervised tasks.
3. Scalable to large datasets like ImageNet.

Abstract

We introduce Natural Neural Networks, a novel family of algorithms that speed up convergence by adapting their internal representation during training to improve conditioning of the Fisher matrix. In particular, we show a specific example that employs a simple and efficient reparametrization of the neural network weights by implicitly whitening the representation obtained at each layer, while preserving the feed-forward computation of the network. Such networks can be trained efficiently via the proposed Projected Natural Gradient Descent algorithm (PRONG), which amortizes the cost of these reparametrizations over many parameter updates and is closely related to the Mirror Descent online learning algorithm. We highlight the benefits of our method on both unsupervised and supervised learning tasks, and showcase its scalability by training on the large-scale ImageNet Challenge dataset.
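
The whitening reparametrization mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation and does not include the PRONG gradient updates; it only shows, for one linear layer, how the weights can be rewritten in terms of whitened inputs without changing the layer's output. The function names `whitening_params` and `reparametrize`, the ZCA-style choice of whitening matrix, and the regularizer `eps` are assumptions made for illustration.

```python
import numpy as np

def whitening_params(h, eps=1e-5):
    """Estimate the mean c and a ZCA-style whitening matrix U from activations h of shape (n, d).

    The ZCA form and the regularizer eps are illustrative assumptions.
    """
    c = h.mean(axis=0)
    cov = np.cov(h - c, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    U = eigvec @ np.diag(1.0 / np.sqrt(eigval + eps)) @ eigvec.T
    return c, U

def reparametrize(W, b, c, U):
    """Return (V, d) such that V @ U @ (x - c) + d equals W @ x + b for every x."""
    V = W @ np.linalg.inv(U)   # absorb the inverse whitening into the weights
    d = b + W @ c              # absorb the removed mean into the bias
    return V, d

rng = np.random.default_rng(0)
# Correlated, non-centered inputs produced by a fixed mixing matrix (hypothetical data).
A = np.array([[2.0, 0.5, 0.0, 0.0],
              [0.0, 1.0, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.0, 0.5]])
x = rng.normal(size=(512, 4)) @ A + 3.0
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)

c, U = whitening_params(x)
V, d = reparametrize(W, b, c, U)

original = x @ W.T + b                       # pre-activations of the original layer
reparam = ((x - c) @ U.T) @ V.T + d          # same layer expressed on whitened inputs
max_diff = np.abs(original - reparam).max()
whitened_cov = np.cov((x - c) @ U.T, rowvar=False)
```

In this sketch `max_diff` measures how far the reparametrized layer deviates from the original one, and `whitened_cov` is the covariance of the whitened inputs, which should be close to the identity. Recomputing `c` and `U` only occasionally, rather than at every step, is the kind of amortization the abstract attributes to PRONG.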

Code & Models

Repositories

awur978/Autoencoder (tf)

Taxonomy

Topics: Neural Networks and Applications