Expressive power of outer product manifolds on feed-forward neural networks

Bálint Daróczy; Rita Aleksziev; András Benczúr

arXiv:1807.06630 · cs.LG · July 19, 2018

TL;DR

This paper describes the hierarchical structure of feedforward neural networks with reparametrization-invariant Riemannian metrics, and uses an approximation of the tangent subspace to switch to a shallow network after a very early stage of training.

Contribution

It develops a reparametrization-invariant Riemannian metric for describing the hierarchical structure of feedforward networks, which allows switching to shallow networks early in training.

Findings

01  Approximating the metric improves performance after only a few training epochs

02  Sparse representations enable switching to shallow networks

03  The method sometimes surpasses the performance of the original network

Abstract

Hierarchical neural networks are exponentially more efficient than their "shallow" counterparts with the same expressive power, but they involve a huge number of parameters and require a tedious amount of training. Our main idea is to mathematically understand and describe the hierarchical structure of feedforward neural networks by reparametrization-invariant Riemannian metrics. By computing or approximating the tangent subspace, we better utilize the original network via sparse representations that enable switching to shallow networks after a very early training stage. Our experiments show that the proposed approximation of the metric improves, and sometimes even significantly surpasses, the achievable performance of the original network, even after only a few epochs of training the original feedforward network.
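
The general idea in the abstract (take tangent-space information from a briefly trained network, then hand it to a shallow model) can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the paper's construction: it uses the raw gradient of a scalar network output with respect to all parameters as a feature vector, and it does not reproduce the Riemannian metric, its approximation, or the sparse representation. The network shape and every name in it (`init_params`, `forward`, `tangent_features`, `train_linear`) are hypothetical.

```python
import math
import random


def init_params(n_in, n_hidden, seed=0):
    """Random weights for a toy one-hidden-layer network (hypothetical setup)."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) / math.sqrt(n_in) for _ in range(n_in)]
         for _ in range(n_hidden)]
    v = [rng.gauss(0.0, 1.0) / math.sqrt(n_hidden) for _ in range(n_hidden)]
    return W, v


def forward(params, x):
    """Return the scalar output f(x) = v . tanh(W x) and the hidden activations."""
    W, v = params
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    return sum(vi * hi for vi, hi in zip(v, h)), h


def tangent_features(params, x):
    """Gradient of f(x) with respect to all parameters, flattened as [dW, dv].

    Assumption: this plain gradient stands in for the tangent-space
    representation; no metric normalization is applied.
    """
    _, v = params
    _, h = forward(params, x)
    # df/dW_ij = v_i * (1 - h_i^2) * x_j, laid out row by row
    grad_W = [vi * (1.0 - hi * hi) * xj for vi, hi in zip(v, h) for xj in x]
    # df/dv_i = h_i
    grad_v = list(h)
    return grad_W + grad_v


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear(features, labels, epochs=50, lr=0.1):
    """Shallow model: logistic regression on fixed features, trained by SGD."""
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        for phi, y in zip(features, labels):
            p = _sigmoid(sum(wk * pk for wk, pk in zip(w, phi)))
            for k in range(len(w)):
                w[k] -= lr * (p - y) * phi[k]
    return w
```

In use, one would compute `tangent_features(params, x)` for every training input after a few epochs of ordinary training, then fit `train_linear` on those vectors with binary labels. The deep network's parameters stay frozen during that second stage, so only the shallow model is trained.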

Taxonomy

Topics: Neural Networks and Applications · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis