Opening the Black Box: predicting the trainability of deep neural networks with reconstruction entropy
Yanick Thurn, Ro Jefferson, Johanna Erdmenger

TL;DR
This paper introduces a method to predict whether a deep neural network will be trainable by reconstructing its inputs from layer activations. Because the prediction needs only a single epoch of training of shallow auxiliary networks, it reduces overall training time across several architectures and datasets.
Contribution
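The authors propose a reconstruction entropy measure, the relative entropy between reconstructed and original inputs, that predicts network trainability after a single epoch of training the auxiliary reconstruction networks and applies to multiple neural network types.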
Findings
Reconstruction entropy correlates with network trainability.
The method reduces training time by identifying untrainable networks early.
The method is applicable to DNNs, ResNets, and CNNs.
Abstract
An important challenge in machine learning is to predict the initial conditions under which a given neural network will be trainable. We present a method for predicting the trainable regime in parameter space for deep feedforward neural networks (DNNs) based on reconstructing the input from subsequent activation layers via a cascade of single-layer auxiliary networks. We show that a single epoch of training of the shallow cascade networks is sufficient to predict the trainability of the deep feedforward network on a range of datasets (MNIST, CIFAR10, FashionMNIST, and white noise), thereby providing a significant reduction in overall training time. We achieve this by computing the relative entropy between reconstructed images and the original inputs, and show that this probe of information loss is sensitive to the phase behaviour of the network. We further demonstrate that this method…
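As a rough illustration of the relative-entropy probe mentioned in the abstract, the sketch below computes the Kullback-Leibler divergence between an original image and a reconstruction. Treating normalized pixel intensities as a probability distribution, the function name `relative_entropy`, the `eps` regularizer, and the synthetic test images are all assumptions made for this example; the paper's exact definition and data pipeline may differ.

```python
import numpy as np


def relative_entropy(original, reconstructed, eps=1e-12):
    """KL divergence D(p || q) between two images viewed as pixel distributions.

    Assumption for this sketch: each image is turned into a probability
    distribution by clipping negative values and normalizing to sum to 1.
    """
    p = np.asarray(original, dtype=np.float64).ravel()
    q = np.asarray(reconstructed, dtype=np.float64).ravel()
    # Clip negatives and add a small constant so the logarithm stays finite.
    p = np.clip(p, 0.0, None) + eps
    q = np.clip(q, 0.0, None) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


# Synthetic stand-ins for an input and two hypothetical reconstructions.
rng = np.random.default_rng(0)
image = rng.random((28, 28))
good = image + 0.01 * rng.standard_normal(image.shape)  # nearly lossless
bad = rng.random((28, 28))                              # unrelated to the input

kl_same = relative_entropy(image, image)
kl_good = relative_entropy(image, good)
kl_bad = relative_entropy(image, bad)
```

In this toy setup a faithful reconstruction gives a relative entropy near zero, while a reconstruction that has lost the input information gives a noticeably larger value, which is the kind of signal the abstract describes as a probe of information loss.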
Taxonomy
Topics: Neural Networks and Applications
Methods: Dense Connections · Feedforward Network
