Predicting Parameters in Deep Learning
Misha Denil, Babak Shakibi, Laurent Dinh, Marc'Aurelio Ranzato, Nando de Freitas

TL;DR
This paper shows that deep learning models contain significant redundancy, allowing most parameters to be predicted from a few known weights without losing accuracy, thus reducing the amount of learning required.
Contribution
It introduces a method to predict the majority of network parameters from a small subset, demonstrating substantial redundancy in deep learning models.
Findings
In the best case, more than 95% of a network's weights can be predicted without any drop in accuracy
Many parameters do not need to be learned explicitly
Redundancy enables parameter reduction in deep networks
Abstract
We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of the weights of a network without any drop in accuracy.
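As a rough illustration of the idea in the abstract, the sketch below predicts every entry of a single smooth feature from a quarter of its values using kernel ridge regression. The squared exponential kernel, the toy Gaussian-bump feature, and the names `rbf_kernel`, `predict_weights`, `length_scale`, and `ridge` are assumptions made for this example. It is not the authors' implementation, and no trained network is involved.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """Squared exponential kernel between two sets of 1-D coordinates."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def predict_weights(observed_idx, observed_vals, n_weights,
                    length_scale=4.0, ridge=1e-3):
    """Predict all n_weights entries of one feature from a few observed
    entries, using kernel ridge regression over the weight positions.

    Assumption: neighbouring weights are similar (the feature is smooth),
    which is what the kernel encodes.
    """
    coords = np.arange(n_weights, dtype=float)
    obs = coords[observed_idx]
    k_oo = rbf_kernel(obs, obs, length_scale)     # observed vs observed
    k_ao = rbf_kernel(coords, obs, length_scale)  # all vs observed
    alpha = np.linalg.solve(k_oo + ridge * np.eye(len(obs)), observed_vals)
    return k_ao @ alpha


# Toy smooth "feature": a Gaussian bump over 64 positions (hypothetical data).
n = 64
coords = np.arange(n, dtype=float)
true_w = np.exp(-0.5 * ((coords - 32.0) / 8.0) ** 2)

# Observe only every fourth weight (16 of 64) and predict the rest.
observed_idx = np.arange(0, n, 4)
pred_w = predict_weights(observed_idx, true_w[observed_idx], n)

rel_err = np.linalg.norm(pred_w - true_w) / np.linalg.norm(true_w)
print(f"observed {len(observed_idx)}/{n} weights, relative error {rel_err:.4f}")
```

The prediction is only as good as the smoothness assumption: a feature with no structure across positions could not be recovered this way, so the choice of kernel and `length_scale` here stands in for whatever prior knowledge about the weights is available.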
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis
