Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning
Kyunghyun Cho, Yoshua Bengio

TL;DR
This paper introduces a neural network parametrization in which weight matrices are switched on only when the hidden units show specific activation (sign) patterns. Capacity can therefore grow up to exponentially relative to the computation spent per example, which makes conditional computation more efficient.
Contribution
The authors propose a tree-structured parametrization that activates weight matrices based on hidden unit activation patterns, increasing capacity without a proportional increase in computation.
Findings
Potential to exponentially increase capacity-to-computation ratio
Tree-structured parametrization is proposed as a way to help control overfitting
Weight matrices are activated by sign patterns of hidden unit activations
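The findings above can be made concrete with a minimal sketch. It is not the authors' exact formulation: it assumes a binary tree with one weight matrix per sign-bit prefix, and it assumes the matrices along the selected path are combined by summing their outputs. The names `make_tree`, `sign_bits`, and `conditional_forward` are hypothetical, chosen only for illustration.

```python
import random

def make_tree(k, d_out, d_in, seed=0):
    """Assumed layout: one d_out x d_in matrix per sign-bit prefix of length 0..k."""
    rng = random.Random(seed)
    tree = {}
    for depth in range(k + 1):
        for idx in range(2 ** depth):
            # Binary expansion of idx, most significant bit first, as the prefix key.
            prefix = tuple((idx >> (depth - 1 - j)) & 1 for j in range(depth))
            tree[prefix] = [[rng.gauss(0.0, 0.1) for _ in range(d_in)]
                            for _ in range(d_out)]
    return tree

def sign_bits(gates):
    """Map gating-unit values to sign bits: 1 if positive, else 0."""
    return tuple(1 if g > 0 else 0 for g in gates)

def conditional_forward(tree, gates, x):
    """Apply only the matrices on the root-to-leaf path chosen by the sign bits."""
    bits = sign_bits(gates)
    out = [0.0] * len(tree[()])
    used = 0
    for depth in range(len(bits) + 1):
        W = tree[bits[:depth]]  # node for the current prefix
        for i, row in enumerate(W):
            out[i] += sum(w * xi for w, xi in zip(row, x))
        used += 1
    return out, used

tree = make_tree(k=3, d_out=2, d_in=4)
out, used = conditional_forward(tree, [0.5, -1.0, 2.0], [1.0, 0.0, -1.0, 2.0])
print(len(tree), used)  # stored matrices vs. matrices touched for this example
```

With `k` gating units the tree stores 2^(k+1) - 1 matrices, while each example touches only `k + 1` of them. In this toy setting, that gap is the sense in which the number of parameters can grow exponentially while per-example computation grows linearly.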
Abstract
Many state-of-the-art results obtained with deep networks are achieved with the largest models that could be trained, and if more computation power was available, we might be able to exploit much larger datasets in order to improve generalization ability. Whereas in learning algorithms such as decision trees the ratio of capacity (e.g., the number of parameters) to computation is very favorable (up to exponentially more parameters than computation), the ratio is essentially 1 for deep neural networks. Conditional computation has been proposed as a way to increase the capacity of a deep neural network without increasing the amount of computation required, by activating some parameters and computation "on-demand", on a per-example basis. In this note, we propose a novel parametrization of weight matrices in neural networks which has the potential to increase up to exponentially the ratio…
Taxonomy
Topics: Neural Networks and Applications · Face and Expression Recognition · Advanced Graph Neural Networks
