TL;DR
This paper argues that the success of overparametrized deep neural networks is primarily due to their ability to exploit the compositional sparsity of target functions, which is fundamental to their learning dynamics and generalization.
Contribution
It argues that compositional sparsity underpins the success of DNNs and discusses the implications for approximation theory, learnability, and optimization in deep learning.
Findings
Compositional sparsity is the key property of target functions that DNNs exploit.
All efficiently Turing-computable functions share compositional sparsity.
Understanding this property is crucial for a comprehensive theory of deep learning.
Abstract
Overparametrized Deep Neural Networks (DNNs) have demonstrated remarkable success in a wide variety of domains that are too high-dimensional for classical shallow networks, which are subject to the curse of dimensionality. However, open questions remain about the fundamental principles that govern the learning dynamics of DNNs. In this position paper we argue that it is the ability of DNNs to exploit the compositionally sparse structure of the target function that drives their success. As such, DNNs can leverage the property that most practically relevant functions can be composed from a small set of constituent functions, each of which relies only on a low-dimensional subset of all inputs. We show that this property is shared by all efficiently Turing-computable functions and is therefore very likely present in all current learning problems. While some promising theoretical insights on questions concerned…
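To make the notion of compositional sparsity concrete, the following is a minimal Python sketch of a function of many inputs that is built entirely from constituent functions of two inputs each. It is an illustration only and is not taken from the paper: the constituent `h`, the binary-tree wiring, and the helper names `compose_binary_tree` and `count_constituents` are all assumptions chosen for simplicity.

```python
from typing import Callable, Sequence


def h(a: float, b: float) -> float:
    """Hypothetical constituent function; it depends on only two inputs."""
    return a * b + a


def compose_binary_tree(
    x: Sequence[float],
    node: Callable[[float, float], float] = h,
) -> float:
    """Evaluate a d-input function as a binary tree of 2-input constituents.

    Assumes len(x) is a power of two so that every level pairs up evenly.
    """
    level = list(x)
    if not level or len(level) & (len(level) - 1):
        raise ValueError("input length must be a power of two")
    # Each pass halves the number of values; every node sees just two of them.
    while len(level) > 1:
        level = [node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def count_constituents(d: int) -> int:
    """Number of 2-input nodes in the binary tree over d inputs."""
    return d - 1


if __name__ == "__main__":
    print(compose_binary_tree([1, 2, 3, 4]))
    print(count_constituents(4))
```

In this sketch the overall function takes d inputs, yet it is assembled from d - 1 constituents that each read only two values. This is the kind of low-dimensional structure the abstract refers to, in contrast to a generic function that depends on all d inputs jointly.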
