Envisioning Future Deep Learning Theories: Some Basic Concepts and Characteristics
Weijie J. Su

TL;DR
This paper proposes a conceptual framework for future deep learning theories, emphasizing hierarchical structures, iterative optimization, and compressive information evolution, exemplified by the neurashed model that explains key empirical phenomena.
Contribution
It introduces neurashed, a graphical model that integrates three characteristics (a hierarchical architecture, iterative optimization, and compressive evolution of information from the data), offering a new perspective on why deep learning is effective.
Findings
Neurashed explains implicit regularization effects.
It provides insights into the information bottleneck phenomenon.
The model elucidates local elasticity in neural networks.
Abstract
To advance deep learning methodologies in the next decade, a theoretical framework for reasoning about modern neural networks is needed. While efforts are increasing toward demystifying why deep learning is so effective, a comprehensive picture remains lacking, suggesting that a better theory is possible. We argue that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using stochastic gradient-based methods, and information from the data that evolves compressively. As an instantiation, we integrate these characteristics into a graphical model called neurashed. This model effectively explains some common empirical patterns in deep learning. In particular, neurashed enables insights into implicit regularization, information bottleneck, and local…
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Machine Learning and Data Classification
