TL;DR
This paper presents a unified theoretical framework for analyzing and constructing deep neural networks by modeling tensor operations, revealing links between architectural complexity and innovation, and providing a dataset of complex architectures.
Contribution
It introduces a novel framework that explicitly models tensor operations in neural networks, enabling analysis of architectural evolution and automatic design of complex architectures.
Findings
Groundbreaking architectures from the past 40 years coincide with increases in different types of architectural complexity.
Several large classes of higher-complexity architectures remain unexplored.
A dataset of over 3,000 higher-complexity architectures has been publicly released.
Abstract
We introduce a unified theoretical framework for the rigorous analysis and systematic construction of deep neural networks (DNNs). This framework addresses a gap in existing theory by explicitly modeling the structure of tensor operations -- lower-level information that is often abstracted away. Our framework enables two novel objectives: (1) analysis of the evolution of architectural complexity over deep learning history, and (2) automatic construction of novel architectures based on new types of tensor operations. Our study of DNNs introduced over the past 40 years reveals a connection between groundbreaking architectures and increases in different types of architectural complexity. Moreover, we identify several large classes of higher-complexity architectures that have not yet been explored. We then collect a dataset of 3,000+ higher-complexity architectures, which we publicly release at:…
