Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, Mikhail Belkin

TL;DR
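This paper proves that feedforward neural networks with arbitrary directed acyclic graph (DAG) architectures undergo transition to linearity as their width approaches infinity. It identifies the mathematical structure behind this behavior and generalizes earlier results on transition to linearity and constancy of the Neural Tangent Kernel for standard architectures.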
Contribution
It extends transition to linearity from standard architectures to arbitrary DAG architectures, with network width characterized by the minimum in-degree of the neurons outside the input and first layers.
Findings
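Feedforward networks on arbitrary DAGs undergo transition to linearity as their width approaches infinity.
The relevant width is the minimum in-degree of the neurons, excluding the input and first layers.
The result generalizes recent work on transition to linearity and constancy of the Neural Tangent Kernel for standard architectures.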
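As a rough numerical illustration of the first finding, the sketch below measures how far a small network drifts from its first-order Taylor expansion in the parameters when the parameters move a fixed distance from initialization. Everything specific here is an assumption made for illustration and is not taken from the paper: a two-layer network with scalar input, tanh activation, standard Gaussian initialization, a 1/sqrt(m) output scaling, and the hypothetical helper names `network`, `gradient`, and `linearization_gap`.

```python
import math
import random


def network(w, v, x):
    """Two-layer net f = (1/sqrt(m)) * sum_i v_i * tanh(w_i * x).

    The 1/sqrt(m) scaling is an assumption chosen for this sketch.
    """
    m = len(w)
    return sum(vi * math.tanh(wi * x) for wi, vi in zip(w, v)) / math.sqrt(m)


def gradient(w, v, x):
    """Analytic gradient of `network` with respect to (w, v)."""
    s = math.sqrt(len(w))
    grad_w = [vi * x * (1.0 - math.tanh(wi * x) ** 2) / s for wi, vi in zip(w, v)]
    grad_v = [math.tanh(wi * x) / s for wi in w]
    return grad_w, grad_v


def linearization_gap(m, radius=1.0, trials=20, x=1.0, seed=0):
    """Mean |f(theta0 + delta) - f_lin(theta0 + delta)| over random
    perturbations delta with Euclidean norm `radius`."""
    rng = random.Random(seed)
    w0 = [rng.gauss(0.0, 1.0) for _ in range(m)]
    v0 = [rng.gauss(0.0, 1.0) for _ in range(m)]
    f0 = network(w0, v0, x)
    gw, gv = gradient(w0, v0, x)
    total = 0.0
    for _ in range(trials):
        # Draw a random direction and rescale it to the requested radius.
        dw = [rng.gauss(0.0, 1.0) for _ in range(m)]
        dv = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(d * d for d in dw + dv))
        dw = [radius * d / norm for d in dw]
        dv = [radius * d / norm for d in dv]
        w1 = [a + b for a, b in zip(w0, dw)]
        v1 = [a + b for a, b in zip(v0, dv)]
        # First-order Taylor expansion around the initial parameters.
        f_lin = f0 + sum(g * d for g, d in zip(gw, dw)) + sum(g * d for g, d in zip(gv, dv))
        total += abs(network(w1, v1, x) - f_lin)
    return total / trials


if __name__ == "__main__":
    for width in (16, 256, 4096):
        print(width, linearization_gap(width))
```

Under these assumptions the printed gap should shrink as the width grows, which is the qualitative behavior the findings describe. The sketch checks random perturbation directions only, so it is a sanity check and not a substitute for the paper's proof.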
Abstract
In this paper we show that feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo transition to linearity as their "width" approaches infinity. The width of these general networks is characterized by the minimum in-degree of their neurons, except for the input and first layers. Our results identify the mathematical structure underlying transition to linearity and generalize a number of recent works aimed at characterizing transition to linearity or constancy of the Neural Tangent Kernel for standard architectures.
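The width notion in the abstract can be made concrete with a small sketch. It assumes the DAG is given as a mapping from each non-input neuron to the list of its predecessors, and it reads "first layer" as the neurons whose incoming edges all come from input nodes; that reading, the function name `dag_width`, and the toy graph are illustrative assumptions, not the paper's definitions.

```python
def dag_width(predecessors, inputs):
    """Minimum in-degree over neurons outside the input and first layers.

    predecessors: dict mapping each non-input neuron to a list of the
                  neurons feeding into it.
    inputs:       iterable of input node names.
    Assumption: a neuron is in the "first layer" when every one of its
    predecessors is an input node.
    """
    inputs = set(inputs)
    first_layer = {
        node for node, preds in predecessors.items()
        if node not in inputs and set(preds) <= inputs
    }
    in_degrees = [
        len(preds) for node, preds in predecessors.items()
        if node not in inputs and node not in first_layer
    ]
    if not in_degrees:
        raise ValueError("no neurons beyond the input and first layers")
    return min(in_degrees)


# Toy DAG with a skip connection from h3 to the output neuron.
toy_graph = {
    "h1": ["x1", "x2"],
    "h2": ["x1", "x2"],
    "h3": ["x1", "x2"],
    "g1": ["h1", "h2", "h3"],
    "g2": ["h1", "h2"],
    "out": ["g1", "g2", "h3"],
}

print(dag_width(toy_graph, ["x1", "x2"]))  # 2
```

In the toy graph the first-layer neurons h1, h2, and h3 are excluded, so the width is set by g2, which has only two incoming edges.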
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Topological and Geometric Data Analysis
