On the Number of Linear Regions of Deep Neural Networks
Guido Montúfar, Razvan Pascanu, Kyunghyun Cho, Yoshua Bengio

TL;DR
This paper analyzes how depth increases the complexity of neural networks with piecewise linear activations: the number of linear regions such a network can represent can grow exponentially with the number of layers, which demonstrates an advantage of depth for function representation.
Contribution
The paper provides new theoretical bounds on the number of linear regions of functions computed by deep networks with piecewise linear activations, showing that this number can grow exponentially with depth.
Findings
The number of linear regions of the functions a deep network computes can grow exponentially with depth.
Theoretical bounds on the complexity of functions computed by deep piecewise linear networks are improved.
Higher-layer units exhibit increased complexity and richer behavior.
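The exponential growth in the first finding can be checked numerically on a toy case. The sketch below is an illustration, not the paper's construction: it assumes a one-dimensional input and a hypothetical "tent" layer built from two rectifier units, and the names `tent_layer` and `count_linear_pieces` are made up for this example. Stacking the layer and counting slope changes on a fine grid gives the number of linear pieces on [0, 1].

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def tent_layer(x):
    # Two rectifier units that fold [0, 1] onto itself at x = 0.5:
    # slope +2 on [0, 0.5], slope -2 on [0.5, 1].
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def count_linear_pieces(depth, grid_exponent=12):
    # Dyadic grid, so every breakpoint k / 2**depth is a grid point
    # (assumes depth <= grid_exponent).
    x = np.linspace(0.0, 1.0, 2**grid_exponent + 1)
    y = x.copy()
    for _ in range(depth):
        y = tent_layer(y)
    # Slope of the composed function on each grid interval.
    slopes = np.diff(y) / np.diff(x)
    # A new linear piece starts wherever the slope changes.
    changes = ~np.isclose(slopes[1:], slopes[:-1])
    return 1 + int(np.count_nonzero(changes))

for depth in range(1, 7):
    print(depth, count_linear_pieces(depth))
```

In this toy setting each extra layer adds only two units but doubles the number of linear pieces, whereas a single hidden layer of rectifiers on a one-dimensional input gains at most one piece per added unit.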
Abstract
We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We…
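The idea that a deep network maps portions of each layer's input space to the same output can be sketched with a small folding example. This is a hedged illustration under simplifying assumptions, not the authors' construction: each hypothetical `fold` layer computes a coordinate-wise absolute value using two rectifier units per coordinate, and the fold centers 0 and 1 are arbitrary choices.

```python
import itertools
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def fold(h, center):
    # |h - center| written with two rectifier units per coordinate.
    return relu(h - center) + relu(center - h)

def folded_features(x):
    h = fold(np.asarray(x, dtype=float), 0.0)  # layer 1: reflect across 0
    h = fold(h, 1.0)                           # layer 2: reflect across 1
    return h

# Each coordinate value below is identified with the others by the two folds.
coords = [-1.25, -0.75, 0.75, 1.25]
inputs = list(itertools.product(coords, repeat=2))
outputs = np.array([folded_features(p) for p in inputs])

print(len(inputs), "inputs map to", np.unique(outputs, axis=0))
```

Any function computed on top of `folded_features` reacts identically to all 16 of these inputs, so whatever it computes on one region is re-used on every region folded onto it. Each additional fold layer doubles the number of identified values per coordinate.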
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
Methods: Maxout
