On the Expected Complexity of Maxout Networks
Hanna Tseran, Guido Montúfar

TL;DR
This paper analyzes the complexity of maxout neural networks, measured by the number of activation regions. It shows that their practical complexity is often far below the theoretical maximum and varies widely across parameter space, and that suitable parameter initialization can speed up training convergence.
Contribution
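It extends the expected-complexity analysis known for deep ReLU networks to maxout networks and to decision boundaries in classification, provides nontrivial lower bounds on the expected complexity, and examines how parameter initialization affects the speed of training convergence.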
Findings
Maxout networks exhibit wide variability in complexity across parameter space.
Certain initializations can accelerate training convergence.
Expected complexity bounds are established for maxout networks.
Abstract
Learning with neural networks relies on the complexity of the representable functions, but more importantly, the particular assignment of typical parameters to functions of different complexity. Taking the number of activation regions as a complexity measure, recent works have shown that the practical complexity of deep ReLU networks is often far from the theoretical maximum. In this work, we show that this phenomenon also occurs in networks with maxout (multi-argument) activation functions and when considering the decision boundaries in classification tasks. We also show that the parameter space has a multitude of full-dimensional regions with widely different complexity, and obtain nontrivial lower bounds on the expected complexity. Finally, we investigate different parameter initialization procedures and show that they can increase the speed of convergence in training.
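To make the complexity measure concrete, here is a minimal NumPy sketch, not the authors' code, that counts the activation regions a single maxout layer cuts along a one-dimensional line through input space. The function names (`maxout_layer`, `count_regions_on_line`), the standard normal initialization, and the restriction to one layer and one line are all assumptions chosen for illustration; the paper itself studies regions of deep networks over the full input space.

```python
import numpy as np


def maxout_layer(x, W, b):
    """Apply a maxout layer (hypothetical helper, for illustration only).

    x: (n, d_in) inputs
    W: (units, K, d_in) weights, K affine pieces per unit
    b: (units, K) biases
    Returns the unit outputs (n, units) and the index of the
    maximizing piece per unit (n, units), i.e. the activation pattern.
    """
    pre = np.einsum("ukd,nd->nuk", W, x) + b  # pre-activations, (n, units, K)
    return pre.max(axis=2), pre.argmax(axis=2)


def count_regions_on_line(W, b, start, end, num_points=10_000):
    """Estimate how many activation regions the segment start->end crosses.

    Samples the segment and counts how often the activation pattern
    changes between consecutive samples. Regions narrower than the
    sampling step can be missed, so this is a lower estimate.
    """
    t = np.linspace(0.0, 1.0, num_points)[:, None]
    x = (1.0 - t) * start + t * end
    _, pattern = maxout_layer(x, W, b)
    changes = np.any(pattern[1:] != pattern[:-1], axis=1).sum()
    return int(changes) + 1


# Assumed setup: 5 maxout units of rank K=3 on 2-D inputs, standard normal init.
rng = np.random.default_rng(0)
units, K, d_in = 5, 3, 2
W = rng.normal(size=(units, K, d_in))
b = rng.normal(size=(units, K))
n_regions = count_regions_on_line(W, b, np.array([-3.0, -3.0]), np.array([3.0, 3.0]))
print(n_regions)
```

Along a line, each rank-K unit can switch its maximizing piece at most K - 1 times, so a single layer yields at most units * (K - 1) + 1 regions on that line. Running the sketch with different seeds or weight scales gives a feel for how often random parameters fall short of that maximum.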
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms
Methods: Maxout
