A Practical Approach to Sizing Neural Networks
Gerald Friedland, Alfredo Metere, Mario Krell

TL;DR
This paper presents practical rules and a heuristic for estimating how large a neural network needs to be for a given training dataset, with the aim of reducing training cost and improving generalization.
Contribution
It introduces four analytical rules for determining the capacity of a neural network architecture and an experimentally validated heuristic for estimating the capacity that a given dataset and labeling require.
Findings
Four capacity estimation rules for neural networks
A heuristic method for dataset-specific network sizing
Discussion of the consequences of incorrect network sizing: increased training computation and reduced generalization
Abstract
Memorization is worst-case generalization. Based on MacKay's information theoretic model of supervised machine learning, this article discusses how to practically estimate the maximum size of a neural network given a training data set. First, we present four easily applicable rules to analytically determine the capacity of neural network architectures. This allows the comparison of the efficiency of different network architectures independently of a task. Second, we introduce and experimentally validate a heuristic method to estimate the neural network capacity requirement for a given dataset and labeling. This allows an estimate of the required size of a neural network for a given problem. We conclude the article with a discussion on the consequences of sizing the network wrongly, which includes both increased computation effort for training as well as reduced generalization capability.
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification
