A Practical Approach to Sizing Neural Networks

Gerald Friedland; Alfredo Metere; Mario Krell

arXiv:1810.02328·cs.NE·October 5, 2018

TL;DR

This paper presents practical rules and a heuristic for estimating how large a neural network needs to be for a given training data set. Sizing the network correctly avoids both wasted training computation and reduced generalization.

Contribution

It introduces four analytical rules for determining the capacity of a neural network architecture, and an experimentally validated heuristic for estimating the capacity that a given dataset and labeling requires.

Findings

01 Four capacity estimation rules for neural networks

02 A heuristic method for dataset-specific network sizing

03 Discussion on consequences of incorrect network sizing

Abstract

Memorization is worst-case generalization. Based on MacKay's information-theoretic model of supervised machine learning, this article discusses how to practically estimate the maximum size of a neural network given a training data set. First, we present four easily applicable rules to analytically determine the capacity of neural network architectures. This allows the comparison of the efficiency of different network architectures independently of a task. Second, we introduce and experimentally validate a heuristic method to estimate the neural network capacity requirement for a given dataset and labeling. This allows an estimate of the required size of a neural network for a given problem. We conclude the article with a discussion of the consequences of sizing the network incorrectly, which include both increased computational effort for training and reduced generalization capability.
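The idea that memorization bounds the useful network size can be illustrated with a rough back-of-the-envelope check. The sketch below is not the paper's four rules or its heuristic, which the abstract does not spell out. It assumes, purely for illustration, that a fully connected network stores a fixed number of bits per trainable parameter (`bits_per_parameter`, a hypothetical knob defaulting to 1) and that a lookup table of all labels is the worst-case memorization cost. All function names are hypothetical.

```python
from math import ceil, log2


def dense_parameter_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the input width followed by the width of each layer,
    e.g. [2, 3, 1] is 2 inputs, 3 hidden units, 1 output unit.
    """
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]))


def memorization_bits(num_samples, num_classes):
    """Bits needed to store every training label in a plain lookup table."""
    return num_samples * ceil(log2(num_classes))


def is_oversized(layer_sizes, num_samples, num_classes, bits_per_parameter=1.0):
    """True if the assumed capacity exceeds the cost of memorizing all labels.

    bits_per_parameter is an illustrative assumption, not a value taken
    from the paper.
    """
    capacity = bits_per_parameter * dense_parameter_count(layer_sizes)
    return capacity > memorization_bits(num_samples, num_classes)


if __name__ == "__main__":
    # 13 parameters against 1000 binary labels: under the stated assumption,
    # this network cannot simply memorize the training set.
    print(is_oversized([2, 3, 1], num_samples=1000, num_classes=2))
```

Under these assumptions, a network flagged as oversized has enough capacity to memorize the labels outright, which is the regime the abstract associates with wasted computation and reduced generalization.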

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification