Limiting Network Size within Finite Bounds for Optimization
Linu Pinto, Sasi Gopalan

TL;DR
This paper provides a theoretical framework for bounding the hidden layer width of shallow neural networks for binary classification, so that the task is solved with only as many parameters as it needs. The bounds are supported by experimental validation.
Contribution
It introduces a theoretical justification for the necessary size of hidden layers in shallow networks for binary classification, establishing bounds on network width.
Findings
Bounded hidden layer size leads to efficient training.
Theoretical bounds are validated experimentally on multiple datasets.
Optimal network size improves classification with minimal overfitting.
Abstract
The largest theoretical contribution to neural networks comes from the VC dimension, which characterizes the sample complexity of a classification model from a probabilistic viewpoint and is widely used to study generalization error. So far in the literature, the VC dimension has only been used to approximate generalization error bounds for different neural network architectures. It has not yet been used, implicitly or explicitly, to fix the network size, which matters because a wrong configuration can lead to high computational effort in training and to overfitting. There is therefore a need to bound the number of units so that the task can be computed with only a sufficient number of parameters. For binary classification tasks, shallow networks are used because they have the universal approximation property, and for such networks it is enough to size the hidden layer width. The paper brings out a theoretical…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Algorithms · Machine Learning and ELM
