Minimum number of neurons in fully connected layers of a given neural network (the first approximation)
Oleg I. Berngardt

TL;DR
This paper introduces an algorithm that estimates the minimum number of neurons needed in the fully connected layers of a neural network without retraining the network at different widths, using a combination of cross-validation and singular value decomposition.
Contribution
It presents a first-approximation method for determining the minimum number of neurons in a layer from the network architecture, the dataset, and the quality metric, without repeated retraining.
Findings
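The minimum number of neurons is an internal (latent) property of the solution rather than a hyperparameter; it is determined by the network architecture, the training dataset, the layer position, and the quality metric.
The algorithm can estimate the minimum number of neurons independently for each layer.
The algorithm was tested on classification and regression datasets.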
Abstract
This paper presents an algorithm for finding the minimum number of neurons in the fully connected layers of an arbitrary network solving a given problem, without requiring the network to be trained multiple times with different numbers of neurons. The algorithm is based on training the initial wide network using cross-validation over at least two folds. Then, by inserting a truncated singular value decomposition autoencoder after the studied layer of the trained network, we search for the minimum number of neurons with the network in inference-only mode. It is shown that the minimum number of neurons in a fully connected layer can be interpreted not as a network hyperparameter tied to the other hyperparameters of the network, but as an internal (latent) property of the solution, determined by the network architecture, the training dataset, layer position, and the quality metric…
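The idea of a truncated SVD autoencoder inserted after a layer can be illustrated with a small NumPy sketch. This is not the author's implementation: the function names (`estimate_min_neurons`, `hidden`, `head`), the tolerance `tol`, the stopping rule, and the toy linear network are all assumptions made for illustration. The sketch fits the SVD on activations from one fold, then, in inference only, finds the smallest rank `k` at which the metric on a second fold stays within `tol` of the baseline.

```python
import numpy as np

def estimate_min_neurons(hidden, head, metric, x_fit, x_eval, y_eval, tol):
    """Smallest SVD rank k that keeps the metric within tol of the baseline.

    hidden: maps inputs to the activations of the studied layer (hypothetical name)
    head:   maps those activations to the network output (hypothetical name)
    metric: loss-like score, lower is better
    """
    acts_fit = hidden(x_fit)      # activations used to fit the SVD (fold A)
    acts_eval = hidden(x_eval)    # activations used to score it (fold B)
    baseline = metric(head(acts_eval), y_eval)

    # Truncated SVD autoencoder: center, project on top-k directions, reconstruct.
    mean = acts_fit.mean(axis=0)
    _, _, vt = np.linalg.svd(acts_fit - mean, full_matrices=False)

    width = acts_fit.shape[1]
    for k in range(1, width + 1):
        vk = vt[:k].T                                   # (width, k) basis
        recon = mean + (acts_eval - mean) @ vk @ vk.T   # rank-k bottleneck
        if metric(head(recon), y_eval) <= baseline + tol:
            return k
    return width

# Toy "trained" network (assumption): a linear hidden layer of width 32
# whose weights have rank 3, followed by a linear output head.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 3)) @ rng.normal(size=(3, 32))
w2 = rng.normal(size=(32, 1))
hidden = lambda x: x @ w1
head = lambda h: h @ w2
mse = lambda pred, y: float(np.mean((pred - y) ** 2))

x_a = rng.normal(size=(200, 8))   # fold A
x_b = rng.normal(size=(200, 8))   # fold B
y_b = head(hidden(x_b))           # targets the toy network fits exactly

k_min = estimate_min_neurons(hidden, head, mse, x_a, x_b, y_b, tol=1e-8)
print(k_min)
```

Because the toy hidden layer is built with rank 3, a bottleneck of three components reconstructs its activations up to floating-point error, while fewer components lose information. With a real nonlinear network the activations would not be exactly low-rank, so the result would depend on the chosen tolerance and metric.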
Taxonomy
Topics: Neural Networks and Applications · Statistical and Computational Modeling · Machine Learning and ELM
