Guiding Neural Network Initialization via Marginal Likelihood Maximization
Anthony S. Tai, Chunfeng Huang

TL;DR
This paper introduces a data-driven method for selecting neural network hyperparameters at initialization by maximizing marginal likelihood, leading to improved performance and potential computational savings.
Contribution
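It presents an approach that uses the correspondence between neural networks and Gaussian processes to guide hyperparameter selection at initialization, and demonstrates it on MNIST classification, with results suggesting the procedure could run on smaller training sets at lower computational cost.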
Findings
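Hyperparameters recommended by marginal likelihood maximization yield near-optimal prediction performance on the MNIST classification task, under the experiment's constraints.
The recommendations are empirically consistent.
This consistency suggests the procedure's computational cost could be significantly reduced by running it on smaller training sets.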
Abstract
We propose a simple, data-driven approach to help guide hyperparameter selection for neural network initialization. We leverage the relationship between neural network and Gaussian process models that have corresponding activation and covariance functions to infer the hyperparameter values desirable for model initialization. Our experiment shows that marginal likelihood maximization provides recommendations that yield near-optimal prediction performance on the MNIST classification task under the experiment's constraints. Furthermore, our empirical results indicate consistency in the proposed technique, suggesting that the computational cost of the procedure could be significantly reduced by using smaller training sets.
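The idea in the abstract can be illustrated with a small sketch. The abstract does not specify the activation, the kernel, the hyperparameters, or the search procedure, so everything below is an assumption made for illustration: a ReLU network whose corresponding Gaussian process covariance is the arc-cosine kernel recursion, two hyperparameters (weight variance `sigma_w2` and bias variance `sigma_b2`), labels treated as regression targets with a fixed noise term, and a plain grid search. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def relu_nngp_kernel(X, sigma_w2, sigma_b2, depth):
    """GP covariance matching a ReLU network with `depth` hidden layers
    (arc-cosine recursion; an assumed choice, not confirmed by the source)."""
    d = X.shape[1]
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d  # input-layer covariance
    for _ in range(depth):
        std = np.sqrt(np.diag(K))
        norm = np.maximum(np.outer(std, std), 1e-12)  # guard against division by zero
        cos_t = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_t)
        # E[relu(u) relu(v)] for jointly Gaussian (u, v) with covariance K
        K = sigma_b2 + sigma_w2 / (2 * np.pi) * norm * (
            np.sin(theta) + (np.pi - theta) * cos_t
        )
    return K

def log_marginal_likelihood(K, Y, noise=1e-2):
    """Gaussian log marginal likelihood, summed over independent output columns."""
    n, c = Y.shape
    L = np.linalg.cholesky(K + noise * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # (K + noise I)^{-1} Y
    data_fit = -0.5 * np.sum(Y * alpha)
    complexity = -c * np.sum(np.log(np.diag(L)))  # -(c/2) log det
    const = -0.5 * n * c * np.log(2 * np.pi)
    return data_fit + complexity + const

def select_hyperparameters(X, Y, w_grid, b_grid, depth=1, noise=1e-2):
    """Grid search: return (best log marginal likelihood, sigma_w2, sigma_b2)."""
    best = None
    for sw in w_grid:
        for sb in b_grid:
            K = relu_nngp_kernel(X, sw, sb, depth)
            lml = log_marginal_likelihood(K, Y, noise)
            if best is None or lml > best[0]:
                best = (lml, sw, sb)
    return best

if __name__ == "__main__":
    # Synthetic stand-in for a small training subset (not MNIST).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 8))
    labels = rng.integers(0, 3, size=50)
    Y = np.eye(3)[labels] - 1.0 / 3.0  # centered one-hot targets
    lml, sw, sb = select_hyperparameters(
        X, Y, w_grid=[0.5, 1.0, 2.0, 4.0], b_grid=[0.0, 0.1, 0.5]
    )
    print(f"selected sigma_w2={sw}, sigma_b2={sb}, log marginal likelihood={lml:.2f}")
```

Under these assumptions, the selected pair would be used as the weight and bias variances when initializing the network. The Cholesky factorization scales cubically with the number of training points, which is why running the selection on a smaller subset, as the abstract suggests, would lower the cost.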
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications · Model Reduction and Neural Networks
Methods: Gaussian Process
