An effective algorithm for hyperparameter optimization of neural networks

Gonzalo Diaz; Achille Fokoue; Giacomo Nannicini; Horst Samulowitz

arXiv:1705.08520·cs.AI·May 25, 2017

TL;DR

This paper presents an automatic, derivative-free optimization algorithm that searches for good neural network hyperparameters by modeling the objective function (the network's prediction accuracy) with radial basis functions, so that fewer configurations have to be trained and evaluated.

Contribution

The paper formulates hyperparameter selection as a box-constrained optimization problem and solves it with a derivative-free method that uses a radial basis function model of prediction accuracy to guide the search.

Findings

01. Finds high-accuracy hyperparameter configurations

02. Reduces tuning time by evaluating fewer candidate configurations

03. Shows promising results on benchmark and drug interaction datasets

Abstract

A major challenge in designing neural network (NN) systems is to determine the best structure and parameters for the network given the data for the machine learning problem at hand. Examples of parameters are the number of layers and nodes, the learning rates, and the dropout rates. Typically, these parameters are chosen based on heuristic rules and manually fine-tuned, which may be very time-consuming, because evaluating the performance of a single parametrization of the NN may require several hours. This paper addresses the problem of choosing appropriate parameters for the NN by formulating it as a box-constrained mathematical optimization problem, and applying a derivative-free optimization tool that automatically and effectively searches the parameter space. The optimization tool employs a radial basis function model of the objective function (the prediction accuracy of the NN) to…
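The idea described in the abstract, a surrogate-guided search over a box of hyperparameters, can be illustrated with a minimal sketch. This is not the authors' tool or algorithm. It assumes a Gaussian radial basis function, a simple distance-based exploration bonus, random candidate sampling, and a hypothetical `toy_validation_error` function standing in for the expensive step of training a network. All function names, the two hyperparameters, and their bounds are illustrative assumptions. The sketch minimizes error, which is equivalent to maximizing accuracy.

```python
import numpy as np

def toy_validation_error(x):
    """Hypothetical stand-in for training a network and measuring its error.
    x = [log10(learning rate), dropout rate]."""
    log_lr, dropout = x
    return (log_lr + 3.0) ** 2 + 4.0 * (dropout - 0.3) ** 2

def rbf_kernel(A, B, width=0.3):
    """Gaussian radial basis function between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / width ** 2)

def fit_surrogate(U, y, ridge=1e-6):
    """Fit RBF weights to the evaluated points; the small ridge term keeps
    the linear system well conditioned."""
    K = rbf_kernel(U, U) + ridge * np.eye(len(U))
    mean = y.mean()
    weights = np.linalg.solve(K, y - mean)
    return lambda V: rbf_kernel(V, U) @ weights + mean

def rbf_search(objective, lower, upper, n_init=6, n_iter=20, n_cand=500, seed=0):
    """Surrogate-guided search inside the box [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    # Initial design: random points in the unit cube, mapped to the box.
    U = rng.random((n_init, dim))
    y = np.array([objective(lower + u * (upper - lower)) for u in U])
    for _ in range(n_iter):
        model = fit_surrogate(U, y)
        cand = rng.random((n_cand, dim))
        pred = model(cand)
        # Distance from each candidate to its nearest evaluated point.
        diff = cand[:, None, :] - U[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
        # Prefer low predicted error, with a small bonus for unexplored regions.
        score = pred - 0.1 * dist
        u_next = cand[np.argmin(score)]
        # Only this one candidate is passed to the expensive objective.
        U = np.vstack([U, u_next])
        y = np.append(y, objective(lower + u_next * (upper - lower)))
    X = lower + U * (upper - lower)
    best = np.argmin(y)
    return X[best], y[best], X, y

# Example: search over log10(learning rate) in [-5, -1] and dropout in [0, 0.9].
best_x, best_y, X, y = rbf_search(toy_validation_error, [-5.0, 0.0], [-1.0, 0.9])
print("best configuration:", best_x, "error:", best_y)
```

Each iteration scores 500 candidates with the cheap surrogate but spends only one evaluation of the expensive objective, which is where the saving comes from when a single evaluation takes hours of training.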

Taxonomy

Methods: Dropout