Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian Optimization and Tuning Rules

Michele Fraccaroli; Evelina Lamma; Fabrizio Riguzzi

arXiv:2006.02105·cs.LG·June 4, 2020

TL;DR

This paper proposes a novel hyper-parameter tuning method for deep neural networks that combines Bayesian Optimization with tuning rules to improve efficiency and effectiveness in hyper-parameter selection.

Contribution

It introduces a new algorithm that evaluates network results and applies tuning rules to enhance Bayesian Optimization for DNN hyper-parameter tuning.
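As a rough illustration of how a tuning rule could feed back into the search, the sketch below narrows one hyper-parameter range after inspecting training results. The rule itself (a train/validation accuracy gap triggering a higher dropout range), the function name `apply_overfitting_rule`, the threshold, and the search-space layout are all assumptions made for this example; the summary does not specify the paper's actual rules.

```python
def apply_overfitting_rule(space, train_acc, val_acc, gap_threshold=0.1):
    """Return a copy of `space`, narrowing the dropout range if overfitting is suspected.

    Hypothetical rule: a large gap between training and validation accuracy
    suggests overfitting, so only the upper half of the dropout range is kept.
    """
    new_space = dict(space)
    if train_acc - val_acc > gap_threshold:
        low, high = space["dropout"]
        midpoint = (low + high) / 2.0
        new_space["dropout"] = (midpoint, high)
    return new_space


# Hypothetical search space: each hyper-parameter maps to a (low, high) range.
space = {"learning_rate": (1e-4, 1e-1), "dropout": (0.0, 0.8)}

# A network that scores far better on training data than on validation data.
narrowed = apply_overfitting_rule(space, train_acc=0.98, val_acc=0.80)

# A network with no noticeable gap leaves the space untouched.
unchanged = apply_overfitting_rule(space, train_acc=0.90, val_acc=0.88)
```

The optimizer would then draw its next candidates from the narrowed space, which is one plausible way a rule could shrink the region Bayesian Optimization has to explore.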

Findings

01. Improved hyper-parameter tuning efficiency.

02. Enhanced accuracy of DNN models.

03. Reduced search space for hyper-parameters.

Abstract

Deep learning techniques play an increasingly important role in industrial and research environments due to their outstanding results. However, the large number of hyper-parameters to be set may lead to errors if they are set manually. The state-of-the-art hyper-parameter tuning methods are grid search, random search, and Bayesian Optimization. The first two methods are expensive because they try, respectively, all possible combinations and random combinations of hyper-parameters. Bayesian Optimization, instead, builds a surrogate model of the objective function, quantifies the uncertainty in the surrogate using Gaussian Process Regression, and uses an acquisition function to decide where to sample the next set of hyper-parameters. This work addresses the field of Hyper-Parameter Optimization (HPO). The aim is to improve Bayesian Optimization applied to Deep Neural Networks. For this goal,…
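The Bayesian Optimization loop described in the abstract (surrogate model, Gaussian Process uncertainty, acquisition function) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the one-dimensional `toy_objective` stands in for a real validation loss, and the RBF kernel, its length scale, the Expected Improvement acquisition, and the candidate grid are all assumptions chosen for brevity.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a GP surrogate at x_query."""
    y_mean = y_train.mean()                      # center targets for a zero-mean GP
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(K, K_s)             # K^{-1} K_s
    mean = solved.T @ (y_train - y_mean) + y_mean
    var = 1.0 - np.sum(K_s * solved, axis=0)     # prior variance of the RBF kernel is 1
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best):
    """Expected Improvement for minimization relative to the best value so far."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf


def toy_objective(x):
    """Hypothetical stand-in for validation loss over one hyper-parameter in [0, 1]."""
    return (x - 0.3) ** 2


def bayesian_optimization(objective, n_init=3, n_iter=10, seed=0):
    """Minimize `objective` on [0, 1]; returns the best point and value observed."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)
    x = rng.uniform(0.0, 1.0, size=n_init)       # random initial evaluations
    y = np.array([objective(v) for v in x])
    for _ in range(n_iter):
        mean, std = gp_posterior(x, y, candidates)
        ei = expected_improvement(mean, std, y.min())
        x_next = candidates[np.argmax(ei)]       # sample where EI is highest
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    best = np.argmin(y)
    return x[best], y[best]


best_x, best_y = bayesian_optimization(toy_objective)
```

In a real setting each call to the objective would train and validate a network, which is why spending a little computation on the surrogate to choose the next point can pay off compared with grid or random search.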

Taxonomy

Methods: Gaussian Process