CMA-ES for Hyperparameter Optimization of Deep Neural Networks

Ilya Loshchilov; Frank Hutter

arXiv:1604.07269 · cs.NE · April 26, 2016 · 240 cites

Open Access

TL;DR

This paper explores using CMA-ES, a derivative-free optimization method, for tuning deep neural network hyperparameters, demonstrating its advantages over Bayesian optimization in parallel settings.

Contribution

It proposes CMA-ES as an alternative hyperparameter optimization method for deep neural networks and compares its performance with that of Bayesian optimization.

Findings

01 CMA-ES performs competitively with Bayesian optimization.

02 CMA-ES is effective in parallel hyperparameter tuning.

03 The method shows promise for large-scale neural network optimization.

Abstract

Hyperparameters of deep neural networks are often optimized by grid search, random search or Bayesian optimization. As an alternative, we propose to use the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which is known for its state-of-the-art performance in derivative-free optimization. CMA-ES has some useful invariance properties and is friendly to parallel evaluations of solutions. We provide a toy example comparing CMA-ES and state-of-the-art Bayesian optimization algorithms for tuning the hyperparameters of a convolutional neural network for the MNIST dataset on 30 GPUs in parallel.
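
The point about parallel evaluations can be illustrated with a small sketch. The code below is not the authors' implementation and is not full CMA-ES: it is a simplified isotropic evolution strategy whose covariance stays fixed to the identity and whose step size follows a fixed geometric decay, both assumptions made to keep the example short. It only shows the sample, evaluate in parallel, rank, and recombine loop that population-based methods such as CMA-ES share. The names `toy_validation_error` and `simple_es` are hypothetical, the quadratic objective stands in for training a network, and a thread pool stands in for the GPUs.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def toy_validation_error(x):
    """Hypothetical stand-in for 'train a network, return validation error'.

    x = [log10(learning_rate), dropout_rate]; the minimum is at (-3, 0.5).
    """
    return (x[0] + 3.0) ** 2 + 10.0 * (x[1] - 0.5) ** 2


def simple_es(objective, x0, sigma0, popsize=30, generations=40,
              workers=8, decay=0.9, seed=0):
    """Isotropic (mu/mu_w, lambda) evolution strategy.

    Assumption: NOT full CMA-ES. The covariance is fixed to sigma^2 * I and
    sigma shrinks by a constant factor instead of being adapted.
    """
    rng = random.Random(seed)
    dim = len(x0)
    mu = popsize // 2
    # Log-decreasing recombination weights for the best mu candidates.
    raw = [math.log(mu + 0.5) - math.log(i + 1) for i in range(mu)]
    total = sum(raw)
    weights = [w / total for w in raw]

    mean, sigma = list(x0), sigma0
    best_x, best_f = list(x0), objective(x0)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            # Sample the whole population around the current mean.
            candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                          for _ in range(popsize)]
            # Evaluate all candidates concurrently; map() preserves order.
            fitnesses = list(pool.map(objective, candidates))
            # Rank by fitness (lower is better).
            ranked = sorted(zip(fitnesses, candidates), key=lambda p: p[0])
            if ranked[0][0] < best_f:
                best_f, best_x = ranked[0][0], list(ranked[0][1])
            # Move the mean to the weighted average of the best mu candidates.
            mean = [sum(w * cand[d] for w, (_, cand) in zip(weights, ranked))
                    for d in range(dim)]
            # Crude step-size schedule (assumption, replaces real adaptation).
            sigma *= decay
    return best_x, best_f


if __name__ == "__main__":
    x, f = simple_es(toy_validation_error, [-1.0, 0.2], 1.0)
    print("best hyperparameters:", x, "error:", f)
```

Because every candidate in a generation is sampled before any is evaluated, the evaluations are independent and can be dispatched to as many workers as the population size allows. In a real setting each call to the objective would be a full training run.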

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Machine Learning and Algorithms