Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks

Xiaodong Cui; Wei Zhang; Zoltán Tüske; Michael Picheny

arXiv:1810.06773·cs.LG·October 17, 2018·37 cites

Open Access · 1 Repo

TL;DR

This paper introduces ESGD, a population-based optimization framework that alternates SGD steps with gradient-free evolutionary steps to train deep neural networks. The authors report that it is effective across multiple applications, including speech recognition.

Contribution

The paper presents ESGD, a novel population-based framework that treats SGD and gradient-free evolutionary algorithms as complementary, alternating between an SGD step and an evolution step to improve the average fitness of the population.

Findings

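01 ESGD outperforms standard SGD on multiple tasks.

02 The best fitness in the population never degrades during training, thanks to a back-off strategy in the SGD step and an elitist strategy in the evolution step.

03 Training individuals with different SGD-based optimizers and distinct hyper-parameters, treated as competing species in a coevolution setting, enhances population diversity and performance.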

Abstract

We propose a population-based Evolutionary Stochastic Gradient Descent (ESGD) framework for optimizing deep neural networks. ESGD combines SGD and gradient-free evolutionary algorithms as complementary algorithms in one framework in which the optimization alternates between the SGD step and evolution step to improve the average fitness of the population. With a back-off strategy in the SGD step and an elitist strategy in the evolution step, it guarantees that the best fitness in the population will never degrade. In addition, individuals in the population optimized with various SGD-based optimizers using distinct hyper-parameters in the SGD step are considered as competing species in a coevolution setting such that the complementarity of the optimizers is also taken into account. The effectiveness of ESGD is demonstrated across multiple applications including speech recognition, image…
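
The alternation described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy quadratic objective, the Gaussian gradient noise standing in for minibatching, the averaging recombination, and the hypothetical names `loss`, `noisy_grad`, `sgd_step`, `evolution_step`, and `esgd`. Each individual carries its own learning rate to mimic distinct optimizer hyper-parameters, the SGD step backs off to the parent when the candidate is worse, and the evolution step carries the elites over unchanged.

```python
import random

def loss(w):
    # Toy objective (assumption): quadratic with its minimum at (3, -2).
    return (w[0] - 3.0) ** 2 + (w[1] + 2.0) ** 2

def noisy_grad(w, rng, noise=0.5):
    # Exact gradient plus Gaussian noise, a stand-in for minibatch noise.
    return [2.0 * (w[0] - 3.0) + rng.gauss(0, noise),
            2.0 * (w[1] + 2.0) + rng.gauss(0, noise)]

def sgd_step(w, lr, steps, rng):
    # Run a few SGD updates, then back off to the parent if fitness got worse.
    cand = list(w)
    for _ in range(steps):
        g = noisy_grad(cand, rng)
        cand = [c - lr * gi for c, gi in zip(cand, g)]
    return cand if loss(cand) <= loss(w) else list(w)

def evolution_step(pop, n_elite, sigma, rng):
    # Elitist selection: the n_elite fittest individuals survive unchanged.
    ranked = sorted(pop, key=lambda ind: loss(ind[0]))
    elites = ranked[:n_elite]
    parents = ranked[:max(2, len(pop) // 2)]
    offspring = []
    while len(elites) + len(offspring) < len(pop):
        (wa, lra), (wb, _) = rng.sample(parents, 2)
        # Recombine by averaging, then mutate with Gaussian noise (assumption).
        child = [(a + b) / 2 + rng.gauss(0, sigma) for a, b in zip(wa, wb)]
        offspring.append((child, lra))
    return elites + offspring

def esgd(generations=10, pop_size=6, seed=0):
    # Requires pop_size >= 2 so that two parents can be sampled.
    rng = random.Random(seed)
    lrs = [0.01, 0.05, 0.1]  # distinct hyper-parameters per individual
    pop = [([rng.uniform(-5, 5), rng.uniform(-5, 5)], lrs[i % len(lrs)])
           for i in range(pop_size)]
    best_history = [min(loss(w) for w, _ in pop)]
    for _ in range(generations):
        pop = [(sgd_step(w, lr, 5, rng), lr) for w, lr in pop]       # SGD step
        pop = evolution_step(pop, n_elite=2, sigma=0.1, rng=rng)     # evolution step
        best_history.append(min(loss(w) for w, _ in pop))
    return best_history

if __name__ == "__main__":
    for gen, best in enumerate(esgd()):
        print(gen, best)
```

In this sketch the best loss per generation cannot increase: the back-off keeps every individual at least as fit as its parent, and the fittest individual is always among the elites that pass through the evolution step untouched.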

Code & Models

Repositories

tqch/esgd-ws · pytorch

Taxonomy

Topics: Metaheuristic Optimization Algorithms Research · Evolutionary Algorithms and Applications · Neural Networks and Applications

Methods: Stochastic Gradient Descent