Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers

Robin M. Schmidt; Frank Schneider; Philipp Hennig

arXiv:2007.01547·cs.LG·August 12, 2021·34 cites

TL;DR

This paper benchmarks fifteen popular deep learning optimizers across multiple tasks. It finds that optimizer performance varies greatly across tasks, and that trying several optimizers with default parameters works approximately as well as tuning a single optimizer.

Contribution

It offers an extensive, standardized benchmark of popular optimizers, analyzing over 50,000 runs to provide evidence-backed heuristics and identify generally effective optimization strategies.

Findings

01

Optimizer performance varies greatly across tasks.

02

Evaluating multiple optimizers with default parameters works approximately as well as tuning a single optimizer.

03

Adam remains a consistently strong optimizer, with newer methods not outperforming it significantly.
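
The comparison behind finding 02 can be illustrated with a minimal sketch. It is not the paper's benchmark code: the toy quadratic loss, the three hand-written optimizers, the "default" hyperparameters (chosen to loosely resemble common library defaults), and the learning-rate grid are all assumptions made for illustration. Both strategies get the same budget of three runs, and the numbers this toy produces say nothing about the paper's results.

```python
import math

def loss_and_grad(w):
    # Hypothetical toy problem: ill-conditioned quadratic 0.5 * (w0^2 + 25 * w1^2).
    scales = (1.0, 25.0)
    loss = 0.5 * sum(s * x * x for s, x in zip(scales, w))
    grad = [s * x for s, x in zip(scales, w)]
    return loss, grad

def run_sgd(lr=0.01, steps=200):
    w = [1.0, 1.0]
    for _ in range(steps):
        _, g = loss_and_grad(w)
        w = [x - lr * gi for x, gi in zip(w, g)]
    return loss_and_grad(w)[0]

def run_momentum(lr=0.01, beta=0.9, steps=200):
    w, v = [1.0, 1.0], [0.0, 0.0]
    for _ in range(steps):
        _, g = loss_and_grad(w)
        v = [beta * vi + gi for vi, gi in zip(v, g)]  # velocity buffer
        w = [x - lr * vi for x, vi in zip(w, v)]
    return loss_and_grad(w)[0]

def run_adam(lr=0.001, b1=0.9, b2=0.999, eps=1e-8, steps=200):
    w, m, v = [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        _, g = loss_and_grad(w)
        m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
        v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]
        m_hat = [mi / (1 - b1 ** t) for mi in m]  # bias correction
        v_hat = [vi / (1 - b2 ** t) for vi in v]
        w = [x - lr * mh / (math.sqrt(vh) + eps)
             for x, mh, vh in zip(w, m_hat, v_hat)]
    return loss_and_grad(w)[0]

# Strategy A: one run per optimizer, each with its (assumed) defaults.
default_losses = {
    "sgd": run_sgd(),
    "momentum": run_momentum(),
    "adam": run_adam(),
}
best_default = min(default_losses, key=default_losses.get)

# Strategy B: same budget of three runs, spent tuning one optimizer's learning rate.
lr_grid = [1e-3, 1e-2, 1e-1]
tuned_losses = {lr: run_adam(lr=lr) for lr in lr_grid}
best_lr = min(tuned_losses, key=tuned_losses.get)

print("best default optimizer:", best_default, default_losses[best_default])
print("best tuned Adam lr:", best_lr, tuned_losses[best_lr])
```

In a real study the final loss would be replaced by held-out performance on each task, and each configuration would be repeated over several random seeds before the two strategies are compared.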

Abstract

Choosing the optimizer is considered to be among the most crucial design decisions in deep learning, and it is not an easy one. The growing literature now lists hundreds of optimization methods. In the absence of clear theoretical guidance and conclusive empirical evidence, the decision is often made based on anecdotes. In this work, we aim to replace these anecdotes, if not with a conclusive ranking, then at least with evidence-backed heuristics. To do so, we perform an extensive, standardized benchmark of fifteen particularly popular deep learning optimizers while giving a concise overview of the wide range of possible choices. Analyzing more than $50{,}000$ individual runs, we contribute the following three points: (i) Optimizer performance varies greatly across tasks. (ii) We observe that evaluating multiple optimizers with default parameters works approximately as well as tuning the…

Code & Models

Repositories

SirRob1997/Descending-through-a-Crowded-Valley---Results (Official)

Videos

Descending through a Crowded Valley -- Benchmarking Deep Learning Optimizers (Paper Explained) · youtube

ICML 2021: Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers · youtube

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers · slideslive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques

Methods: Adam