Does the Adam Optimizer Exacerbate Catastrophic Forgetting?

Dylan R. Ashley; Sina Ghiassian; Richard S. Sutton

arXiv:2102.07686 · cs.LG · June 10, 2021

TL;DR

This paper investigates how the choice of gradient-based optimizer, especially Adam versus vanilla SGD, affects catastrophic forgetting in neural networks. It finds that classical methods such as SGD sometimes forget less than modern ones, and that the conclusions of a study depend heavily on which metric is used to measure forgetting.

Contribution

The paper provides empirical evidence that optimizer choice significantly affects catastrophic forgetting, and it highlights the importance of using multiple metrics to assess forgetting accurately.

Findings

01 Classical SGD can cause less forgetting than Adam in some cases.

02 The choice of forgetting metrics dramatically influences study conclusions.

03 A comprehensive evaluation requires multiple, concurrent metrics.
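
To make the findings above concrete, here is a minimal toy sketch of how one might compare two optimizers on two forgetting measures at once. Everything in it is an assumption made for illustration and is not taken from the paper: the two single-example regression tasks, the hyperparameters, the choice to carry Adam's moment estimates across the task switch, and the two measures themselves ("forgetting" as the increase in task-A loss after training on task B, and "relearn_steps" as the number of updates needed to recover task A). The Adam update follows the standard published rule with bias correction.

```python
import math

# Two hypothetical regression "tasks" for a 2-weight linear model: (input, target).
TASK_A = ((1.0, 0.5), 1.0)
TASK_B = ((0.5, 1.0), -1.0)

def loss_and_grad(w, task):
    """Return 0.5 * (w.x - y)^2 and its gradient with respect to w."""
    x, y = task
    err = w[0] * x[0] + w[1] * x[1] - y
    return 0.5 * err * err, [err * x[0], err * x[1]]

def make_sgd(lr=0.1):
    """Vanilla SGD: w <- w - lr * g."""
    def step(w, g):
        return [wi - lr * gi for wi, gi in zip(w, g)]
    return step

def make_adam(lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam with bias-corrected first and second moment estimates."""
    m, v, t = [0.0, 0.0], [0.0, 0.0], 0
    def step(w, g):
        nonlocal t
        t += 1
        out = []
        for i, (wi, gi) in enumerate(zip(w, g)):
            m[i] = beta1 * m[i] + (1 - beta1) * gi
            v[i] = beta2 * v[i] + (1 - beta2) * gi * gi
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            out.append(wi - lr * m_hat / (math.sqrt(v_hat) + eps))
        return out
    return step

def train(w, task, step, n_steps):
    """Apply n_steps optimizer updates on a single task."""
    for _ in range(n_steps):
        _, g = loss_and_grad(w, task)
        w = step(w, g)
    return w

def steps_to_reach(w, task, step, threshold, max_steps):
    """Count updates until the loss on `task` drops below `threshold`."""
    for n in range(max_steps):
        loss, g = loss_and_grad(w, task)
        if loss < threshold:
            return n
        w = step(w, g)
    return max_steps

def measure(make_step, n_steps=500, threshold=1e-3):
    """Train on A, then B, and report two illustrative forgetting measures."""
    step = make_step()  # optimizer state persists across the task switch (assumption)
    w = train([0.0, 0.0], TASK_A, step, n_steps)
    loss_a_before, _ = loss_and_grad(w, TASK_A)
    w = train(w, TASK_B, step, n_steps)
    loss_a_after, _ = loss_and_grad(w, TASK_A)
    relearn = steps_to_reach(w, TASK_A, step, threshold, max_steps=10 * n_steps)
    return {"forgetting": loss_a_after - loss_a_before, "relearn_steps": relearn}

results = {"sgd": measure(make_sgd), "adam": measure(make_adam)}
for name, r in results.items():
    print(name, r)
```

Reporting both measures side by side is the point of the sketch: an optimizer can look better on one and worse on the other, so a single number may not settle the comparison. The outcome of this toy problem says nothing about the paper's testbeds.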

Abstract

Catastrophic forgetting remains a severe hindrance to the broad application of artificial neural networks (ANNs); however, it continues to be a poorly understood phenomenon. Despite the extensive amount of work on catastrophic forgetting, we argue that it is still unclear how exactly the phenomenon should be quantified, and, moreover, to what degree all of the choices we make when designing learning systems affect the amount of catastrophic forgetting. We use various testbeds from the reinforcement learning and supervised learning literature to (1) provide evidence that the choice of which modern gradient-based optimization algorithm is used to train an ANN has a significant impact on the amount of catastrophic forgetting and show that, surprisingly, in many instances classical algorithms such as vanilla SGD experience less catastrophic forgetting than the more modern algorithms such as…

Code & Models

Repositories

dylanashley/catastrophic-forgetting (tf, Official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Artificial Intelligence in Games

Methods: Adam · Stochastic Gradient Descent