An Empirical Study of Example Forgetting during Deep Neural Network Learning

Mariya Toneva; Alessandro Sordoni; Remi Tachet des Combes; Adam Trischler; Yoshua Bengio; Geoffrey J. Gordon

arXiv:1812.05159·cs.LG·November 18, 2019·114 cites

TL;DR

This paper investigates the phenomenon of example forgetting during neural network training on single tasks, revealing that some examples are repeatedly forgotten, others are never forgotten, and that training data can be reduced without loss of performance.

Contribution

It introduces the concept of forgetting events in neural networks, analyzes their dynamics across datasets and architectures, and shows data reduction is possible without sacrificing accuracy.

Findings

01

Certain examples are forgotten frequently, others not at all.

02

Which examples are forgettable or unforgettable generalizes across neural architectures.

03

A significant fraction of training examples can be omitted while maintaining generalization performance.
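
As a rough illustration of the third finding, the sketch below selects a training subset from per-example forgetting counts. It assumes that the examples forgotten least often are dropped first, which is one reading of this summary rather than a confirmed description of the authors' procedure. The function name `select_by_forgetting` and the parameter `keep_fraction` are hypothetical.

```python
import numpy as np

def select_by_forgetting(counts, keep_fraction):
    """Return sorted indices of the examples to keep.

    counts: per-example number of forgetting events (hypothetical input).
    keep_fraction: share of the data set to retain, between 0 and 1.
    Assumption: the most frequently forgotten examples are kept and the
    least forgotten ones are dropped first.
    """
    counts = np.asarray(counts)
    num_keep = int(round(keep_fraction * len(counts)))
    # Stable sort in descending order of forgetting count, so ties keep
    # their original data set order.
    order = np.argsort(-counts, kind="stable")
    return np.sort(order[:num_keep])

# Toy counts for six examples; keep the half that is forgotten most often.
counts = [0, 3, 0, 1, 5, 0]
kept = select_by_forgetting(counts, keep_fraction=0.5)
print(kept.tolist())
```

The returned indices could then be used to build a reduced training set, for example by indexing into the original arrays of inputs and labels.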

Abstract

Inspired by the phenomenon of catastrophic forgetting, we investigate the learning dynamics of neural networks as they train on single classification tasks. Our goal is to understand whether a related phenomenon occurs when data does not undergo a clear distributional shift. We define a 'forgetting event' to have occurred when an individual training example transitions from being classified correctly to incorrectly over the course of learning. Across several benchmark data sets, we find that: (i) certain examples are forgotten with high frequency, and some not at all; (ii) a data set's (un)forgettable examples generalize across neural architectures; and (iii) based on forgetting dynamics, a significant fraction of examples can be omitted from the training data set while still maintaining state-of-the-art generalization performance.
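
The definition of a forgetting event in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes that the correctness of each training example has been recorded at successive checks during training and stored as a boolean array, and the names `count_forgetting_events` and `history` are hypothetical.

```python
import numpy as np

def count_forgetting_events(correct):
    """Count forgetting events per training example.

    correct: boolean array of shape (num_checks, num_examples), where
    correct[t, i] is True if example i was classified correctly at check t
    (an assumed recording format).
    """
    correct = np.asarray(correct, dtype=bool)
    # A forgetting event is a transition from correct at check t
    # to incorrect at check t + 1.
    forgotten = correct[:-1] & ~correct[1:]
    return forgotten.sum(axis=0)

# Toy history: 5 checks, 3 examples.
history = np.array([
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
], dtype=bool)

counts = count_forgetting_events(history)
# Examples that were never classified correctly also have zero events,
# so they are flagged separately from examples that were learned and kept.
never_learned = ~history.any(axis=0)
print(counts.tolist(), never_learned.tolist())
```

In this toy history the first example is learned and forgotten twice, the second is never forgotten, and the third is never learned. A zero count alone does not distinguish the last two cases, which is why the sketch tracks `never_learned` separately.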

Taxonomy

Topics: Neural Networks and Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification