What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Vitaly Feldman, Chiyuan Zhang

TL;DR
This paper investigates why neural networks memorize training data labels, especially when the data distribution is long-tailed. It empirically tests the theory proposed by Feldman (2019) and shows that memorization benefits generalization.
Contribution
The authors design efficient influence estimation methods to empirically validate a theory linking memorization to long-tailed data distributions in neural networks.
Findings
Memorization of training examples significantly improves generalization on the benchmarks studied.
Empirical evidence supports the theory that long-tailed data necessitates memorization.
Efficient influence estimation methods enable analysis of training example impacts.
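To make the idea of estimating memorization by subsampling concrete, here is a minimal sketch. It assumes the memorization score of a training example is the gap between the probability that a model predicts its label correctly when the example is in the training set and when it is left out, and it approximates both probabilities by training on many random subsets. The 1-nearest-neighbor learner, the synthetic two-cluster data, and the names `one_nn_predict` and `estimate_memorization` are illustrative assumptions, not the authors' code or experimental setup.

```python
import numpy as np

def one_nn_predict(train_x, train_y, query_x):
    # Toy stand-in learner (assumption): label each query by its nearest training point.
    d = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    return train_y[d.argmin(axis=1)]

def estimate_memorization(x, y, n_trials=200, subset_frac=0.7, seed=0):
    # For each example, compare accuracy on it across models whose random
    # training subset included it versus models whose subset excluded it.
    rng = np.random.default_rng(seed)
    n = len(x)
    m = int(subset_frac * n)
    correct_in, count_in = np.zeros(n), np.zeros(n)
    correct_out, count_out = np.zeros(n), np.zeros(n)
    for _ in range(n_trials):
        mask = np.zeros(n, dtype=bool)
        mask[rng.choice(n, size=m, replace=False)] = True
        hit = (one_nn_predict(x[mask], y[mask], x) == y).astype(float)
        correct_in[mask] += hit[mask]
        count_in[mask] += 1
        correct_out[~mask] += hit[~mask]
        count_out[~mask] += 1
    acc_in = correct_in / np.maximum(count_in, 1)
    acc_out = correct_out / np.maximum(count_out, 1)
    return acc_in - acc_out

# Synthetic data (assumption): two well-separated clusters plus one atypical
# point that sits in the class-0 cluster but carries label 1.
rng = np.random.default_rng(1)
x0 = rng.normal([-3.0, 0.0], 0.5, size=(30, 2))
x1 = rng.normal([3.0, 0.0], 0.5, size=(30, 2))
x = np.vstack([x0, x1, [[-3.0, 0.0]]])
y = np.array([0] * 30 + [1] * 30 + [1])

mem = estimate_memorization(x, y)
print("atypical example:", mem[-1])
print("mean over typical examples:", mem[:-1].mean())
```

In this toy setting the atypical point can only be predicted correctly by a model that has seen it, so its estimated score is high, while typical points are usually predicted correctly either way. Training a few hundred models once and reusing them for every example is what keeps this kind of estimate cheap compared with retraining separately for each left-out example.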
Abstract
Deep learning algorithms are well known to have a propensity for fitting the training data very well, and they often fit even outliers and mislabeled data points. Such fitting requires memorization of training data labels, a phenomenon that has attracted significant research interest but has not been given a compelling explanation so far. A recent work of Feldman (2019) proposes a theoretical explanation for this phenomenon based on a combination of two insights. First, natural image and data distributions are (informally) known to be long-tailed, that is, they have a significant fraction of rare and atypical examples. Second, in a simple theoretical model such memorization is necessary for achieving close-to-optimal generalization error when the data distribution is long-tailed. However, neither direct empirical evidence for this explanation nor an approach for obtaining such evidence was given.…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
