Memorization Through the Lens of Curvature of Loss Function Around Samples
Isha Garg, Deepak Ravikumar, Kaushik Roy

TL;DR
This paper uses the curvature of the loss function around each training sample, averaged over training epochs, as a scalable measure of memorization in deep neural networks. The metric agrees with known memorization patterns and exposes a new failure mode: duplicated images with different labels.
Contribution
The paper proposes a novel curvature-based measure of memorization that captures failure modes in the training data and correlates with existing memorization metrics, offering a scalable alternative to them.
Findings
High-curvature samples often correspond to long-tailed, mislabeled, or conflicting data.
The method surfaces a failure mode on CIFAR100 and ImageNet: duplicated images with different labels.
Curvature scores effectively identify memorized corrupted samples.
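As a rough illustration of how curvature scores might be used to surface suspect samples, the sketch below ranks samples by score and checks how many of the top-ranked ones fall in a known-corrupted set. The scores and corrupted indices are made-up toy values, and the helper names `flag_high_curvature` and `precision_at_k` are hypothetical, not taken from the paper.

```python
def flag_high_curvature(scores, k):
    """Return the indices of the k samples with the largest curvature scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def precision_at_k(flagged, corrupted):
    """Fraction of flagged indices that belong to the known-corrupted set."""
    if not flagged:
        return 0.0
    return sum(1 for i in flagged if i in corrupted) / len(flagged)


# Hypothetical curvature scores for six training samples (toy numbers).
scores = [0.02, 1.70, 0.05, 0.90, 0.03, 0.04]
# Indices whose labels are assumed to have been corrupted in this toy setup.
corrupted = {1, 3}

flagged = flag_high_curvature(scores, k=2)
precision = precision_at_k(flagged, corrupted)
```

In practice the choice of `k` would depend on how much manual inspection is affordable; this sketch only shows the ranking step.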
Abstract
Deep neural networks are over-parameterized and easily overfit the datasets they train on. In the extreme case, it has been shown that these networks can memorize a training set with fully randomized labels. We propose using the curvature of loss function around each training sample, averaged over training epochs, as a measure of memorization of the sample. We use this metric to study the generalization versus memorization properties of different samples in popular image datasets and show that it captures memorization statistics well, both qualitatively and quantitatively. We first show that the high curvature samples visually correspond to long-tailed, mislabeled, or conflicting samples, those that are most likely to be memorized. This analysis helps us find, to the best of our knowledge, a novel failure mode on the CIFAR100 and ImageNet datasets: that of duplicated images with…
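The central quantity in the abstract, the curvature of the loss around a sample averaged over training epochs, can be illustrated with a small sketch. This is an assumption-laden toy, not the authors' implementation: it estimates curvature with respect to the input using random ±1 probe vectors and finite differences of the gradient, and it uses toy quadratic losses in place of network checkpoints. The names `input_curvature`, `epoch_averaged_curvature`, and `make_quadratic_grad` are hypothetical.

```python
import random


def input_curvature(grad_fn, x, n_probes=8, h=1e-3, seed=0):
    """Estimate the curvature of a loss around the input x.

    grad_fn(x) must return the gradient of the loss w.r.t. the input as a list.
    For each random +/-1 probe v, (grad(x + h*v) - grad(x)) / h approximates the
    Hessian-vector product H v; the function returns the mean of ||H v||^2.
    """
    rng = random.Random(seed)
    g0 = grad_fn(x)
    total = 0.0
    for _ in range(n_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in x]
        x_shift = [xi + h * vi for xi, vi in zip(x, v)]
        g1 = grad_fn(x_shift)
        total += sum(((a - b) / h) ** 2 for a, b in zip(g1, g0))
    return total / n_probes


def epoch_averaged_curvature(grad_fns, x, **kwargs):
    """Average the curvature estimate over one gradient function per epoch."""
    scores = [input_curvature(g, x, **kwargs) for g in grad_fns]
    return sum(scores) / len(scores)


def make_quadratic_grad(diag):
    """Gradient of the toy loss 0.5 * sum(d_i * x_i**2), standing in for a checkpoint."""
    return lambda x: [d * xi for d, xi in zip(diag, x)]


# Two toy "epochs" each for a flat loss and a sharp loss around the same input.
flat_epochs = [make_quadratic_grad([0.1, 0.1]), make_quadratic_grad([0.2, 0.1])]
sharp_epochs = [make_quadratic_grad([3.0, 4.0]), make_quadratic_grad([4.0, 3.0])]
x = [1.0, -2.0]

flat_score = epoch_averaged_curvature(flat_epochs, x)
sharp_score = epoch_averaged_curvature(sharp_epochs, x)
```

Under this toy setup the sharp loss receives a much larger score than the flat one, which is the ordering a curvature-based memorization measure relies on. With a real network, `grad_fn` would come from automatic differentiation of the loss with respect to the input at each saved checkpoint.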
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Digital Imaging for Blood Diseases
