Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation
Futong Liu, Tao Lin, Martin Jaggi

TL;DR
This paper studies how over-parameterized neural networks memorize data, using turn-over dropout as an efficient way to estimate influence and memorization. For both real and random data, the network optimizes easy and difficult examples simultaneously, fitting the easy ones faster; for real data, a correct difficult example is more informative than an easy one.
Contribution
It uses turn-over dropout, an efficient influence estimation method, to study memorization on data with true labels and data with random labels. It shows that easy and difficult examples are optimized simultaneously and that, for real data, correct difficult examples are more informative than easy ones.
Findings
For both real and random data, easy and difficult examples are optimized simultaneously.
Easy examples are optimized faster than difficult ones.
For real data, a correct difficult example in the training set is more informative than an easy one.
Abstract
Over-parameterized deep neural networks are able to achieve excellent training accuracy while maintaining a small generalization error. It has also been found that they are able to fit arbitrary labels, and this behaviour is referred to as the phenomenon of memorization. In this work, we study the phenomenon of memorization with turn-over dropout, an efficient method to estimate influence and memorization, for data with true labels (real data) and data with random labels (random data). Our main findings are: (i) For both real data and random data, the optimization of easy examples (e.g., real data) and difficult examples (e.g., random data) are conducted by the network simultaneously, with easy ones at a higher speed; (ii) For real data, a correct difficult example in the training dataset is more informative than an easy one. By showing the existence of memorization on random data and…
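The abstract names turn-over dropout but does not spell out how it works. The following is a minimal NumPy sketch of the general idea as it is usually described, not the authors' implementation: each training example gets a fixed dropout mask, so the sub-network selected by the flipped mask is never updated on that example, and the loss gap between the two sub-networks serves as an influence estimate. The function names (`instance_mask`, `forward`, `cross_entropy`, `estimated_influence`), the one-hidden-layer model, the keep probability of 0.5, and the use of self-influence as a memorization score are all illustrative assumptions.

```python
import numpy as np

KEEP_PROB = 0.5  # assumption: half of the hidden units are kept per example


def instance_mask(index, dim, seed=0):
    """Deterministic binary mask tied to one training example's index.

    During training (not shown), every update on example `index` would
    apply this same mask, so the masked-out units never see that example.
    """
    rng = np.random.default_rng([seed, index])
    return (rng.random(dim) < KEEP_PROB).astype(np.float64)


def forward(params, x, mask):
    """Hypothetical one-hidden-layer ReLU network gated by `mask`."""
    w1, w2 = params
    hidden = np.maximum(x @ w1, 0.0) * mask / KEEP_PROB  # inverted-dropout scaling
    return hidden @ w2


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example, computed stably."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]


def estimated_influence(params, train_index, x, label):
    """Loss without the training example minus loss with it.

    The flipped mask (1 - mask) selects the sub-network that was never
    trained on example `train_index`; the original mask selects the one
    that was. A positive value suggests the example helped on (x, label).
    """
    hidden_dim = params[0].shape[1]
    mask = instance_mask(train_index, hidden_dim)
    loss_trained = cross_entropy(forward(params, x, mask), label)
    loss_untrained = cross_entropy(forward(params, x, 1.0 - mask), label)
    return loss_untrained - loss_trained
```

Under these assumptions, calling `estimated_influence(params, i, x_i, y_i)` with example `i`'s own input and label gives a self-influence score, which is one common proxy for how strongly that example is memorized. A single trained network suffices, which is what makes the approach cheaper than retraining with each example left out.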
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Fuzzy Logic and Control Systems
