Decoding Generalization from Memorization in Deep Neural Networks
Simran Ketha, Venkatakrishnan Ramaswamy

TL;DR
This paper investigates why overparameterized deep neural networks that memorize shuffled training labels generalize poorly to true labels, and provides evidence that certain layers retain latent information about the true labels, which can be decoded from the trained model.
Contribution
The study demonstrates that models retain latent information for true labels in their representations and introduces a technique to decode this information from trained models.
Findings
Models retain information for true labels in their internal representations.
Decoding techniques can extract latent generalization information from trained networks.
Empirical evidence shows that one or more layers preserve more potential to generalize to true labels than the trained model itself exhibits.
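As an illustration of what decoding label information from a layer's representations can look like, the sketch below fits one low-dimensional subspace per class on layer activations and assigns each new activation vector to the class whose subspace it makes the smallest angle with. This is a hypothetical stand-in, not the authors' procedure: the reviews mention a "MASC based analysis" whose details are not given here, and the names `fit_class_subspaces`, `predict_by_subspace`, and `n_components` are assumptions introduced for this example.

```python
import numpy as np

def fit_class_subspaces(features, labels, n_components):
    """Fit one subspace through the origin per class (hypothetical helper).

    features: (n_samples, dim) array of activations taken from one layer.
    labels:   (n_samples,) array of class labels used to group the activations.
    """
    subspaces = {}
    for c in np.unique(labels):
        class_feats = features[labels == c]
        # Rows of vt are orthonormal directions ordered by singular value.
        _, _, vt = np.linalg.svd(class_feats, full_matrices=False)
        subspaces[c] = vt[:n_components]
    return subspaces

def predict_by_subspace(features, subspaces):
    """Assign each vector to the class whose subspace is closest in angle."""
    classes = np.array(list(subspaces.keys()))
    norms = np.linalg.norm(features, axis=1, keepdims=True) + 1e-12
    unit = features / norms
    # For a unit vector, the norm of its projection onto an orthonormal basis
    # equals the cosine of its angle to that subspace; larger means closer.
    cosines = np.stack(
        [np.linalg.norm(unit @ subspaces[c].T, axis=1) for c in classes],
        axis=1,
    )
    return classes[np.argmax(cosines, axis=1)]
```

In such a probe, `features` would come from a single layer of the trained network, and accuracy would be measured against true labels on held-out data; which labels are used to build the subspaces is a design choice left open here.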
Abstract
Overparameterized deep networks that generalize well have been key to the dramatic success of deep learning in recent years. The reasons for their remarkable ability to generalize are not yet well understood. When class labels in the training set are shuffled to varying degrees, it is known that deep networks can still reach perfect training accuracy to the detriment of generalization to true labels -- a phenomenon that has been called memorization. It has, however, been unclear why the poor generalization to true labels that accompanies such memorization comes about. One possibility is that during training, all layers of the network irretrievably re-organize their representations in a manner that makes generalization to true labels difficult. The other possibility is that one or more layers of the trained network retain significantly more latent ability to generalize to true labels,…
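The label-shuffling setup the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes "shuffled to varying degrees" means permuting the labels within a randomly chosen fraction of the training set, and the names `corrupt_labels` and `corruption_fraction` are hypothetical.

```python
import numpy as np

def corrupt_labels(labels, corruption_fraction, seed=0):
    """Shuffle the labels of a random fraction of examples (hypothetical helper).

    Returns the corrupted label array and the indices that were selected.
    Labels are permuted among the selected examples, so class counts are
    preserved; some selected examples may keep their original label by chance.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    corrupted = labels.copy()
    n_corrupt = int(round(corruption_fraction * len(labels)))
    idx = rng.choice(len(labels), size=n_corrupt, replace=False)
    # Reassign the selected labels in a random order among the same positions.
    corrupted[idx] = labels[rng.permutation(idx)]
    return corrupted, idx
```

A network would then be trained on the corrupted labels while the original `labels` array is kept aside to measure generalization to true labels.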
Peer Reviews
Decision: Submitted to ICLR 2025
Very comprehensive experiments.
1. Lack of literature: The paper lacks a related-work section that gives a picture of previous work. The discussion of previous work appears only in the first paragraph of the introduction, and most of the papers discussed focus on experiments, while there are many theoretical papers trying to understand the problem [1][2].
2. Novelty: All of Sec. 3 tries to convey one idea: that the top component of the corrupted model still contains the class information to some degree. That is not a surprise since
- Understanding the relationship between generalization and memorization is an important and worthy area of study.
- Good empirical breadth, with the analysis being applied across 5 standard datasets and three different architectures.
- The authors seem reasonably well versed in at least some of the vast literature on the topic of memorization and generalization.
- I believe there is a methodological issue with the MASC-based analysis. The original neural networks are trained either for 500 epochs or to 99%-100% accuracy on the training set. This means that the models are likely to be overfit to the training data, especially when training with corrupted data. This may well be the intention of the authors, but it has implications for their conclusions. In their analysis, the authors take a representation drawn from different layers in the architectu
The paper investigates important questions in a novel way. It is well written and the claims are mostly backed by evidence.
The paper's main weakness is that it does not adequately situate itself with respect to prior work. A number of papers have probed intermediate layers and latent representations of DNNs [e.g. 2,3] in an attempt to understand this very memorization-vs-generalization debate. It does feel to me that many of the claims made in this paper are obvious in light of this literature. While the specific empirical investigation done here is novel to me, the conclusions drawn by the authors are not. It
Taxonomy
Topics: Neural Networks and Applications
