Generalizability of Memorization Neural Networks
Lijia Yu, Xiao-Shan Gao, Lijun Zhang, Yibo Miao

TL;DR
This paper provides the first theoretical analysis of the generalizability of memorization neural networks, revealing conditions under which they can or cannot generalize based on network width, size, and data distribution.
Contribution
The paper develops a theory of memorization networks: constructions that use a minimal number of parameters, conditions under which such networks are generalizable, and bounds on sample complexity.
Findings
A memorization network must have width at least equal to the data dimension in order to be generalizable.
Memorization networks with the optimal (smallest) number of parameters may not be generalizable.
Sample complexity bounds depend on data distribution and network size.
Abstract
The neural network memorization problem is to study the expressive power of neural networks to interpolate a finite dataset. Although memorization is widely believed to have a close relationship with the strong generalizability of deep learning when using over-parameterized models, to the best of our knowledge, there exists no theoretical study on the generalizability of memorization neural networks. In this paper, we give the first theoretical analysis of this topic. Since using i.i.d. training data is a necessary condition for a learning algorithm to be generalizable, memorization and its generalization theory for i.i.d. datasets are developed under mild conditions on the data distribution. First, algorithms are given to construct memorization networks for an i.i.d. dataset, which have the smallest number of parameters and even a constant number of parameters. Second, we show that, in…
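To make "interpolating a finite dataset" concrete, here is a minimal sketch of a textbook-style memorization construction. It is not the paper's algorithm. It assumes labels in {-1, 1} and that a random projection direction separates all samples; the names `build_memorizer`, `relu`, and the toy data are hypothetical and chosen only for illustration. The samples are projected onto one random direction, and a one-hidden-layer ReLU network is then built as the piecewise-linear interpolant through the projected points.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def build_memorizer(xs, ys, seed=0):
    """Return a one-hidden-layer ReLU net (as a function) that fits every (x, y) pair."""
    rng = random.Random(seed)
    # Assumption: a random Gaussian direction maps the samples to distinct reals.
    w = [rng.gauss(0.0, 1.0) for _ in range(len(xs[0]))]

    def project(x):
        return sum(wi * xi for wi, xi in zip(w, x))

    pairs = sorted((project(x), y) for x, y in zip(xs, ys))
    ts = [t for t, _ in pairs]
    labels = [y for _, y in pairs]
    if any(b <= a for a, b in zip(ts, ts[1:])):
        raise ValueError("projections collide; try another seed")

    # Slope of the interpolant on each interval [ts[k], ts[k+1]].
    slopes = [(labels[k + 1] - labels[k]) / (ts[k + 1] - ts[k])
              for k in range(len(ts) - 1)]
    # Each hidden unit relu(t - ts[k]) contributes the change in slope at knot ts[k].
    coeffs = [s - prev for s, prev in zip(slopes, [0.0] + slopes[:-1])]
    knots = ts[:-1]
    bias = labels[0]

    def net(x):
        t = project(x)
        return bias + sum(c * relu(t - k) for c, k in zip(coeffs, knots))

    return net


# Toy dataset: 20 random points in dimension 5 with random labels.
data_rng = random.Random(1)
xs = [[data_rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(20)]
ys = [data_rng.choice([-1, 1]) for _ in range(20)]

net = build_memorizer(xs, ys)
max_err = max(abs(net(x) - y) for x, y in zip(xs, ys))
```

With N samples this sketch uses N - 1 hidden units, so it fits the training set up to floating-point error but makes no attempt to minimize parameters. Fitting the training data also says nothing about accuracy on fresh samples from the same distribution, which is the question the paper studies.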
Taxonomy
Topics: Neural Networks and Applications
