Generalizability of Memorization Neural Networks

Lijia Yu; Xiao-Shan Gao; Lijun Zhang; Yibo Miao

arXiv:2411.00372·cs.LG·November 4, 2024

Open Access

TL;DR

This paper gives the first theoretical analysis of the generalizability of memorization neural networks, identifying conditions on network width, network size, and data distribution under which they can or cannot generalize.

Contribution

The paper develops a theory of memorization networks for i.i.d. data, covering constructions with the smallest number of parameters, conditions for generalizability, and bounds on sample complexity.

Findings

01 For a memorization network to be generalizable, its width must be at least equal to the data dimension.

02 Memorization networks with an optimal number of parameters may not be generalizable.

03 Sample complexity bounds depend on the data distribution and the network size.

Abstract

The neural network memorization problem is to study the expressive power of neural networks to interpolate a finite dataset. Although memorization is widely believed to have a close relationship with the strong generalizability of deep learning when using over-parameterized models, to the best of our knowledge, there exists no theoretical study on the generalizability of memorization neural networks. In this paper, we give the first theoretical analysis of this topic. Since using i.i.d. training data is a necessary condition for a learning algorithm to be generalizable, memorization and its generalization theory for i.i.d. datasets are developed under mild conditions on the data distribution. First, algorithms are given to construct memorization networks for an i.i.d. dataset, which have the smallest number of parameters and even a constant number of parameters. Second, we show that, in…
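To make "interpolate a finite dataset" concrete, here is a minimal sketch of one classical way to build a memorization network. It is not the construction from the paper: it assumes the textbook trick of projecting the inputs onto a random direction and fitting a piecewise-linear interpolant with one hidden ReLU layer. The names `build_memorizer` and `predict`, the ±1 labels, and the random data are all illustrative assumptions.

```python
import numpy as np


def build_memorizer(X, y, seed=0):
    """Build a one-hidden-layer ReLU network that interpolates (X, y).

    Illustrative assumption, not the paper's algorithm: project each input
    onto a random direction w, then fit a piecewise-linear function through
    the projected points using N - 1 ReLU units.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])      # shared projection direction
    t = X @ w                                # 1-D projections of the inputs
    order = np.argsort(t)
    t, v = t[order], np.asarray(y, dtype=float)[order]
    if np.any(np.diff(t) == 0):
        raise ValueError("projections collide; try another seed")
    slopes = np.diff(v) / np.diff(t)         # slope on each interval
    # f(s) = v[0] + sum_k coeffs[k] * relu(s - t[k]); each coefficient is
    # the change in slope at knot t[k], so f passes through every (t[j], v[j]).
    coeffs = np.diff(slopes, prepend=0.0)
    return w, t[:-1], coeffs, v[0]


def predict(params, X):
    """Evaluate the network: output = relu(X @ w - knots) @ coeffs + bias."""
    w, knots, coeffs, bias = params
    s = X @ w
    hidden = np.maximum(s[:, None] - knots[None, :], 0.0)
    return hidden @ coeffs + bias


# Hypothetical data: 20 random points in dimension 5 with random +/-1 labels.
rng = np.random.default_rng(1)
X_train = rng.standard_normal((20, 5))
y_train = rng.choice([-1.0, 1.0], size=20)

params = build_memorizer(X_train, y_train)
max_err = np.max(np.abs(predict(params, X_train) - y_train))
print("max training error:", max_err)
```

The network fits the training labels up to floating-point error even though the labels here are pure noise, so a perfect fit on the training set says nothing by itself about accuracy on fresh samples. That gap between memorization and generalization is the question the abstract raises.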

Taxonomy

Topics: Neural Networks and Applications