Memorization With Neural Nets: Going Beyond the Worst Case

Sjoerd Dirksen; Patrick Finke; Martin Genzel

arXiv:2310.00327·stat.ML·December 9, 2024

TL;DR

This paper studies how neural networks interpolate their training data from an instance-specific viewpoint: rather than bounding worst-case memorization capacity, it ties the required network size to the geometry of the given data set and demonstrates practical effectiveness on real datasets.

Contribution

The paper presents a randomized algorithm that constructs an interpolating neural network for a given two-class data set, with guarantees tied to the geometry of the data rather than to worst-case memorization bounds.
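To make the instance-specific idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it swaps in a much simpler stand-in, a single hidden layer of random ReLU features with a least-squares readout, where the paper builds a three-layer network. The function names, the Gaussian weight distribution, and the width grid are all assumptions chosen for illustration. The sketch searches for the smallest width at which one fixed two-class data set is interpolated, which is the kind of data-dependent quantity the paper is concerned with.

```python
import numpy as np

def fit_random_relu_interpolant(X, y, width, seed=0):
    """Random ReLU hidden layer plus least-squares readout (illustrative stand-in)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], width))   # random hidden weights (assumed Gaussian)
    b = rng.standard_normal(width)                 # random hidden biases
    H = np.maximum(X @ W + b, 0.0)                 # hidden-layer activations
    a, *_ = np.linalg.lstsq(H, y, rcond=None)      # readout weights fitted to the labels
    return W, b, a

def predict(X, W, b, a):
    """Predicted labels in {-1, +1}: sign of the network output."""
    return np.sign(np.maximum(X @ W + b, 0.0) @ a)

def smallest_interpolating_width(X, y, widths, seed=0):
    """Return the first width in `widths` whose network reproduces every label, else None."""
    for width in widths:
        params = fit_random_relu_interpolant(X, y, width, seed)
        if np.array_equal(predict(X, *params), y):
            return width
    return None

# Hypothetical toy data: two Gaussian clusters in R^5 with labels -1 and +1.
rng = np.random.default_rng(0)
X_neg = rng.standard_normal((20, 5)) * 0.5 - 3.0
X_pos = rng.standard_normal((20, 5)) * 0.5 + 3.0
X = np.vstack([X_neg, X_pos])
y = np.concatenate([-np.ones(20), np.ones(20)])

print(smallest_interpolating_width(X, y, widths=[1, 2, 4, 8, 16, 32, 64, 128, 256]))
```

Running the same search on data with shuffled labels gives a rough feel for how the width needed to interpolate can depend on the structure of the data rather than only on the number of points.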

Findings

01. Algorithm constructs interpolating networks in polynomial time.

02. Guarantees depend on geometric properties of data classes.

03. Effective on datasets like MNIST and CIFAR-10.

Abstract

In practice, deep neural networks are often able to easily interpolate their training data. To understand this phenomenon, many works have aimed to quantify the memorization capacity of a neural network architecture: the largest number of points such that the architecture can interpolate any placement of these points with any assignment of labels. For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity. In this paper, we investigate interpolation by adopting an instance-specific viewpoint. We introduce a simple randomized algorithm that, given a fixed finite data set with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time. The required number of parameters is linked to geometric properties of the…
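The abstract's verbal definition of memorization capacity can be written out as a formula. The following is a sketch under assumptions not stated in the abstract: inputs lie in $\mathbb{R}^d$, the points are distinct, labels are binary, and interpolation means exact agreement on every point. The symbols $\mathcal{F}$ (the set of functions realizable by the architecture) and $\operatorname{cap}$ are notation introduced here for illustration.

```latex
\[
  \operatorname{cap}(\mathcal{F})
  \;=\;
  \max\Bigl\{\, N \in \mathbb{N} \;:\;
    \forall\, x_1,\dots,x_N \in \mathbb{R}^d \text{ distinct},\;
    \forall\, y \in \{-1,+1\}^N,\;
    \exists\, f \in \mathcal{F} \text{ with } f(x_i) = y_i \text{ for all } i
  \Bigr\}
\]
```

Because the definition quantifies over every placement of points and every labeling, it describes a worst case; the instance-specific viewpoint instead asks how large the network must be for one fixed data set.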

Code & Models

Repositories

patrickfinke/memo (Official)

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms · Adversarial Robustness in Machine Learning