Memorisation, convergence and generalisation in generative models
Antoine Maillard, Sebastian Goldt

TL;DR
This paper analytically investigates how generative models transition from memorising training data to generalising, revealing that convergence and latent factor recovery are distinct objectives with different data requirements.
Contribution
The paper provides an exact analytical characterisation of the memorisation-generalisation transition in linear generative models and extends the findings to convolutional denoisers and real data.
Findings
Linear generative models memorise at small data load
Convergence emerges continuously once the number of samples is linear in the input dimension
Convergence is insensitive to latent factor recovery
Abstract
Generative neural networks learn how to produce highly realistic images from a large, but finite number of examples - or do they simply memorise their training set? To settle this question, Kadkhodaie, Guth, Simoncelli and Mallat (ICLR '24) trained diffusion models independently on disjoint subsets of a dataset and showed that they converge to nearly the same density when the number of training images is large enough. This result raises two basic questions: how much data do you need for convergence, and what does convergence capture about learning the data distribution? Here, we address these questions by providing an exact analytical characterisation of the transition from memorisation to generalisation in linear generative models. We find that these models memorise at small load, while convergence emerges continuously when the number of samples is linear in the input dimension.…
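As a rough illustration of the split-training protocol described in the abstract, the sketch below fits two linear denoisers on disjoint halves of a synthetic dataset and measures how far apart they end up as the load (samples per input dimension) grows. This is not the authors' code or their exact model: the Gaussian data with `k` latent factors, the closed-form denoiser, the noise level `sigma`, and the function names `fit_linear_denoiser` and `split_distance` are all assumptions chosen for illustration.

```python
import numpy as np


def fit_linear_denoiser(x, sigma):
    """Closed-form linear denoiser for data x of shape (n, d).

    Minimises the average of E_z ||W (x_i + sigma z) - x_i||^2 over W,
    with z standard Gaussian noise. With empirical covariance C, the
    minimiser is W = C (C + sigma^2 I)^{-1}.
    """
    n, d = x.shape
    c = x.T @ x / n
    # C and (C + sigma^2 I) commute, so solve() yields the same matrix.
    return np.linalg.solve(c + sigma**2 * np.eye(d), c)


def split_distance(alpha, d=50, k=5, sigma=0.5, seed=0):
    """Relative distance between denoisers trained on two disjoint subsets.

    alpha is the load n / d per subset. The data model (k latent factors
    plus isotropic noise) is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    n = max(1, int(alpha * d))
    f = rng.standard_normal((d, k))          # hypothetical latent factors
    u = rng.standard_normal((2 * n, k))      # latent coordinates
    eps = rng.standard_normal((2 * n, d))    # isotropic residual
    x = u @ f.T / np.sqrt(k) + np.sqrt(0.1) * eps
    w1 = fit_linear_denoiser(x[:n], sigma)   # first disjoint subset
    w2 = fit_linear_denoiser(x[n:], sigma)   # second disjoint subset
    scale = 0.5 * (np.linalg.norm(w1) + np.linalg.norm(w2))
    return np.linalg.norm(w1 - w2) / scale


if __name__ == "__main__":
    for alpha in (0.5, 2.0, 20.0):
        print(f"load {alpha:5.1f}: relative distance {split_distance(alpha):.3f}")
```

Under these assumptions the two denoisers should differ markedly at small load, where each mostly reflects its own samples, and move closer together as the load increases. The sketch only probes agreement between the two models; it says nothing about whether the latent factors themselves are recovered.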
