When Models Don't Collapse: On the Consistency of Iterative MLE

Daniel Barzilai; Ohad Shamir

arXiv:2505.19046·stat.ML·March 27, 2026

TL;DR

This paper theoretically analyzes model collapse in iterative generative modeling with maximum likelihood estimation (MLE), identifying conditions under which collapse can be avoided and showing that, without them, it can occur rapidly.

Contribution

It establishes non-asymptotic bounds showing when collapse in iterative MLE can be avoided, and gives rigorous examples showing that collapse can occur quickly once the required assumptions fail.

Findings

01 Under standard assumptions, collapse can be avoided even as the share of real data in the training set diminishes.

02 Some assumptions are necessary: without them, collapse can happen quickly.

03 The paper gives the first rigorous examples of rapid collapse in iterative generative modeling.

Abstract

The widespread use of generative models has created a feedback loop, in which each generation of models is trained on data partially produced by its predecessors. This process has raised concerns about model collapse: A critical degradation in performance caused by repeated training on synthetic data. However, different analyses in the literature have reached different conclusions as to the severity of model collapse. As such, it remains unclear how concerning this phenomenon is, and under which assumptions it can be avoided. To address this, we theoretically study model collapse for maximum likelihood estimation (MLE), in a natural setting where synthetic data is gradually added to the original data set. Under standard assumptions (similar to those long used for proving asymptotic consistency and normality of MLE), we establish non-asymptotic bounds showing that collapse can be avoided…
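The accumulating-data setting described in the abstract can be illustrated with a toy simulation. The sketch below is only an assumption-laden example, not the paper's model or experiments: it assumes a one-dimensional Gaussian family, real data drawn from N(0, 1), and a fixed number of synthetic samples added per generation. The names `gaussian_mle`, `iterative_mle`, `n_real`, `n_synth`, and `generations` are hypothetical and chosen for illustration.

```python
import random
import statistics


def gaussian_mle(samples):
    """MLE for a 1-D Gaussian: sample mean and (biased) standard deviation."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples, mu)
    return mu, sigma


def iterative_mle(n_real=200, n_synth=200, generations=10, seed=0):
    """Refit by MLE each generation while synthetic data accumulates.

    Returns one (mu, sigma, training_set_size) tuple per generation.
    """
    rng = random.Random(seed)
    # Generation 0 sees only real data, assumed here to come from N(0, 1).
    data = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
    history = []
    for _ in range(generations):
        mu, sigma = gaussian_mle(data)
        history.append((mu, sigma, len(data)))
        # Accumulate: samples from the current fit are added to the
        # training set, and the original real data is never discarded.
        data.extend(rng.gauss(mu, sigma) for _ in range(n_synth))
    return history


if __name__ == "__main__":
    for t, (mu, sigma, n) in enumerate(iterative_mle()):
        real_fraction = 200 / n  # matches the default n_real above
        print(t, round(mu, 3), round(sigma, 3), round(real_fraction, 3))
```

Running the script prints, for each generation, the fitted parameters and the fraction of the training set that is still real data. That fraction shrinks as generations pass, which is the regime the abstract refers to. How far the estimates drift in this toy case says nothing about the paper's general bounds.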

Taxonomy

Topics: Multi-Agent Systems and Negotiation