Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

Bao Pham; Mohammed J. Zaki; Luca Ambrogioni; Dmitry Krotov; Matteo Negri

arXiv:2604.26841·cs.LG·April 30, 2026

TL;DR

This paper demonstrates that Uniform-based Discrete Diffusion Models (UDDMs) function as associative memories with emergent creative abilities, and introduces a conditional-entropy analysis for detecting whether such a model is in its memorization or its generalization regime.

Contribution

It reveals the associative memory behavior of UDDMs and proposes a practical entropy-based metric to distinguish memorization from generalization in generative models.

Findings

01. UDDMs behave as associative memories with basins of attraction.

02. A sharp transition from memorization to generalization is observed as training data size increases.

03. Conditional entropy effectively detects the memorization-generalization transition.
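
To make the entropy-based finding concrete, here is a minimal sketch of how a conditional entropy could be computed from a model's per-position predictions. It assumes the model exposes, for each token position, a probability distribution over the vocabulary given the rest of the sequence. The function names and the toy distributions are hypothetical, and this summary does not state how the paper defines or aggregates its metric, so the averaging over positions is an assumption.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean_conditional_entropy(per_position_probs):
    """Average entropy over positions.

    Each entry is an assumed p(token | context) over the vocabulary.
    Averaging over positions is an illustrative choice, not the paper's
    stated definition.
    """
    if not per_position_probs:
        raise ValueError("need at least one position")
    total = sum(token_entropy(p) for p in per_position_probs)
    return total / len(per_position_probs)

# Hypothetical predictions for a 2-token sequence over a 4-token vocabulary.
peaked = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
spread = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]

low = mean_conditional_entropy(peaked)   # near-deterministic predictions
high = mean_conditional_entropy(spread)  # uniform predictions, equals log(4)
```

Sharply peaked predictions give a low value and uniform predictions give the maximum, `log(vocabulary size)`. How these values map onto the memorization and generalization regimes is the subject of the paper and is not asserted by this sketch.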

Abstract

When do language diffusion models memorize their training data, and how can their true generative regime be assessed quantitatively? We address these questions by showing that Uniform-based Discrete Diffusion Models (UDDMs) fundamentally behave as Associative Memories (AMs) with emergent creative capabilities. The core idea of an AM is to reliably recover stored data points as memories by establishing distinct basins of attraction around them. Historically, models like Hopfield networks use an explicit energy function to guarantee these stable attractors. We broaden this perspective by leveraging the observation that energy is not strictly necessary, as basins of attraction can also be formed via conditional likelihood maximization. By evaluating token recovery of training and test examples, we identify in UDDMs a sharp…
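
The classical, energy-based picture that the abstract contrasts against can be illustrated with a small Hopfield network. The sketch below is not the UDDM method. It is a textbook construction with Hebbian weights, asynchronous sign updates, and two hand-picked orthogonal patterns, all chosen here for illustration. It shows what a basin of attraction means in practice: a stored pattern with one flipped entry is pulled back to the stored pattern, and the energy decreases along the way.

```python
def hebbian_weights(patterns):
    """Sum of outer products of the stored +/-1 patterns, zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def energy(w, x):
    """Hopfield energy E(x) = -1/2 * sum_ij w_ij x_i x_j."""
    n = len(x)
    return -0.5 * sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def recover(w, x, max_sweeps=10):
    """Asynchronous sign updates until no entry changes."""
    x = list(x)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(x)):
            field = sum(w[i][j] * x[j] for j in range(len(x)))
            new = x[i] if field == 0 else (1 if field > 0 else -1)
            if new != x[i]:
                x[i] = new
                changed = True
        if not changed:
            break
    return x

# Two orthogonal patterns act as the stored "memories" (illustrative choice).
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = hebbian_weights([p1, p2])

corrupted = [-1] + p1[1:]        # flip the first entry of p1
recovered = recover(w, corrupted)
```

In this toy setting `recovered` equals `p1`, and `energy(w, recovered)` is lower than `energy(w, corrupted)`. The abstract's point is that an explicit energy like this one is not strictly required, since basins of attraction can also arise from conditional likelihood maximization.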
