Language Generation in the Limit: Noise, Loss, and Feedback
Yannan Bai, Debmalya Panigrahi, Ian Zhang

TL;DR
This paper advances the theoretical understanding of language generation in the limit by answering an open question on union-closedness in the negative, characterizing noisy and feedback variants, and demonstrating the power of infinite queries.
Contribution
It provides the first negative result on union-closedness, characterizes noisy generation, and analyzes the impact of feedback and infinite queries in language generation models.
Findings
The union of a uniformly generatable collection and a non-uniformly generatable collection need not be generatable in the limit.
Noisy and non-noisy generation models are equivalent under certain conditions.
Allowing infinitely many queries in feedback models makes the generator more powerful.
Abstract
Kleinberg and Mullainathan (2024) recently proposed a formal framework called language generation in the limit and showed that given a sequence of example strings from an unknown target language drawn from any countable collection, an algorithm can correctly generate unseen strings from the target language within finite time. This notion was further refined by Li, Raman, and Tewari (2024), who defined stricter categories of non-uniform and uniform generation. They showed that a finite union of uniformly generatable collections is generatable in the limit, and asked if the same is true for non-uniform generation. We begin by resolving the question in the negative: we give a uniformly generatable collection and a non-uniformly generatable collection whose union is not generatable in the limit. We then use facets of this construction to further our understanding of several variants of…
