The Alignment Game: A Theory of Long-Horizon Alignment Through Recursive Curation

Ali Falahati; Mohammad Mohammadi Amiri; Kate Larson; Lukasz Golab

arXiv:2511.12804 · cs.LG · November 18, 2025

Open Access

TL;DR

This paper develops a formal theory of long-term alignment in recursive generative models, analyzing how iterative curation influences model preferences and diversity over time.

Contribution

The paper introduces a formal framework for recursive alignment, identifying structural convergence regimes and fundamental limitations of curation mechanisms based on the Bradley-Terry (BT) model.
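
For reference, the standard Bradley-Terry model gives the probability that one output is preferred over another in terms of a scalar score per output. The symbols below (a reward function r, the logistic function \sigma) are assumed notation for illustration; the paper's own notation may differ.

```latex
P(x \succ y) \;=\; \frac{\exp\bigl(r(x)\bigr)}{\exp\bigl(r(x)\bigr) + \exp\bigl(r(y)\bigr)}
\;=\; \sigma\bigl(r(x) - r(y)\bigr),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}
```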

Findings

01 Identifies three convergence regimes: consensus collapse, compromise on shared optima, and asymmetric refinement.

02 Proves an impossibility theorem: no recursive BT-based curation mechanism can simultaneously preserve diversity, symmetry, and independence.

03 Frames alignment as an evolving social choice influenced by power and history.

Abstract

In self-consuming generative models that train on their own outputs, alignment with user preferences becomes a recursive rather than one-time process. We provide the first formal foundation for analyzing the long-term effects of such recursive retraining on alignment. Under a two-stage curation mechanism based on the Bradley-Terry (BT) model, we model alignment as an interaction between two factions: the Model Owner, who filters which outputs should be learned by the model, and the Public User, who determines which outputs are ultimately shared and retained through interactions with the model. Our analysis reveals three structural convergence regimes depending on the degree of preference alignment: consensus collapse, compromise on shared optima, and asymmetric refinement. We prove a fundamental impossibility theorem: no recursive BT-based curation mechanism can simultaneously preserve…
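
The recursive two-stage curation described in the abstract can be illustrated with a minimal sketch. This is not the paper's mechanism: it assumes a small discrete set of outputs, pairwise BT comparisons between two independently drawn outputs, the owner stage applied before the user stage, and exact (expected) updates in place of sampling and retraining. The names `bt_curate`, `recursive_curation`, `r_owner`, and `r_user` are hypothetical.

```python
import math

def bt_win_prob(r_i, r_j):
    """Bradley-Terry probability that an output with reward r_i beats one with r_j."""
    return 1.0 / (1.0 + math.exp(-(r_i - r_j)))

def bt_curate(p, rewards):
    """Expected distribution of the winner of one pairwise BT comparison.

    Assumption: two outputs are drawn independently from p and the BT winner
    is kept, so q[i] = 2 * p[i] * sum_j p[j] * P(i beats j).
    """
    n = len(p)
    q = [
        2.0 * p[i] * sum(p[j] * bt_win_prob(rewards[i], rewards[j]) for j in range(n))
        for i in range(n)
    ]
    total = sum(q)  # equals 1 up to rounding; renormalize for numerical safety
    return [x / total for x in q]

def recursive_curation(p0, r_owner, r_user, rounds):
    """Apply owner curation then user curation for a number of rounds.

    Assumption: the model reproduces the curated distribution exactly
    after each round (no sampling noise, no finite-capacity effects).
    """
    p = list(p0)
    for _ in range(rounds):
        p = bt_curate(p, r_owner)  # stage 1: Model Owner filters outputs
        p = bt_curate(p, r_user)   # stage 2: Public User retains outputs
    return p

def entropy(p):
    """Shannon entropy in nats, used here as a simple diversity measure."""
    return -sum(x * math.log(x) for x in p if x > 0)

# Illustrative rewards (made up): both factions rank the outputs identically.
p0 = [0.25, 0.25, 0.25, 0.25]
r_owner = [1.0, 0.5, 0.0, -0.5]
r_user = [1.0, 0.5, 0.0, -0.5]

p_final = recursive_curation(p0, r_owner, r_user, rounds=50)
print("final distribution:", p_final)
print("entropy before/after:", entropy(p0), entropy(p_final))
```

With identical rewards for both factions, the probability mass in this toy model moves toward the jointly preferred output and entropy falls, which loosely mirrors the loss of diversity the abstract discusses. Changing `r_user` so that the two factions disagree is a simple way to explore how the outcome depends on preference alignment, though the toy dynamics are not guaranteed to match the regimes proved in the paper.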

Taxonomy

Topics: Auction Theory and Applications · Multi-Agent Systems and Negotiation · Constraint Satisfaction and Optimization