Fundamental Limits of Game-Theoretic LLM Alignment: Smith Consistency and Preference Matching
Zhekun Shi, Kaizhao Liu, Qi Long, Weijie J. Su, Jiancong Xiao

TL;DR
This paper investigates the theoretical limits of game-theoretic approaches to aligning large language models with human preferences, establishing conditions for desirable properties and proving fundamental limitations in preference matching.
Contribution
The paper provides a rigorous theoretical framework for understanding the robustness and limitations of game-theoretic LLM alignment methods, including necessary and sufficient conditions for consistency properties and an impossibility result for preference matching.
Findings
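Established necessary and sufficient conditions for Condorcet consistency, diversity through mixed strategies, and Smith consistency.
Proved the impossibility of preference matching: no smooth and learnable mapping of pairwise preferences can guarantee a unique Nash equilibrium that matches a target policy.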
Provided a theoretical foundation for robustness in game-theoretic alignment.
Abstract
Nash Learning from Human Feedback is a game-theoretic framework for aligning large language models (LLMs) with human preferences by modeling learning as a two-player zero-sum game. However, using the raw preference as the payoff in the game severely limits the potential of the game-theoretic LLM alignment framework. In this paper, we systematically study which choices of payoff based on pairwise human preferences can yield desirable alignment properties. We establish necessary and sufficient conditions for Condorcet consistency, diversity through mixed strategies, and Smith consistency. These results provide a theoretical foundation for the robustness of game-theoretic LLM alignment. Further, we show the impossibility of preference matching, i.e., no smooth and learnable mapping of pairwise preferences can guarantee a unique Nash equilibrium that matches a target policy, even…
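To make the game in the abstract concrete, here is a minimal sketch of the raw-preference case only, not the general payoff mappings the paper studies. It assumes a small finite set of responses and a hypothetical matrix `P` where `P[i][j]` is the probability that response `i` is preferred over response `j`. The function names `condorcet_winner` and `is_symmetric_nash` are illustrative and do not come from the paper. The check relies on the standard fact that, in a symmetric zero-sum game with payoff `P[i][j] - 0.5`, a policy is a symmetric Nash equilibrium exactly when no pure deviation earns a positive payoff against it.

```python
def condorcet_winner(P):
    """Return the index of the response that beats every other one
    with probability above 1/2, or None if no such response exists."""
    n = len(P)
    for i in range(n):
        if all(P[i][j] > 0.5 for j in range(n) if j != i):
            return i
    return None


def is_symmetric_nash(P, policy, tol=1e-9):
    """Check whether `policy` is a symmetric Nash equilibrium of the
    zero-sum game with payoff P[i][j] - 0.5 (raw preference, centered).

    The game value is 0, so it suffices that no pure deviation j
    gets a strictly positive expected payoff against `policy`."""
    n = len(P)
    for j in range(n):
        gain = sum(policy[i] * (P[j][i] - 0.5) for i in range(n))
        if gain > tol:
            return False
    return True


# Hypothetical preferences with a Condorcet winner (response 0).
P_winner = [
    [0.5, 0.6, 0.8],
    [0.4, 0.5, 0.7],
    [0.2, 0.3, 0.5],
]

# Hypothetical cyclic preferences (0 beats 1, 1 beats 2, 2 beats 0).
P_cycle = [
    [0.5, 0.7, 0.3],
    [0.3, 0.5, 0.7],
    [0.7, 0.3, 0.5],
]

uniform = [1 / 3, 1 / 3, 1 / 3]
```

With `P_winner`, the pure policy on response 0 passes the equilibrium check, which illustrates Condorcet consistency under the raw-preference payoff. With `P_cycle`, no Condorcet winner exists and no pure policy is an equilibrium, while the uniform mixture is, which is the kind of mixed-strategy diversity the abstract refers to.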
Taxonomy
Topics: Law, Economics, and Judicial Systems · Game Theory and Voting Systems · Merger and Competition Analysis
