Personalized Alignment Revisited: The Necessity and Sufficiency of User Diversity
Enoch Hyunwook Kang

TL;DR
This paper establishes the theoretical conditions under which personalized alignment of large language models is statistically efficient, emphasizing the critical role of user diversity.
Contribution
It formally characterizes the necessary and sufficient user-diversity condition for optimal personalized alignment performance.
Findings
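The optimal rates (O(1) online regret and log(1/epsilon) offline sample complexity) hold only under a specific user-diversity condition.
When the condition holds, simple greedy algorithms achieve benchmark efficiency.
When the condition fails, every learner in a natural admissible class incurs at least logarithmic regret.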
Abstract
Personalized alignment aims to adapt large language models to heterogeneous user preferences, yet the precise theoretical conditions for its statistical efficiency have not been formally established. This paper characterizes the conditions under which personalized alignment achieves O(1) online regret and log(1/epsilon) offline sample complexity. We show that these optimal rates depend on a specific user-diversity condition: the population of user-specific heads must span the latent reward directions that can alter the optimal response. We prove that this condition is both necessary and sufficient. When it holds, simple greedy algorithms achieve benchmark efficiency; when it fails, every learner in a natural admissible class incurs at least logarithmic regret. Our results identify user diversity as the fundamental driver of personalized identifiability.
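The spanning requirement in the abstract can be illustrated with a small rank check. This is only a sketch under assumptions the abstract does not state: it assumes each user head is a finite-dimensional vector, and that the latent reward directions that can alter the optimal response are supplied as an explicit list of vectors. The names `matrix_rank` and `heads_span_directions` are hypothetical and are not taken from the paper.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a list-of-lists matrix via Gaussian elimination with partial pivoting."""
    m = [list(map(float, r)) for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Among the rows not yet used as pivots, pick the largest entry in this column.
        pivot = max(range(rank, len(m)), key=lambda i: abs(m[i][col]), default=None)
        if pivot is None or abs(m[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def heads_span_directions(heads, directions):
    """True if every direction lies in the span of the user heads.

    Assumption (illustrative only): heads and directions are lists of
    equal-length numeric vectors. Appending the directions leaves the
    rank unchanged exactly when they already lie in the span of the heads.
    """
    return matrix_rank(heads + directions) == matrix_rank(heads)


# Two users whose heads differ: together they cover the direction (1, 1, 0).
diverse_heads = [[1, 0, 0], [0, 1, 0]]
# Two users whose heads are collinear: they cover only one direction.
homogeneous_heads = [[1, 0, 0], [2, 0, 0]]
relevant_directions = [[1, 1, 0]]

diverse_ok = heads_span_directions(diverse_heads, relevant_directions)
homogeneous_ok = heads_span_directions(homogeneous_heads, relevant_directions)
print(diverse_ok, homogeneous_ok)
```

In this toy setup the diverse population passes the check and the collinear one does not, which mirrors the dichotomy the abstract describes. The paper's formal condition may differ in detail from this rank test.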
