POPI: Personalizing LLMs via Optimized Natural Language Preference Inference
Yizhuo Chen, Xin Liu, Ruijie Wang, Zheng Li, Pei Chen, Changlong Yu, Qingyu Yin, Priyanka Nigam, Meng Jiang, Bing Yin

TL;DR
POPI personalizes large language models by inferring each user's preferences as a concise natural-language summary, which improves personalization quality and shrinks context size across four benchmarks.
Contribution
Introduces POPI, a unified preference inference and generation framework that personalizes LLMs using natural language summaries and reinforcement learning.
Findings
Improves personalization quality across four benchmarks.
Reduces context overhead by up to a factor of ten.
Works with black-box commercial APIs.
Abstract
Large language models (LLMs) are typically aligned with population-level preferences, despite substantial variation across individual users. We introduce POPI, a user-level personalization framework that separates the problem into two components connected by a natural-language interface: a shared inference model that distills heterogeneous user signals into a concise preference summary, and a shared generator that conditions on this summary to produce personalized responses. Both components are trained under a unified preference-optimization objective, with reinforcement learning handling the non-differentiable inference step. This objective decomposes into generator approximation error and summary informativeness, revealing how a single loss simultaneously drives accurate generation and informative summarization. Because the interface is natural language, learned summaries can be…
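
The two-component design described in the abstract can be pictured with a minimal sketch. It only illustrates the data flow (user signals, then a natural-language summary, then a personalized response) and is not the authors' implementation. The class name TwoStagePersonalizer, the prompt wording, and both toy models are hypothetical stand-ins introduced here; training, including the reinforcement learning step, is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: any text-in/text-out callable can stand in for a model,
# including a wrapper around a black-box API.
TextModel = Callable[[str], str]


@dataclass
class TwoStagePersonalizer:
    """Hypothetical wrapper showing the natural-language interface."""
    inference_model: TextModel  # user signals -> preference summary
    generator: TextModel        # summary + request -> response

    def summarize(self, user_signals: List[str]) -> str:
        # Stage 1: distill heterogeneous signals into one short summary.
        bullet_list = "\n".join(f"- {signal}" for signal in user_signals)
        return self.inference_model(
            "Summarize this user's preferences:\n" + bullet_list
        )

    def respond(self, summary: str, request: str) -> str:
        # Stage 2: the generator sees only the summary, not the raw signals.
        return self.generator(
            f"User preference summary: {summary}\nRequest: {request}"
        )

    def run(self, user_signals: List[str], request: str) -> str:
        return self.respond(self.summarize(user_signals), request)


def toy_inference_model(prompt: str) -> str:
    # Toy rule standing in for a learned inference model.
    if "short" in prompt.lower():
        return "Prefers concise answers."
    return "No strong preferences detected."


def toy_generator(prompt: str) -> str:
    # Toy generator that echoes its prompt so the data flow is visible.
    return "GENERATED FROM <" + prompt + ">"


personalizer = TwoStagePersonalizer(toy_inference_model, toy_generator)
signals = ["Upvoted the short reply", "Skipped the long tutorial"]
summary = personalizer.summarize(signals)
reply = personalizer.run(signals, "Explain recursion.")
print(summary)
print(reply)
```

Because the generator receives only the summary string, the raw user history never enters its prompt. In this sketch, that is where a reduction in context size would come from, and it is also why the generator slot could in principle be filled by any text-in/text-out model.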
