Inference-Time Personalized Alignment with a Few User Preference Queries
Victor-Alexandru Pădurean, Parameswaran Kamalaruban, Nachiket Kotalwar, Alkis Gotovos, Adish Singla

TL;DR
This paper introduces UserAlign, an inference-time method that personalizes a generative model's response by asking the user a few pairwise comparison queries and then selecting the best response from a fixed pool, using best-arm identification in logistic bandits.
Contribution
The paper proposes a novel inference-time personalized alignment method that requires only a few user queries and leverages a bandit-based approach for quick response selection.
Findings
Effective personalization with few queries
Outperforms existing methods in response quality
Applicable to text and image generation tasks
Abstract
We study the problem of aligning a generative model's response with a user's preferences. Recent works have proposed several different formulations for personalized alignment; however, they either require a large amount of user preference queries or require that the preference be explicitly specified as a text input. In this paper, we propose a novel inference-time personalized alignment method, UserAlign, that elicits the user's preferences with a few queries as pairwise response comparisons. In particular, UserAlign builds on the theoretical framework of best-arm identification in logistic bandits and selects a personalized response from a fixed pool of the model's generated responses. The key idea is to consider the user's feedback consistent and noise-free, and incorporate it into the theoretical framework to identify the best response quickly. Experimental results across several…
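The selection loop described in the abstract can be illustrated with a small sketch. This is not the UserAlign algorithm itself: instead of a logistic-bandit confidence set, it uses a simplified stand-in, a cloud of randomly sampled candidate preference vectors that is pruned after each answer. The feature vectors, the linear utility model, and the names `select_response`, `ask_user`, `n_candidates`, and `max_queries` are all assumptions made for illustration. What it does share with the abstract is the setting: a fixed pool of responses, pairwise comparison queries, and feedback treated as consistent and noise-free, so every candidate that contradicts an answer can be discarded outright.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def select_response(features, ask_user, n_candidates=2000, max_queries=10, seed=0):
    """Return (index of the chosen response, number of queries used).

    features: one feature vector per response in the fixed pool (assumed given).
    ask_user(i, j): True if the user prefers response i over response j.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    # Hypothetical stand-in for a confidence set: random candidate preference vectors.
    candidates = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(n_candidates)]

    def best_arm(theta):
        return max(range(len(features)), key=lambda k: dot(theta, features[k]))

    queries = 0
    while True:
        # Count how many surviving candidates consider each response the best one.
        votes = {}
        for theta in candidates:
            k = best_arm(theta)
            votes[k] = votes.get(k, 0) + 1
        ranked = sorted(votes, key=votes.get, reverse=True)
        if len(ranked) == 1 or queries >= max_queries:
            return ranked[0], queries
        # Query the two responses the candidates disagree about most.
        i, j = ranked[0], ranked[1]
        prefers_i = ask_user(i, j)
        queries += 1
        diff = [a - b for a, b in zip(features[i], features[j])]
        # Noise-free assumption: drop every candidate that contradicts the answer.
        kept = [t for t in candidates if (dot(t, diff) > 0) == prefers_i]
        if not kept:
            # No sampled candidate explains the answers; fall back to the winner.
            return (i if prefers_i else j), queries
        candidates = kept

# Usage with a simulated user whose hidden preference vector is made up here.
pool = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [0.7, 0.7]]
hidden_theta = [1.0, 0.2]

def simulated_user(i, j):
    return dot(hidden_theta, pool[i]) > dot(hidden_theta, pool[j])

chosen, used = select_response(pool, simulated_user)
print(chosen, used)
```

In this toy run the simulated user's utility is highest for the first response in the pool, and the loop needs far fewer questions than the budget allows, because each answer removes a whole region of candidate preference vectors rather than a single response. With noisy feedback this hard pruning would no longer be safe, which is where a probabilistic model such as the logistic one named in the abstract comes in.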
Taxonomy
Topics: Recommender Systems and Techniques · Sentiment Analysis and Opinion Mining · Data Visualization and Analytics
