Is Active Persona Inference Necessary for Aligning Small Models to Personal Preferences?
Zilu Tang, Afra Feyza Akyürek, Ekin Akyürek, Derry Wijaya

TL;DR
This paper investigates whether actively inferring personal preferences improves the alignment of small language models, and finds that higher-quality active prefixes improve generalization, reduce systematic biases, and make models more contextually faithful.
Contribution
It introduces a synthetic dataset and evaluates the impact of active preference inference on small models, showing its benefits over passive methods.
Findings
Active prefixes improve model generalization.
Active alignment reduces biases across attributes.
Active inference enhances personalization fidelity.
Abstract
A prominent issue in aligning language models (LMs) to personalized preferences is underspecification -- the lack of information from users about their preferences. A popular way to inject such a specification is to add a prefix (e.g. prior relevant conversations) to the current user's conversation to steer the preference distribution. Most methods passively model personal preferences with prior example preference pairs. We ask whether models benefit from actively inferring preference descriptions, and address this question by creating a synthetic personalized alignment dataset based on famous people with known public preferences. We then test how effective finetuned 1-8B size models are at inferring and aligning to personal preferences. Results show that higher-quality active prefixes lead to better generalization, more contextually faithful models, and fewer systematic biases across…
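The contrast between passive and active prefixes described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `PreferencePair` class, the three helper functions, the prompt templates, and the example persona text are all hypothetical. In the paper the preference description is inferred by a model; here it is a hand-written placeholder string.

```python
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """One prior preference example (hypothetical structure)."""
    prompt: str
    chosen: str
    rejected: str


def build_passive_prefix(pairs: list[PreferencePair]) -> str:
    """Passive prefix: prior preference pairs shown verbatim, with no summary."""
    blocks = [
        f"Prompt: {p.prompt}\nPreferred: {p.chosen}\nDispreferred: {p.rejected}"
        for p in pairs
    ]
    return "\n\n".join(blocks)


def build_active_prefix(persona_description: str) -> str:
    """Active prefix: an explicit description of the user's preferences.

    Assumption: the description would come from a separate inference step;
    this function only formats it.
    """
    return f"Inferred user preferences: {persona_description}"


def with_prefix(prefix: str, user_message: str) -> str:
    """Prepend a prefix to the current user's message."""
    return f"{prefix}\n\n{user_message}"


if __name__ == "__main__":
    history = [
        PreferencePair("Suggest a weekend activity.", "Go hiking.", "Watch TV."),
    ]
    question = "What should I do on vacation?"
    print(with_prefix(build_passive_prefix(history), question))
    print("---")
    print(with_prefix(build_active_prefix("Enjoys outdoor activities."), question))
```

In both cases the model sees the same user message; only the conditioning text differs. The passive variant leaves the model to generalize from raw examples, while the active variant states the preference outright.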
Taxonomy
Topics: Persona Design and Applications · Innovative Human-Technology Interaction
Methods: Focus · ALIGN
