Aligning LLMs by Predicting Preferences from User Writing Samples
Stéphane Aroca-Ouellette, Natalie Mackraz, Barry-John Theobald, Katherine Metcalf

TL;DR
This paper presents PROSE, a novel method that improves the inference of personalized human preferences from user writing samples, leading to more aligned and effective LLM-generated outputs.
Contribution
PROSE introduces iterative refinement and verification steps to better capture individual preferences, surpassing previous methods like CIPHER in accuracy.
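The two steps named above, iterative refinement and verification across samples, can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the function names, the `FEATURES` table, the stopping rule, and the `min_support` threshold are all assumptions made for the example, and simple string checks stand in for what would be LLM calls in practice.

```python
from typing import Callable, Dict, List, Set

# Hypothetical style detectors standing in for LLM judgments (assumption).
FEATURES: Dict[str, Callable[[str], bool]] = {
    "lowercase": lambda t: t == t.lower(),
    "exclaims": lambda t: "!" in t,
}


def features(text: str) -> Set[str]:
    """Return the names of the toy style features present in the text."""
    return {name for name, check in FEATURES.items() if check(text)}


def toy_generate(prefs: Set[str]) -> str:
    """Stand-in for an LLM writing a draft conditioned on the preferences."""
    text = "Meeting moved to Friday."
    if "lowercase" in prefs:
        text = text.lower()
    if "exclaims" in prefs:
        text = text.replace(".", "!")
    return text


def toy_refine(prefs: Set[str], candidate: str, sample: str) -> Set[str]:
    """Add one feature the user's sample has but the candidate lacks."""
    missing = sorted(features(sample) - features(candidate))
    return prefs | {missing[0]} if missing else prefs


def refine_preferences(sample: str, max_steps: int = 3) -> Set[str]:
    """Iteratively refine preferences until a refinement step changes nothing."""
    prefs: Set[str] = set()
    for _ in range(max_steps):
        candidate = toy_generate(prefs)
        updated = toy_refine(prefs, candidate, sample)
        if updated == prefs:  # candidate already matches the sample's style
            break
        prefs = updated
    return prefs


def verify_preferences(prefs: Set[str], samples: List[str], min_support: int) -> Set[str]:
    """Keep only preferences that hold in at least min_support samples."""
    return {
        p for p in prefs
        if sum(FEATURES[p](s) for s in samples) >= min_support
    }


samples = ["see you at noon!", "thanks for the update!", "Running late, sorry!"]
refined = refine_preferences(samples[0])
verified = verify_preferences(refined, samples, min_support=3)
print(sorted(refined), sorted(verified))
```

In this toy run, refinement on the first sample picks up both "exclaims" and "lowercase", while verification against all three samples drops "lowercase" because the third sample does not support it. The point of the second step is that a preference inferred from one sample may be incidental rather than characteristic of the user.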
Findings
PROSE improves preference inference accuracy by 33% over CIPHER.
Combining PROSE with in-context learning yields up to 9% better performance.
PROSE enhances LLM output quality in summarization and email writing tasks.
Abstract
Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences. Agent alignment then comes from conditioning on the inferred preference description. However, existing methods often produce generic preference descriptions that fail to capture the unique and individualized nature of human preferences. This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. PROSE incorporates two key elements: (1) iterative refinement of inferred preferences, and (2) verification of inferred preferences across multiple user writing samples. We evaluate PROSE with several LLMs (i.e., Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o) on a summarization…
Taxonomy
Topics: Data Mining Algorithms and Applications · Semantic Web and Ontologies · Recommender Systems and Techniques
