SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs
Michael J Ryan, Omar Shaikh, Aditri Bhagirath, Daniel Frees, William Held, Diyi Yang

TL;DR
SynthesizeMe introduces a method to generate personalized prompts for reward models in LLMs by synthesizing user personas from interactions, improving personalized judgment accuracy and performance on user-curated benchmarks.
Contribution
The paper presents SynthesizeMe, a novel approach to induce synthetic user personas from interactions, enhancing personalized reward modeling without relying on explicit demographic data.
Findings
Improves personalized LLM-as-a-judge accuracy by 4.4% on Chatbot Arena.
Achieves top performance on PersonalRewardBench.
Effectively synthesizes user personas from interaction data.
Abstract
Recent calls for pluralistic alignment of Large Language Models (LLMs) encourage adapting models to diverse user preferences. However, most prior work on personalized reward models relies heavily on additional identity information, such as demographic details or a predefined set of preference categories. To this end, we introduce SynthesizeMe, an approach to inducing synthetic user personas from user interactions for personalized reward modeling. SynthesizeMe first generates and verifies reasoning to explain user preferences, then induces synthetic user personas from that reasoning, and finally filters to informative prior user interactions in order to build personalized prompts for a particular user. We show that using SynthesizeMe-induced prompts improves personalized LLM-as-a-judge accuracy by 4.4% on Chatbot Arena. Combining SynthesizeMe-derived prompts with a reward model achieves…
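The three stages named in the abstract (generate and verify reasoning, induce a persona, filter to informative interactions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Preference`, `accuracy`, `build_personalized_prompt`, the `min_acc` and `k` parameters, and the toy stand-ins for the LLM calls are all hypothetical. It also assumes that "verifying" reasoning means checking whether it helps a judge predict held-out preferences, and that "informative" interactions are those whose reasoning verified best; the abstract does not specify either criterion.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class Preference:
    """One pairwise interaction: the user preferred `chosen` over `rejected`."""
    prompt: str
    chosen: str
    rejected: str

# Hypothetical LLM-backed components, passed in as plain callables.
Explain = Callable[[Preference], str]        # preference -> candidate reasoning
Judge = Callable[[str, str, str, str], str]  # (context, prompt, r1, r2) -> preferred response
Summarize = Callable[[List[str]], str]       # verified reasonings -> persona text

def accuracy(context: str, prefs: List[Preference], judge: Judge) -> float:
    """Fraction of preferences the judge predicts correctly given `context`."""
    if not prefs:
        return 0.0
    hits = sum(judge(context, p.prompt, p.chosen, p.rejected) == p.chosen for p in prefs)
    return hits / len(prefs)

def build_personalized_prompt(train: List[Preference], val: List[Preference],
                              explain: Explain, judge: Judge, summarize: Summarize,
                              min_acc: float = 0.5, k: int = 2) -> str:
    # Stage 1: generate reasoning per interaction, then verify it on held-out data.
    scored = [(accuracy(explain(p), val, judge), p, explain(p)) for p in train]
    verified = [s for s in scored if s[0] >= min_acc]
    # Stage 2: induce a persona from the reasoning that survived verification.
    persona = summarize([reason for _, _, reason in verified])
    # Stage 3 (assumed criterion): keep the k interactions with the best-verified reasoning.
    verified.sort(key=lambda s: s[0], reverse=True)
    demos = [p for _, p, _ in verified[:k]]
    # Assemble the prompt: persona first, then the selected demonstrations.
    lines = ["Persona: " + persona]
    lines += [f"Prompt: {p.prompt} | Preferred: {p.chosen} | Rejected: {p.rejected}"
              for p in demos]
    return "\n".join(lines)

# Toy stand-ins for the LLM calls, so the sketch runs without a model.
def toy_explain(p: Preference) -> str:
    shorter = len(p.chosen) < len(p.rejected)
    return "prefers shorter answers" if shorter else "prefers longer answers"

def toy_judge(context: str, prompt: str, first: str, second: str) -> str:
    pick = min if "shorter" in context else max
    return pick(first, second, key=len)

def toy_summarize(reasons: List[str]) -> str:
    return "; ".join(sorted(set(reasons)))

train = [
    Preference("Define API", "An interface.", "An API is a contract between software components."),
    Preference("Explain DNS", "DNS resolves human-readable names to IP addresses.", "Names to IPs."),
    Preference("What is RAM?", "Fast memory.", "RAM is volatile working memory used by programs."),
]
val = [Preference("What is a CPU?", "The processor.", "The CPU is the central processing unit.")]

prompt = build_personalized_prompt(train, val, toy_explain, toy_judge, toy_summarize)
print(prompt)
```

In this toy run the "Explain DNS" interaction yields reasoning that fails on the held-out example, so it is dropped before the persona is induced. A real system would replace the three toy callables with model calls and would likely use a richer selection criterion.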
Taxonomy
Topics: Persona Design and Applications · AI in Service Interactions · Machine Learning in Healthcare
