On the steerability of large language models toward data-driven personas
Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta

TL;DR
This paper introduces a data-driven approach to steer large language models toward specific, nuanced personas based on collaborative filtering, improving controllability and diversity of generated viewpoints.
Contribution
It proposes a novel data-driven persona concept and an efficient steering method, significantly enhancing LLMs' ability to generate diverse perspectives.
Findings
Achieved 57-77% improvement in steerability over baselines.
Enabled generation of multiple viewpoints reflecting diverse social groups.
Enhanced control over LLM responses beyond demographic stereotypes.
Abstract
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented. Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs, which can be leveraged to produce multiple perspectives and to reflect diverse opinions. Moving beyond the traditional reliance on demographics like age, gender, or party affiliation, we introduce a data-driven notion of persona grounded in collaborative filtering, which is defined as either a single individual or a cohort of individuals manifesting similar views across specific inquiries. As individuals in the same demographic group may have different personas, our data-driven persona definition allows for a more nuanced understanding of different (latent) social groups present in the population. In addition to this, we also explore an…
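The abstract says only that personas are "grounded in collaborative filtering"; it does not specify the model. As a rough illustration of the general idea, the sketch below factorizes a small respondent-by-question answer matrix with a truncated SVD and groups respondents with similar latent vectors into cohorts. This is not the paper's method: the function names (`persona_embeddings`, `assign_cohorts`), the SVD and k-means choices, and the toy answers are all assumptions made up for this example.

```python
import numpy as np

def persona_embeddings(responses, rank):
    """Latent vector per respondent from a (respondents x questions) matrix.

    Assumption: a truncated SVD stands in for whatever collaborative-filtering
    model is actually used; the source does not specify one.
    """
    centered = responses - responses.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :rank] * s[:rank]

def assign_cohorts(embeddings, k, iters=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [embeddings[0]]
    for _ in range(k - 1):
        # Distance from every respondent to its nearest existing center.
        nearest = np.min(
            [np.linalg.norm(embeddings - c, axis=1) for c in centers], axis=0
        )
        centers.append(embeddings[nearest.argmax()])
    centers = np.array(centers)

    for _ in range(iters):
        dists = np.linalg.norm(embeddings[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = embeddings[labels == c].mean(axis=0)
    return labels

# Hypothetical toy data: 6 respondents, 4 questions, +1 = agree, -1 = disagree.
responses = np.array([
    [ 1,  1, -1, -1],
    [ 1,  1, -1, -1],
    [ 1,  1, -1,  1],
    [-1, -1,  1,  1],
    [-1, -1,  1,  1],
    [-1, -1,  1, -1],
], dtype=float)

emb = persona_embeddings(responses, rank=2)
labels = assign_cohorts(emb, k=2)
print(labels)
```

In this toy case the first three and last three respondents fall into separate cohorts even though no demographic attribute is used, which is the sense in which such a persona is "data-driven": it is defined by agreement patterns across questions rather than by group membership.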
Taxonomy
Topics: Persona Design and Applications
Methods: ALIGN
