On the steerability of large language models toward data-driven personas

Junyi Li; Ninareh Mehrabi; Charith Peris; Palash Goyal; Kai-Wei Chang; Aram Galstyan; Richard Zemel; Rahul Gupta

arXiv:2311.04978·cs.CL·April 4, 2024·2 cites

Open Access

TL;DR

This paper introduces a data-driven approach for steering large language models toward specific, nuanced personas derived through collaborative filtering, improving the controllability and diversity of generated viewpoints.

Contribution

It proposes a novel data-driven persona concept and an efficient steering method, significantly enhancing LLMs' ability to generate diverse perspectives.

Findings

1. Achieved 57-77% improvement in steerability over baselines.

2. Enabled generation of multiple viewpoints reflecting diverse social groups.

3. Enhanced control over LLM responses beyond demographic stereotypes.

Abstract

Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented. Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs, which can be leveraged to produce multiple perspectives and to reflect diverse opinions. Moving beyond the traditional reliance on demographics like age, gender, or party affiliation, we introduce a data-driven notion of persona grounded in collaborative filtering, which is defined as either a single individual or a cohort of individuals manifesting similar views across specific inquiries. As individuals in the same demographic group may have different personas, our data-driven persona definition allows for a more nuanced understanding of different (latent) social groups present in the population. In addition to this, we also explore an…
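
To make the data-driven persona idea concrete, here is a minimal sketch of how respondents might be embedded and grouped into cohorts from their answers alone, without demographic labels. It is not the paper's implementation: a truncated SVD stands in as a simple collaborative-filtering step, and the function names (`persona_embeddings`, `cohorts_by_similarity`), the toy response matrix, and the 0.9 similarity threshold are all assumptions made for illustration.

```python
import numpy as np


def persona_embeddings(responses: np.ndarray, dim: int = 2) -> np.ndarray:
    """Embed each respondent with a truncated SVD of the response matrix.

    Assumption: rows are individuals, columns are survey questions, and
    entries encode agreement (+1) or disagreement (-1).
    """
    # Remove each question's average answer so embeddings capture
    # how a respondent differs from the population.
    centered = responses - responses.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :dim] * s[:dim]


def cohorts_by_similarity(emb: np.ndarray, threshold: float = 0.9) -> list[int]:
    """Greedily group respondents whose embeddings point the same way.

    A respondent joins the first cohort whose seed member has cosine
    similarity >= threshold; otherwise it seeds a new cohort.
    """
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    seeds: list[int] = []
    labels: list[int] = []
    for i, vec in enumerate(unit):
        for cohort, seed in enumerate(seeds):
            if float(vec @ unit[seed]) >= threshold:
                labels.append(cohort)
                break
        else:
            seeds.append(i)
            labels.append(len(seeds) - 1)
    return labels


# Hypothetical toy data: 5 respondents answering 4 questions.
responses = np.array(
    [
        [1, 1, -1, -1],
        [1, 1, -1, -1],
        [-1, -1, 1, 1],
        [-1, -1, 1, 1],
        [1, -1, 1, -1],
    ],
    dtype=float,
)

emb = persona_embeddings(responses)
labels = cohorts_by_similarity(emb)
print(labels)
```

In this toy setup, respondents with matching answer patterns share a cohort regardless of any demographic attribute, which is the intuition behind a persona being "a single individual or a cohort of individuals manifesting similar views." A cohort of size one corresponds to the single-individual case.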

Taxonomy

Topics: Persona Design and Applications

Methods: ALIGN