ControlLM: Crafting Diverse Personalities for Language Models

Yixuan Weng; Shizhu He; Kang Liu; Shengping Liu; Jun Zhao

arXiv:2402.10151 · cs.CL · February 16, 2024 · 2 cites



TL;DR

ControlLM enables real-time, inference-time control of language model personalities, allowing for diverse, human-like behaviors and improved task performance without additional training.

Contribution

This work introduces ControlLM, a novel method leveraging differential activation patterns to control language model personalities at inference time.

Findings

01 ControlLM can elicit diverse persona behaviors without training.

02 It allows precise personality control matching human values.

03 It enhances reasoning and question answering through attribute amplification.

Abstract

As language models continue to scale in size and capability, they display an array of emerging behaviors, both beneficial and concerning. This heightens the need to control model behaviors. We aim to control the personality traits of language models at inference time so that they can exhibit varied character features, on top of which the requirements of different types of tasks can be met. Personality is a higher-level and more abstract behavioral representation for language models. We introduce ControlLM, which leverages differential activation patterns, derived from contrasting behavioral prompts in the model's latent space, to influence the model's personality traits at inference. This approach allows for the precise, real-time adjustment of model behavior. First, we demonstrate ControlLM's capacity to elicit diverse persona behaviors without any training, while precision…
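
The mechanism the abstract describes (a differential activation pattern taken from contrasting behavioral prompts, then applied at inference) can be illustrated with a toy sketch. This is not the code from wengsyx/controllm. It assumes the differential pattern is a difference of mean activations and that control is applied by adding a scaled copy of it to a hidden state; the abstract does not specify either detail. All function names, the `strength` parameter, and the three-dimensional activation vectors are hypothetical.

```python
# Toy sketch of activation-difference steering (assumed form, not the paper's code).

def mean_vector(vectors):
    """Element-wise mean of equal-length activation vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]

def control_direction(positive_acts, negative_acts):
    """Difference between the mean activations of two contrasting prompt sets."""
    pos = mean_vector(positive_acts)
    neg = mean_vector(negative_acts)
    return [p - n for p, n in zip(pos, neg)]

def steer(hidden_state, direction, strength=1.0):
    """Shift a hidden state along the direction; strength 0 leaves it unchanged."""
    return [h + strength * d for h, d in zip(hidden_state, direction)]

# Made-up activations standing in for one layer's output on prompts that
# express a trait (first set) and its opposite (second set).
trait_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
opposite_acts = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]

direction = control_direction(trait_acts, opposite_acts)
steered = steer([0.5, 0.5, 0.5], direction, strength=2.0)
```

In this sketch a positive `strength` pushes the hidden state toward the trait and a negative one pushes it away. No weights are updated, which is consistent with the training-free, inference-time control the abstract claims.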


Code & Models

Repositories

wengsyx/controllm
PyTorch · Official


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling