Personality Editing for Language Models through Adjusting Self-Referential Queries

Seojin Hwang; Yumin Kim; Byeongjeong Kim; Donghoon Shin; Hwanhee Lee

arXiv:2502.11789 · cs.CL · January 22, 2026

TL;DR

This paper introduces PALETTE, a novel method for editing the personality of large language models using self-referential queries, requiring minimal data and providing stable, balanced personality control.

Contribution

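PALETTE enables personality editing in LLMs through adjustment queries: self-referential statements grounded in psychological constructs that are edited in the same way as factual knowledge. It requires only 12 editing samples, whereas fine-tuning approaches demand large-scale training data.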

Findings

01. Achieves significant personality alignment with minimal data

02. Outperforms prompt-based approaches in stability and balance

03. Validated by both automatic and human evaluations

Abstract

Large Language Models (LLMs) are integral to applications such as conversational agents and content creation, where precise control over a model's personality is essential for maintaining tone, consistency, and user engagement. However, prevailing prompt-based or fine-tuning approaches either lack robustness or demand large-scale training data, making them costly and impractical. In this paper, we present PALETTE (Personality Adjustment by LLM SElf-TargeTed quEries), a novel method for personality editing in LLMs. Our approach introduces adjustment queries, where self-referential statements grounded in psychological constructs are treated analogously to factual knowledge, enabling direct editing of personality-related responses. Unlike fine-tuning, PALETTE requires only 12 editing samples to achieve substantial improvements in personality alignment across personality dimensions.…
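The idea of treating self-referential statements like editable facts can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `EditRequest` schema, the `build_adjustment_queries` helper, the prompt wording, and the example statements are hypothetical and are not taken from the paper, which may format its adjustment queries quite differently. The sketch only shows how a personality statement might be packaged as a prompt plus a desired answer, the same shape a factual edit request usually takes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditRequest:
    """One knowledge-editing style request (hypothetical schema, not the paper's)."""
    prompt: str       # self-referential query posed to the model
    target_new: str   # answer the edited model should prefer
    target_old: str   # answer the edit moves away from


def build_adjustment_queries(statements, agree=True):
    """Turn self-referential statements into edit requests.

    Each statement is wrapped in a question about the model itself, and the
    desired answer plays the role that a new fact plays in factual editing.
    """
    wanted, unwanted = ("Yes", "No") if agree else ("No", "Yes")
    requests = []
    for statement in statements:
        prompt = f'Does the statement "{statement}" describe you? Answer:'
        requests.append(EditRequest(prompt, wanted, unwanted))
    return requests


# Illustrative statements only; they are not taken from the paper.
EXAMPLE_STATEMENTS = [
    "I enjoy meeting new people",
    "I feel energized after a long conversation",
    "I prefer group activities to working alone",
]

if __name__ == "__main__":
    for request in build_adjustment_queries(EXAMPLE_STATEMENTS):
        print(request.prompt, "->", request.target_new)
```

Under these assumptions, a list of such requests (the abstract mentions 12 editing samples) would be handed to whatever model-editing procedure is in use; that step is deliberately left out here because the source does not describe it.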

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Topic Modeling