Editing Personality for Large Language Models
Shengyu Mao, Xiaohan Wang, Mengru Wang, Yong Jiang, Pengjun Xie, Fei Huang, Ningyu Zhang

TL;DR
This paper proposes a new task and dataset for editing the personality traits of Large Language Models, enabling controlled personality expression in responses based on social psychology theory.
Contribution
It introduces the PersonalityEdit benchmark dataset and explores methods for editing LLMs' personality traits aligned with social psychology concepts.
Findings
Identified challenges in editing personality traits of LLMs
Demonstrated the feasibility of trait-specific response generation
Highlighted remaining issues in personality editing
Abstract
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs). This task seeks to adjust the models' responses to opinion-related questions on specified topics since an individual's personality often manifests in the form of their expressed opinions, thereby showcasing different personality traits. Specifically, we construct PersonalityEdit, a new benchmark dataset to address this task. Drawing on the theory in Social Psychology, we isolate three representative traits, namely Neuroticism, Extraversion, and Agreeableness, as the foundation for our benchmark. We then gather data using GPT-4, generating responses that align with a specified topic and embody the targeted personality trait. We conduct comprehensive experiments involving various baselines and discuss the representation of personality behavior in LLMs. Our findings uncover…
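The task setup described in the abstract (a topic, a target trait, and an opinion question whose answer should express that trait) can be illustrated with a minimal sketch. Only the three trait names come from the abstract; the record layout, the field names, the question template, and the prompt wording are all hypothetical and are not taken from the PersonalityEdit dataset or the authors' code.

```python
from dataclasses import dataclass
from enum import Enum


class Trait(Enum):
    # The three traits named in the abstract.
    NEUROTICISM = "neuroticism"
    EXTRAVERSION = "extraversion"
    AGREEABLENESS = "agreeableness"


@dataclass(frozen=True)
class EditCase:
    # Hypothetical record layout; these field names are assumptions.
    topic: str
    target_trait: Trait
    question: str


def make_edit_case(topic: str, trait: Trait) -> EditCase:
    # Hypothetical opinion-question template for a given topic.
    question = f"What is your opinion of {topic}?"
    return EditCase(topic=topic, target_trait=trait, question=question)


def generation_prompt(case: EditCase) -> str:
    # Hypothetical prompt asking a generator model for a response that
    # stays on the topic and expresses the targeted trait.
    return (
        f"Answer the question about '{case.topic}' in a way that expresses "
        f"the personality trait {case.target_trait.value}.\n"
        f"Question: {case.question}"
    )


case = make_edit_case("remote work", Trait.EXTRAVERSION)
print(generation_prompt(case))
```

Keeping the trait as an enumeration rather than free text makes it harder for a case to name a trait outside the three the benchmark covers.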
Peer Reviews
Decision: Submitted to ICLR 2024
The research direction of personality editing presents a compelling and possibly influential field of study. The authors have developed a corresponding dataset through the data generation capabilities of GPT-4. Additionally, they offer a series of insightful experiments employing various baseline models within the task of personality editing. These models are benchmarked against the dataset, providing valuable findings that enhance our understanding of the subject.
The paper's exploration of personality editing in language models is certainly an intriguing endeavor, but there are aspects that invite scrutiny regarding its novelty and significance: The primary contribution of the paper is the introduction of the PersonalityEdit dataset, benchmarking established methods in the context of this new dataset. However, the paper could benefit from a more detailed analysis of the dataset construction process, particularly since the dataset is generated by prompti
- The paper addresses a unique and intriguing topic: adjusting the personality of LLMs. This is a fresh perspective on the capabilities of LLMs beyond their usual tasks.
- The paper employs a combination of knowledge editing and a scoring system to evaluate the alignment of LLM responses with specific personality traits. T
- The authors have acknowledged the potential biases in the pre-training corpus and the possibility of eliciting offensive or discriminatory content. This shows a responsibl
- Lack of significance tests
- The manuscript needs reorganization since many important points are in the Appendix
- Partial assessment of personality traits
- The paper introduces a new automatically generated dataset aligned with three of the Big Five personality types, which can be used to understand how LLMs interpret the personality types.
- The data generation process is simple and easy to follow.
- They evaluate the collected dataset against existing baselines and highlight the challenges.
- Motivation for the task is quite weak. I am not convinced that the task is novel. It is probably style transfer or conditional text generation by prompting LLMs, where personality defines the style of text. While they contrast it with style transfer, the justification for how it differs is not well supported.
- While the paper is easy to follow given the simplicity of the task, it fails to give a comprehensive overview of the properties of the dataset.
Minor:
- In abstract the last
Taxonomy
Topics: Artificial Intelligence in Law · Law, AI, and Intellectual Property · Library Science and Information Systems
Methods: Multi-Head Attention · Attention Is All You Need · Dropout · Dense Connections · Linear Layer · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection · Layer Normalization
