PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra

Xiachong Feng; Liang Zhao; Weihong Zhong; Yichong Huang; Yuxuan Gu; Lingpeng Kong; Xiaocheng Feng; Bing Qin

arXiv:2602.15669 · cs.AI · February 18, 2026

Open Access · 3 Reviews

TL;DR

PERSONA introduces a training-free method for dynamic personality control in large language models by manipulating personality vectors in activation space, enabling fine-grained, compositional, and context-aware personality adjustments.

Contribution

PERSONA presents a framework that extracts personality traits as approximately orthogonal vectors in activation space and manipulates them directly, avoiding costly fine-tuning and enabling dynamic personality adaptation.

Findings

01. Achieves near-supervised performance on PersonalityBench without training.

02. Demonstrates effective dynamic personality adaptation with 91% win rates.

03. Shows that personality traits are mathematically tractable in model representations.

Abstract

Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning, failing to capture the dynamic and compositional nature of human traits. We introduce PERSONA, a training-free framework that achieves fine-tuning level performance through direct manipulation of personality vectors in activation space. Our key insight is that personality traits appear as extractable, approximately orthogonal directions in the model's representation space that support algebraic operations. The framework operates through three stages: Persona-Base extracts orthogonal trait vectors via contrastive activation analysis; Persona-Algebra enables precise control through vector arithmetic (scalar multiplication for intensity, addition for composition, subtraction for suppression); and Persona-Flow achieves context-aware adaptation by dynamically composing these…
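The vector operations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a trait vector is the normalized mean difference between activations from trait-positive and trait-negative prompts (one common form of contrastive activation analysis), and it uses synthetic activations in place of a real model's hidden states. The names `extract_trait_vector`, `compose`, and `fake_activations` are hypothetical.

```python
import numpy as np

def extract_trait_vector(pos_acts, neg_acts):
    """Assumed extraction rule: unit-normalized difference of mean activations."""
    diff = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def compose(vectors, coeffs):
    """Weighted sum of trait vectors: the sign of a coefficient adds or
    suppresses a trait, and its magnitude sets the intensity."""
    return sum(c * v for c, v in zip(coeffs, vectors))

rng = np.random.default_rng(0)
dim, n = 64, 200
true_a = np.eye(dim)[0]  # synthetic ground-truth direction for trait A
true_b = np.eye(dim)[1]  # synthetic ground-truth direction for trait B

def fake_activations(direction, sign):
    """Synthetic stand-in for hidden states: Gaussian noise plus a trait offset."""
    return 0.5 * rng.normal(size=(n, dim)) + sign * 2.0 * direction

v_a = extract_trait_vector(fake_activations(true_a, +1), fake_activations(true_a, -1))
v_b = extract_trait_vector(fake_activations(true_b, +1), fake_activations(true_b, -1))

# Amplify trait A (coefficient 1.5) and suppress trait B (coefficient -0.5).
hidden = rng.normal(size=dim)
steered = hidden + compose([v_a, v_b], [1.5, -0.5])

shift_a = float((steered - hidden) @ v_a)  # movement along trait A
shift_b = float((steered - hidden) @ v_b)  # movement along trait B
print(f"shift along A: {shift_a:.2f}, shift along B: {shift_b:.2f}")
```

In a real setting the activations would come from a chosen layer of the model, and the steered hidden state would be written back during the forward pass; both steps are outside the scope of this sketch.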

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

The paper focuses on LLM personality, an important area, and offers a new way to use LLMs. The paper is well written and easy to follow.

Weaknesses

1. Although the architecture is training-free, it still needs the LLM to generate the personality vectors, which ties them to the LLM's training data and makes the performance unclear.
2. The paper gives three different personality vectors but does not describe how these three personalities are defined.
3. The figure in this paper should be larger.

Reviewer 02 · Rating 4 · Confidence 4

Strengths

* This paper is well motivated in reframing personality control as a problem of vector manipulation in activation space, providing a new geometric and interpretable perspective distinct from prompt engineering or fine-tuning.
* PERSONA-FLOW's ability to modulate personality adaptively during inference is promising for controllable LLMs. It shows that behavioral alignment can be achieved through lightweight inference-time adjustments rather than parameter updates.
* The proposed PERSONA-EVOLVE bench

Weaknesses

* Intuitively, personality-related features should be distributed across multiple layers of LLMs rather than concentrated in a single one. However, PERSONA-BASE depends on selecting the most effective layer without providing empirical justification. Further analysis and ablation experiments are needed to verify whether personality information is indeed localized within a particular layer.
* The improvements over NPTI appear marginal overall and are mostly confined to the Openness trait. It is th

Reviewer 03 · Rating 4 · Confidence 4

Strengths

The paper presents a well-structured framework with clear components (PERSONA-BASE, PERSONA-ALGEBRA, PERSONA-FLOW) that build upon each other logically. The empirical validation is thorough, testing on both external benchmarks and their custom PERSONA-EVOLVE dataset across multiple model architectures. The approach demonstrates impressive performance, matching fine-tuning results without requiring gradient updates, and the algebraic operations on personality vectors are well-validated through sy

Weaknesses

The paper lacks detailed analysis of failure cases or limitations of the approach, particularly in scenarios where personality traits might conflict. The extraction methodology relies heavily on GPT-4.1-mini for evaluation, which could introduce biases in how traits are defined and measured. Additionally, while the authors claim orthogonality of personality vectors, Figure 2 shows significant correlations between certain traits across dimensions, suggesting the extracted vectors aren't truly ort
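The orthogonality concern raised in this review can be made concrete with a pairwise cosine-similarity check. The sketch below is illustrative only: it uses random vectors as stand-ins for extracted trait vectors, assumes five Big Five traits (only Openness is named on this page), and the helper names `cosine_matrix` and `max_off_diagonal` are hypothetical. It does not reproduce the paper's Figure 2.

```python
import numpy as np

def cosine_matrix(vectors):
    """Pairwise cosine similarity between row vectors."""
    V = np.asarray(vectors, dtype=float)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    return V @ V.T

def max_off_diagonal(sim):
    """Largest absolute similarity between two distinct vectors."""
    off = sim - np.diag(np.diag(sim))
    return float(np.abs(off).max())

# Assumed trait labels; placeholders for real extracted vectors.
traits = ["Openness", "Conscientiousness", "Extraversion",
          "Agreeableness", "Neuroticism"]
rng = np.random.default_rng(0)
vectors = rng.normal(size=(len(traits), 512))

sim = cosine_matrix(vectors)
worst = max_off_diagonal(sim)
print(f"largest |cosine| between distinct traits: {worst:.3f}")
```

A value near 0 would support the "approximately orthogonal" description, while large off-diagonal entries would indicate the kind of cross-trait correlation the reviewer points to. Where to draw the line is a judgment call that this sketch does not settle.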

Taxonomy

Topics: Persona Design and Applications · Personality Traits and Psychology · Topic Modeling