Persona Vectors in Games: Measuring and Steering Strategies via Activation Vectors

Johnathan Sun; Andrew Zhang

arXiv:2603.21398·cs.AI·March 24, 2026

Open Access

TL;DR

This paper constructs persona vectors for large language models via contrastive activation addition and uses them to measure and steer high-level strategic behavior in game-theoretic settings. Steering systematically shifts both strategic choices and stated justifications, although rhetoric and strategy can diverge.

Contribution

The paper presents a method for constructing persona vectors in LLMs and applying them to measure and steer strategic traits in game-theoretic environments, with the aim of improving interpretability and control.

Findings

01. Activation steering shifts strategic choices and justifications systematically.

02. Rhetoric and strategy can diverge under persona steering.

03. Self-behavior and expectations vectors are partially distinct.

Abstract

Large language models (LLMs) are increasingly deployed as autonomous decision-makers in strategic settings, yet we have limited tools for understanding their high-level behavioral traits. We use activation steering methods in game-theoretic settings, constructing persona vectors for altruism, forgiveness, and expectations of others by contrastive activation addition. Evaluating on canonical games, we find that activation steering systematically shifts both quantitative strategic choices and natural-language justifications. However, we also observe that rhetoric and strategy can diverge under steering. In addition, vectors for self-behavior and expectations of others are partially distinct. Our results suggest that persona vectors offer a promising mechanistic handle on high-level traits in strategic environments.
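
The abstract names contrastive activation addition but gives no implementation details. The following is a minimal sketch of the general technique, not the authors' code: a persona vector is taken as the difference between mean activations on trait-positive and trait-negative prompts, and steering adds a scaled copy of that vector to a hidden state. The function names, the scale `alpha`, the dimension `d`, and the synthetic activations are all assumptions made for illustration. The cosine helper is one common way to compare two vectors; the source does not say which measure the authors use.

```python
import numpy as np

def persona_vector(pos_acts: np.ndarray, neg_acts: np.ndarray) -> np.ndarray:
    """Difference of mean activations over two contrastive prompt sets.

    Both inputs have shape (n_prompts, d); the result has shape (d,).
    """
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def steer(hidden: np.ndarray, vector: np.ndarray, alpha: float) -> np.ndarray:
    """Add alpha * vector to each hidden state (broadcast over positions)."""
    return hidden + alpha * vector

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Synthetic stand-ins for activations captured at one layer of a model.
# In practice these would come from running contrastive prompts
# (e.g. altruistic vs. selfish instructions) through the LLM.
rng = np.random.default_rng(0)
d = 16                                   # hypothetical hidden size
trait_dir = np.zeros(d)
trait_dir[0] = 1.0                       # hypothetical "true" trait direction
pos = rng.normal(size=(32, d)) + 2.0 * trait_dir
neg = rng.normal(size=(32, d)) - 2.0 * trait_dir

v_altruism = persona_vector(pos, neg)

# Steer a batch of hidden states (5 token positions) toward the trait.
hidden = rng.normal(size=(5, d))
steered = steer(hidden, v_altruism, alpha=1.5)

# Compare the recovered vector with the planted direction.
similarity = cosine(v_altruism, trait_dir)
```

In a real model the addition would typically be applied to the residual stream at a chosen layer during generation. Which layer and scale to use are empirical choices that the abstract does not specify.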

Taxonomy

Topics: Persona Design and Applications · Artificial Intelligence in Law · AI in Service Interactions