Controlling Chat Style in Language Models via Single-Direction Editing

Zhenyu Xu; Victor S. Sheng

arXiv:2603.03324·cs.CL·March 5, 2026

TL;DR

This paper demonstrates that stylistic attributes in large language models can be controlled through linear directions in activation space, enabling precise, training-free style editing with minimal computational overhead.

Contribution

It provides empirical evidence that style attributes are linearly encoded and introduces a lightweight, training-free method for style control in LLMs.

Findings

01. High style adherence achieved across multiple models

02. Supports linear style composition

03. Enhances safety by removing undesirable behaviors

Abstract

Controlling stylistic attributes in large language models (LLMs) remains challenging, with existing approaches relying on either prompt engineering or post-training alignment. This paper investigates this challenge through the lens of representation engineering, testing the hypothesis that distinct stylistic attributes, from emotional tone to linguistic structure, are encoded as linear directions in the model's activation space. We provide strong empirical evidence for this hypothesis across a wide range of styles and, based on this finding, present a lightweight, training-free method for precise style control. Our approach supports linear style composition, enhances safety by ablating undesirable behaviors, and, as confirmed by experiments on over a dozen models, achieves high style adherence while preserving core capabilities at minimal computational cost.
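The abstract does not spell out how a style direction is extracted or applied, so the following is only a minimal sketch of what single-direction editing could look like, assuming a difference-of-means recipe that is common in representation engineering. It is not the authors' implementation. The function names (`style_direction`, `steer`, `ablate`, `compose`), the strength parameter `alpha`, and the synthetic activations are all hypothetical and used purely for illustration.

```python
import numpy as np

def style_direction(styled_acts, neutral_acts):
    """Assumed recipe: unit vector along the difference of mean activations."""
    diff = styled_acts.mean(axis=0) - neutral_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def steer(hidden, direction, alpha):
    """Push every hidden state along the direction by strength alpha."""
    return hidden + alpha * direction

def ablate(hidden, direction):
    """Remove each hidden state's component along a unit-norm direction."""
    return hidden - np.outer(hidden @ direction, direction)

def compose(directions, weights):
    """Linear style composition: a weighted sum of direction vectors."""
    return sum(w * d for w, d in zip(weights, directions))

# Synthetic stand-in for real model activations (hypothetical data):
# the "style" shifts activations along the first coordinate axis.
rng = np.random.default_rng(0)
d_model = 16
true_dir = np.zeros(d_model)
true_dir[0] = 1.0
neutral = rng.normal(size=(200, d_model))
styled = rng.normal(size=(200, d_model)) + 3.0 * true_dir

v = style_direction(styled, neutral)
hidden = rng.normal(size=(5, d_model))   # five token positions
steered = steer(hidden, v, alpha=4.0)    # add the style
ablated = ablate(hidden, v)              # remove the style
mixed = compose([v, true_dir], [1.0, 0.5])
```

In a real model the edit would be applied to the hidden states of one or more layers at inference time, for example through a forward hook. Which layers and what strength to use are choices this sketch leaves open.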

Taxonomy

Topics: Topic Modeling · Mental Health via Writing · Authorship Attribution and Profiling