Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models

Xinyan Jiang; Lin Zhang; Jiayi Zhang; Qingsong Yang; Guimin Hu; Di Wang; Lijie Hu

arXiv:2508.10599 · cs.AI · April 28, 2026

TL;DR

This paper introduces MSRS, an activation steering framework for large language models that controls multiple attributes at once by confining each attribute's influence to its own orthogonal subspace, which reduces interference between attributes.

Contribution

According to the authors, MSRS is the first method to allocate an orthogonal subspace to each attribute. It combines attribute-specific subspaces with a shared subspace and uses a dynamic weighting function to integrate them for precise multi-attribute steering.
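The reason orthogonal subspaces limit interference can be written out in a few lines. This is only a sketch under assumed notation: the symbols B_s (shared basis), B_1 ... B_k (attribute bases), a_i (steering coefficients), and w_i (mixing weights) are illustrative and are not taken from the paper.

```latex
\begin{aligned}
&B = [\,B_s \;\; B_1 \;\; \cdots \;\; B_k\,], \qquad B^\top B = I
  \;\Rightarrow\; B_i^\top B_j = 0 \ \ (i \neq j), \quad B_i^\top B_i = I,\\
&h' = h + \sum_{i \in \{s,1,\dots,k\}} w_i\, B_i a_i,\\
&B_j^\top (h' - h) = \sum_{i} w_i\, B_j^\top B_i\, a_i = w_j\, a_j .
\end{aligned}
```

Under these assumptions, the change seen inside subspace j depends only on that subspace's own coefficients and weight, so edits made for one attribute do not leak into the coordinates used by another.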

Findings

01. MSRS reduces attribute conflicts compared to existing methods.

02. MSRS outperforms prior approaches across various attributes.

03. MSRS generalizes well to diverse downstream tasks.

Abstract

Activation steering offers a promising approach to controlling the behavior of Large Language Models by directly manipulating their internal activations. However, most existing methods struggle to jointly steer multiple attributes, often resulting in interference and undesirable trade-offs. To address this challenge, we propose Multi-Subspace Representation Steering (MSRS), a novel framework for effective multi-attribute steering via subspace representation fine-tuning. MSRS reduces inter-attribute interference by allocating orthogonal subspaces to each attribute, isolating their influence within the model's representation space. MSRS also incorporates a hybrid subspace composition strategy: it combines attribute-specific subspaces for unique steering directions with a shared subspace for common steering directions. A dynamic weighting function learns to efficiently integrate these…
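The ideas in the abstract (orthogonal subspaces, a shared plus attribute-specific composition, and input-dependent weights) can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the QR-based basis construction, the additive update, and the softmax gate standing in for the dynamic weighting function are all assumptions made for illustration, and the coefficients are random rather than learned.

```python
import numpy as np

def make_orthogonal_subspaces(hidden_dim, ranks, seed=0):
    """Split one orthonormal basis into disjoint column blocks, one per subspace.

    Hypothetical helper: the paper's actual allocation procedure may differ.
    """
    rng = np.random.default_rng(seed)
    total = sum(ranks)
    assert total <= hidden_dim, "subspace ranks must fit in the hidden dimension"
    # Reduced QR gives a (hidden_dim, total) matrix with orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((hidden_dim, total)))
    # Cut the columns into consecutive blocks of the requested ranks.
    return np.split(q, np.cumsum(ranks)[:-1], axis=1)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def steer(h, bases, coeffs, gate_w):
    """Add a weighted sum of per-subspace steering vectors to hidden state h.

    h:       (d,) hidden state
    bases:   list of (d, r_i) orthonormal blocks (index 0 plays the shared role)
    coeffs:  list of (r_i,) steering coefficients (assumed already learned)
    gate_w:  (n_subspaces, d) assumed linear gate producing one weight per subspace
    """
    weights = softmax(gate_w @ h)  # input-dependent mixing weights
    delta = sum(w * (B @ a) for w, B, a in zip(weights, bases, coeffs))
    return h + delta, weights

# Example: one shared subspace and two attribute-specific subspaces.
d = 16
bases = make_orthogonal_subspaces(d, ranks=[2, 3, 3])
rng = np.random.default_rng(1)
h = rng.standard_normal(d)
coeffs = [rng.standard_normal(B.shape[1]) for B in bases]
gate_w = rng.standard_normal((len(bases), d))
h_new, weights = steer(h, bases, coeffs, gate_w)

# The update seen inside subspace j depends only on subspace j's own terms.
for j, B in enumerate(bases):
    assert np.allclose(B.T @ (h_new - h), weights[j] * coeffs[j])
```

Because the blocks come from a single orthonormal matrix, projecting the update onto any one block recovers only that block's weighted coefficients, which is the isolation property the abstract describes. In a trained system the coefficients and gate would be fit to data rather than sampled.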

Code & Models

Model (Hugging Face): ryanyen22/reason-first-program