SteerX: Disentangled Steering for LLM Personalization

Xiaoyan Zhao; Ming Yan; Yilun Qiu; Haoting Ni; Yang Zhang; Fuli Feng; Hong Cheng; Tat-Seng Chua

arXiv:2510.22256 · cs.CL · October 28, 2025

TL;DR

SteerX is a disentangled activation steering method for LLM personalization. It isolates the preference-driven signal in a user's history so that the resulting steering vector aligns model outputs with that user more accurately.

Contribution

The paper proposes SteerX, a disentangled steering approach grounded in causal inference that separates the preference-driven components of a user's historical data from the preference-agnostic ones, so that only the former shape the steering vector.

Findings

01

SteerX improves steering vector quality across multiple methods.

02

Enhanced personalization results in more accurate LLM responses.

03

Experiments show consistent gains on real-world datasets.

Abstract

Large language models (LLMs) have shown remarkable success in recent years, enabling a wide range of applications, including intelligent assistants that support users' daily life and work. A critical factor in building such assistants is personalizing LLMs, as user preferences and needs vary widely. Activation steering, which directly leverages directions representing user preference in the LLM activation space to adjust its behavior, offers a cost-effective way to align the model's outputs with individual users. However, existing methods rely on all historical data to compute the steering vector, ignoring that not all content reflects true user preferences, which undermines the personalization signal. To address this, we propose SteerX, a disentangled steering method that isolates preference-driven components from preference-agnostic components. Grounded in causal inference theory,…
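
To make the abstract's setup concrete, here is a minimal sketch of generic difference-of-means activation steering on toy vectors. It is not the SteerX algorithm: the names `steering_vector`, `steer`, and `alpha` are illustrative, and the `is_preference_driven` flags are hypothetical hand-written labels that stand in for whatever mechanism separates preference-driven from preference-agnostic history. The sketch only shows why averaging over all history can dilute the steering direction.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]


def steering_vector(user_acts: Sequence[Sequence[float]],
                    baseline_acts: Sequence[Sequence[float]]) -> Vector:
    """Difference of means: user activations minus baseline activations."""
    mu_user = mean_vector(user_acts)
    mu_base = mean_vector(baseline_acts)
    return [u - b for u, b in zip(mu_user, mu_base)]


def steer(hidden: Sequence[float], vector: Sequence[float],
          alpha: float = 1.0) -> Vector:
    """Shift a hidden state along the steering direction by strength alpha."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy 2-d "activations" for three items in one user's history (assumed data).
user_acts = [[1.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
# Hypothetical labels: the third item does not reflect the user's preference.
is_preference_driven = [True, True, False]
baseline_acts = [[0.0, 0.0], [0.0, 0.0]]

# Vector from all history: the third item pulls it off the preference axis.
v_all = steering_vector(user_acts, baseline_acts)

# Vector from the items flagged as preference-driven only.
kept = [a for a, keep in zip(user_acts, is_preference_driven) if keep]
v_filtered = steering_vector(kept, baseline_acts)

# Apply the filtered vector to a toy hidden state.
steered = steer([0.5, 0.5], v_filtered, alpha=0.5)
```

In a real model the vectors would come from a chosen layer's hidden states and the shift would be applied during the forward pass. The choice of layer and of `alpha` is left open here because the truncated abstract does not specify either.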
