Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering

Jessica Y. Bo, Tianyu Xu, Ishan Chatterjee, Katrina Passarella-Ward, Achin Kulshrestha, and D Shin

arXiv:2505.04260·cs.HC·May 15, 2025

Open Access

TL;DR

This paper introduces a lightweight activation steering method to personalize large language model chatbots, enabling users to align responses with their preferences more effectively and transparently.

Contribution

It presents a novel, user-controlled activation steering technique for LLMs, integrated into chatbots, enhancing personalization without extensive user history or complex memory-based methods.

Findings

01. Preference-based steering effectively aligns responses with user preferences.

02. Users prefer different interfaces based on control, usability, and transparency.

03. Steering improves personalization in real-world chatbot conversations.

Abstract

As large language models (LLMs) improve in their capacity to serve as personal AI assistants, their ability to output uniquely tailored, personalized responses that align with the soft preferences of their users is essential for enhancing user satisfaction and retention. However, untrained lay users have poor prompt specification abilities and often struggle with conveying their latent preferences to AI assistants. To address this, we leverage activation steering to guide LLMs to align with interpretable preference dimensions during inference. In contrast to memory-based personalization methods that require longer user history, steering is extremely lightweight and can be easily controlled by the user via a linear strength factor. We embed steering into three different interactive chatbot interfaces and conduct a within-subjects user study (n=14) to investigate how end users prefer to…
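To make the "linear strength factor" concrete, here is a minimal NumPy sketch of the general activation steering idea. It is not the authors' implementation: the difference-of-means recipe for the steering vector, the toy random activations, and the names `steering_vector`, `steer`, and `strength` are all assumptions chosen for illustration. In a real model, the shift would be applied to a layer's hidden states during inference.

```python
import numpy as np


def steering_vector(pos_acts: np.ndarray, neg_acts: np.ndarray) -> np.ndarray:
    """Assumed recipe: difference of mean activations between two contrastive
    sets of examples (one per end of a preference dimension)."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)


def steer(hidden: np.ndarray, vector: np.ndarray, strength: float) -> np.ndarray:
    """Shift a hidden state along the preference direction, scaled linearly
    by a user-controlled strength factor."""
    return hidden + strength * vector


# Toy stand-ins for hidden-layer activations (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
dim = 8
direction = np.zeros(dim)
direction[0] = 1.0
pos_acts = rng.normal(size=(16, dim)) + direction  # e.g. one end of a preference
neg_acts = rng.normal(size=(16, dim)) - direction  # e.g. the opposite end

v = steering_vector(pos_acts, neg_acts)
hidden = rng.normal(size=dim)

# A larger strength moves the hidden state further along the preference direction.
for strength in (-1.0, 0.0, 1.0):
    steered = steer(hidden, v, strength)
    print(strength, float(steered @ v))
```

A strength of zero leaves the hidden state unchanged, and the projection onto the steering vector grows linearly with the strength value, which is what would let a single slider or dial act as the user-facing control.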

Taxonomy

Topics: AI in Service Interactions · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)

Methods: ALIGN