Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering
Jessica Y. Bo, Tianyu Xu, Ishan Chatterjee, Katrina Passarella-Ward, Achin Kulshrestha, and D Shin

TL;DR
This paper introduces a lightweight activation steering method to personalize large language model chatbots, enabling users to align responses with their preferences more effectively and transparently.
Contribution
It presents a novel, user-controlled activation steering technique for LLMs, integrated into chatbots, enhancing personalization without extensive user history or complex memory-based methods.
Findings
Preference-based steering effectively aligns responses with user preferences.
Users favor different interfaces depending on how much they value control, usability, and transparency.
Steering improves personalization in real-world chatbot conversations.
Abstract
As large language models (LLMs) improve in their capacity to serve as personal AI assistants, their ability to output uniquely tailored, personalized responses that align with the soft preferences of their users is essential for enhancing user satisfaction and retention. However, untrained lay users have poor prompt specification abilities and often struggle with conveying their latent preferences to AI assistants. To address this, we leverage activation steering to guide LLMs to align with interpretable preference dimensions during inference. In contrast to memory-based personalization methods that require longer user history, steering is extremely lightweight and can be easily controlled by the user via a linear strength factor. We embed steering into three different interactive chatbot interfaces and conduct a within-subjects user study (n=14) to investigate how end users prefer to…
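To make the "linear strength factor" concrete, the following is a minimal sketch of activation steering on plain Python lists. It is not the authors' implementation. The function names (`mean_vector`, `steering_vector`, `steer`) are hypothetical, and building the steering direction as a difference of mean activations between contrasting prompts is a common choice assumed here for illustration, not something the abstract confirms. The one idea taken from the abstract is that a hidden activation is shifted along a preference direction, scaled by a user-controlled strength.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(positive: List[Vector], negative: List[Vector]) -> Vector:
    """Assumed construction: mean activation on prompts that express a
    preference minus mean activation on prompts that express its opposite."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def steer(hidden: Vector, direction: Vector, strength: float) -> Vector:
    """Shift a hidden activation along the direction, scaled linearly.

    strength = 0 leaves the activation unchanged; a negative strength
    pushes toward the opposite end of the preference dimension.
    """
    return [h + strength * d for h, d in zip(hidden, direction)]


# Toy 2-dimensional activations standing in for a real model layer.
direction = steering_vector(
    positive=[[1.0, 2.0], [3.0, 4.0]],
    negative=[[0.0, 0.0], [2.0, 2.0]],
)
steered = steer([0.5, 0.5], direction, strength=2.0)
```

In a real model the same addition would be applied to a chosen layer's activations at inference time, and the strength is the single scalar a chatbot interface could expose to the user, for example as a slider.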
Taxonomy
Topics: AI in Service Interactions · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
Methods: ALIGN
