LikeBench: Evaluating Subjective Likability in LLMs for Personalization
Md Awsafur Rahman, Adam Gabrys, Doug Kang, Jingjing Sun, Tian Tan, and Ashwin Chandramouli

TL;DR
LikeBench is a framework for evaluating the subjective likability of LLMs by measuring how well they adapt to user preferences across multiple dimensions over extended interactions.
Contribution
The paper presents LikeBench, a novel multi-dimensional, multi-session evaluation framework for likability, incorporating psychologically grounded metrics and realistic user personas.
Findings
Strong memory does not guarantee high likability.
DeepSeek R1 outperforms Qwen3 in likability despite lower memory accuracy.
State-of-the-art models like GPT-5 show limited robustness in long interactions.
Abstract
A personalized LLM should remember user facts, apply them correctly, and adapt over time to provide responses that the user prefers. Existing LLM personalization benchmarks are largely centered on two axes: accurately recalling user information and accurately applying remembered information in downstream tasks. We argue that a third axis, likability, is both subjective and central to user experience, yet under-measured by current benchmarks. To measure likability holistically, we introduce LikeBench, a multi-session, dynamic evaluation framework that measures likability across multiple dimensions by how much an LLM can adapt over time to a user's preferences to provide more likable responses. In LikeBench, the LLMs engage in conversation with a simulated user and learn preferences only from the ongoing dialogue. As the interaction unfolds, models try to adapt to responses, and after…
Taxonomy
Topics: Persona Design and Applications · Personal Information Management and User Behavior · Recommender Systems and Techniques
