LikeBench: Evaluating Subjective Likability in LLMs for Personalization

Md Awsafur Rahman; Adam Gabrys; Doug Kang; Jingjing Sun; Tian Tan; and Ashwin Chandramouli

arXiv:2512.13077 · cs.LG · December 16, 2025

TL;DR

LikeBench introduces a comprehensive framework for evaluating the subjective likability of LLMs by measuring their ability to adapt to user preferences across multiple dimensions over extended interactions.

Contribution

The paper presents LikeBench, a novel multi-dimensional, multi-session evaluation framework for likability, incorporating psychologically grounded metrics and realistic user personas.

Findings

01. Strong memory does not guarantee high likability.

02. DeepSeek R1 outperforms Qwen3 in likability despite lower memory accuracy.

03. State-of-the-art models like GPT-5 show limited robustness in long interactions.

Abstract

A personalized LLM should remember user facts, apply them correctly, and adapt over time to provide responses that the user prefers. Existing LLM personalization benchmarks are largely centered on two axes: accurately recalling user information and accurately applying remembered information in downstream tasks. We argue that a third axis, likability, is both subjective and central to user experience, yet under-measured by current benchmarks. To measure likability holistically, we introduce LikeBench, a multi-session, dynamic evaluation framework that measures likability across multiple dimensions by how much an LLM can adapt over time to a user's preferences to provide more likable responses. In LikeBench, the LLMs engage in conversation with a simulated user and learn preferences only from the ongoing dialogue. As the interaction unfolds, models try to adapt to responses, and after…
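The evaluation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `SimulatedUser`, `toy_model`, `run_sessions`, and `adaptation_gain` are hypothetical names, the single hidden preference (reply length) stands in for a realistic persona, and likability is collapsed to one scalar score rather than multiple dimensions. The sketch also assumes the dialogue history carries over between sessions. It shows only the general shape: the model sees nothing but the dialogue, the simulated user scores each reply, and adaptation is read off as the change in score over time.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass
class SimulatedUser:
    """Hypothetical simulated user with one hidden preference."""
    preferred_length: int  # ideal reply length in words (never shown to the model)

    def rate(self, reply: str) -> float:
        """Likability score in [0, 1]; replies nearer the preferred length score higher."""
        gap = abs(len(reply.split()) - self.preferred_length)
        return max(0.0, 1.0 - gap / self.preferred_length)

    def feedback(self, reply: str) -> str:
        """Natural-language reaction that becomes part of the dialogue."""
        n = len(reply.split())
        if n > self.preferred_length:
            return "shorter please"
        if n < self.preferred_length:
            return "more detail please"
        return "perfect"


def toy_model(history: List[str]) -> str:
    """Toy stand-in for an LLM: starts at 12 words and shifts by 2 per feedback message."""
    length = 12
    for message in history:
        if message == "shorter please":
            length -= 2
        elif message == "more detail please":
            length += 2
    return " ".join(["word"] * max(1, length))


def run_sessions(user: SimulatedUser,
                 model: Callable[[List[str]], str],
                 sessions: int = 2,
                 turns: int = 3) -> List[float]:
    """Run a multi-session dialogue and return the per-turn likability scores."""
    history: List[str] = []  # assumption: user messages persist across sessions
    scores: List[float] = []
    for _ in range(sessions):
        for _ in range(turns):
            reply = model(history)            # model sees only the dialogue so far
            scores.append(user.rate(reply))   # user scores the reply after each turn
            history.append(user.feedback(reply))
    return scores


def adaptation_gain(scores: List[float]) -> float:
    """Mean score in the second half of the interaction minus the first half."""
    half = len(scores) // 2
    return mean(scores[half:]) - mean(scores[:half])


scores = run_sessions(SimulatedUser(preferred_length=6), toy_model)
gain = adaptation_gain(scores)
```

A positive `adaptation_gain` indicates that replies became more likable as the interaction went on. A model with perfect recall of the feedback but no change in behavior would score a gain of zero, which is the distinction between memory and likability that the findings draw.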

Taxonomy

Topics: Persona Design and Applications · Personal Information Management and User Behavior · Recommender Systems and Techniques