Can Large Language Models Understand Preferences in Personalized Recommendation?

Zhaoxuan Tan; Zinan Zeng; Qingkai Zeng; Zhenyu Wu; Zheyuan Liu; Fengran Mo; Meng Jiang

arXiv:2501.13391 · cs.CL · January 24, 2025

TL;DR

This paper introduces PerRecBench, a new evaluation framework for personalized recommendation that isolates user preferences from rating biases, revealing limitations of current LLM-based methods and highlighting the need for improved preference understanding.

Contribution

The paper proposes PerRecBench to evaluate LLMs on personalized preferences independently of rating biases and compares various ranking approaches and fine-tuning strategies.

Findings

01

Larger LLMs outperform smaller ones but still struggle with personalized preferences.

02

Pairwise and listwise ranking methods outperform pointwise methods.

03

Traditional regression metrics correlate poorly with recommendation quality in this context.
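To make the third finding concrete, here is a minimal sketch of how a regression error and a ranking measure can disagree. The ratings and both sets of predictions are invented toy values, not results from the paper, and Kendall's tau is used here only as a stand-in ranking measure; the paper's own metrics may differ.

```python
from itertools import combinations

def mae(true, pred):
    """Mean absolute error between two equal-length rating lists."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)

def kendall_tau(true, pred):
    """Kendall's tau without tie handling: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(true)), 2):
        sign = (true[i] - true[j]) * (pred[i] - pred[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(true) * (len(true) - 1) // 2
    return (concordant - discordant) / n_pairs

# Toy values (assumed for illustration only).
true_ratings = [3.0, 3.5, 4.0]
pred_close = [3.6, 3.5, 3.4]    # close to the true values, but in reversed order
pred_ordered = [1.0, 2.0, 3.0]  # far from the true values, but in the right order

for name, pred in [("close", pred_close), ("ordered", pred_ordered)]:
    print(name, "MAE:", round(mae(true_ratings, pred), 3),
          "tau:", kendall_tau(true_ratings, pred))
```

In this toy case the predictor with the smaller error ranks the items in exactly the wrong order, which is the kind of mismatch the finding describes.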

Abstract

Large Language Models (LLMs) excel in various tasks, including personalized recommendations. Existing evaluation methods often focus on rating prediction, relying on regression errors between actual and predicted ratings. However, user rating bias and item quality, two influential factors behind rating scores, can obscure personal preferences in user-item pair data. To address this, we introduce PerRecBench, disassociating the evaluation from these two factors and assessing recommendation techniques on capturing the personal preferences in a grouped ranking manner. We find that the LLM-based recommendation techniques that are generally good at rating prediction fail to identify users' favored and disfavored items when the user rating bias and item quality are eliminated by grouping users. With PerRecBench and 19 LLMs, we find that while larger models generally outperform smaller ones,…
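The grouped ranking idea in the abstract can be illustrated with a short sketch. It assumes that a user's rating bias is estimated as the mean of their past ratings, and that a group consists of users who rated the same item, so item quality is constant within the group. These are assumptions made for illustration; the names `relative_ratings` and `rank_users` and the data are hypothetical, and the benchmark's actual procedure may differ.

```python
from statistics import mean

def relative_ratings(history, group_ratings):
    """Subtract each user's average past rating from their rating of the shared item."""
    return {user: rating - mean(history[user])
            for user, rating in group_ratings.items()}

def rank_users(scores):
    """Order users from most to least favourable; ties are broken by name."""
    return sorted(scores, key=lambda user: (-scores[user], user))

# Hypothetical past ratings per user (toy data).
history = {
    "alice": [5, 5, 5, 4],  # generous rater
    "bob": [2, 3, 2, 1],    # harsh rater
    "carol": [4, 4, 3, 5],
}
# Ratings the three users gave to the same item.
group_ratings = {"alice": 4, "bob": 3, "carol": 4}

relative = relative_ratings(history, group_ratings)
ranking = rank_users(relative)
print(relative)
print(ranking)
```

Here bob gives the lowest raw rating yet ranks first once his own rating habits are subtracted, while alice's 4 falls below her usual level. A model evaluated this way has to recover that ordering rather than the raw scores.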

Code & Models

Repositories

tamsiuhin/perrecbench (PyTorch · Official)

Taxonomy

Topics: Recommender Systems and Techniques · Topic Modeling · Mental Health via Writing

Methods: Focus