Understanding the Role of Cross-Entropy Loss in Fairly Evaluating Large Language Model-based Recommendation

Cong Xu; Zhangchi Zhu; Jun Wang; Jianyong Wang; Wei Zhang

arXiv:2402.06216 · cs.IR · February 23, 2024 · 1 citation

TL;DR

This paper critically evaluates the effectiveness of large language models in recommendation tasks, revealing that their perceived superiority is often overstated due to unfair comparison methods and highlighting the importance of proper evaluation standards.

Contribution

It provides a theoretical justification for the superiority of cross-entropy loss and shows that elementary approximations of it, with certain necessary modifications, can adequately replace it, challenging previous claims of LLMs' dominance in recommendation.

Findings

01. Cross-entropy loss with a full softmax is theoretically superior to the pointwise/pairwise losses that conventional recommenders typically use.

02. Existing LLM-based methods are less effective than previously claimed.

03. Proper evaluation reveals traditional methods can perform competitively.

Abstract

Large language models (LLMs) have gained much attention in the recommendation community; some studies have observed that LLMs, fine-tuned by the cross-entropy loss with a full softmax, can already achieve state-of-the-art performance. However, these claims are drawn from comparisons that are neither objective nor fair. In view of the substantial quantity of items in reality, conventional recommenders typically adopt a pointwise/pairwise loss function instead for training. This substitute, however, causes severe performance degradation, leading to under-estimation of conventional methods and over-confidence in the ranking capability of LLMs. In this work, we theoretically justify the superiority of cross-entropy, and showcase that it can be adequately replaced by some elementary approximations with certain necessary modifications. The remarkable results across three public datasets corroborate that…
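
To make the contrast in the abstract concrete, here is a minimal sketch of the three loss families it mentions: cross-entropy with a full softmax over the catalog, a pointwise loss, and a pairwise loss. The abstract does not name specific pointwise or pairwise losses, so binary cross-entropy (BCE) and Bayesian personalized ranking (BPR) with a single sampled negative are used here as common stand-ins. The function names, the toy scores, and the choice of negative item are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def full_softmax_ce(scores, target):
    # Cross-entropy with a full softmax: the target item competes
    # against every item in the catalog through the normalizer.
    m = scores.max()
    log_z = m + np.log(np.exp(scores - m).sum())
    return log_z - scores[target]

def bce_loss(scores, target, negative):
    # Pointwise stand-in: the target and one sampled negative are
    # each scored independently as a binary classification.
    return -(log_sigmoid(scores[target]) + log_sigmoid(-scores[negative]))

def bpr_loss(scores, target, negative):
    # Pairwise stand-in: only the score gap between the target and
    # one sampled negative matters.
    return -log_sigmoid(scores[target] - scores[negative])

# Hypothetical scores for a 4-item catalog; item 0 is the target,
# item 2 is the sampled negative.
scores = np.array([2.0, 0.5, -1.0, 0.0])
target, negative = 0, 2

ce = full_softmax_ce(scores, target)
bce = bce_loss(scores, target, negative)
bpr = bpr_loss(scores, target, negative)
print(f"full-softmax CE: {ce:.4f}, BCE: {bce:.4f}, BPR: {bpr:.4f}")
```

In this sketch the full-softmax loss reads every score in the catalog, so its cost grows with the number of items, while the BCE and BPR stand-ins read only the target and one negative. That cost gap is the practical reason the abstract gives for conventional recommenders adopting pointwise/pairwise substitutes.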

Taxonomy

Topics: Topic Modeling