Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens

Ting-Ji Huang; Jia-Qi Yang; Chunxu Shen; Kai-Qi Liu; De-Chuan Zhan; Han-Jia Ye

arXiv:2406.08477·cs.IR·June 13, 2024·1 cites

Open Access

TL;DR

This paper introduces a novel approach to improve LLM-based recommender systems by incorporating out-of-vocabulary tokens that better represent users and items, leading to enhanced recommendation performance.

Contribution

The paper proposes a method to tokenize users and items with OOV tokens in LLMs, improving their ability to distinguish and relate users and items in recommendation tasks.

Findings

01. Outperforms existing methods on multiple recommendation benchmarks.

02. OOV tokens capture user-item correlations and diversity effectively.

03. Clustering representations enhances token sharing among similar users/items.
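The idea of sharing tokens among similar items through clustering can be illustrated with a small sketch. This is not the authors' algorithm: plain k-means with farthest-first initialisation is swapped in as a stand-in, and the embeddings, the function names `sq_dist` and `cluster_tokens`, and the `<item_cN>` token format are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_tokens(embeddings, k, iters=10, prefix="item_c"):
    """Group items with k-means and give each cluster one shared token.

    embeddings: dict mapping an item ID to a list of floats (assumed to come
    from some pretrained recommender; here they are made up).
    Returns a dict mapping each item ID to its cluster token.
    """
    ids = sorted(embeddings)

    # Farthest-first initialisation: start from the first item, then keep
    # adding the item farthest from all centres chosen so far.
    centers = [list(embeddings[ids[0]])]
    while len(centers) < k:
        far = max(ids, key=lambda i: min(sq_dist(embeddings[i], c) for c in centers))
        centers.append(list(embeddings[far]))

    assign = {}
    for _ in range(iters):
        # Assignment step: each item goes to its nearest centre.
        assign = {
            i: min(range(k), key=lambda c: sq_dist(embeddings[i], centers[c]))
            for i in ids
        }
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [embeddings[i] for i in ids if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]

    return {i: f"<{prefix}{assign[i]}>" for i in ids}


# Made-up 2-D embeddings: two pairs of similar items.
toy_embeddings = {
    "2024": [0.0, 0.1],
    "2025": [0.1, 0.0],
    "31": [5.0, 5.1],
    "32": [5.1, 5.0],
}
tokens = cluster_tokens(toy_embeddings, k=2)
for item_id, token in tokens.items():
    print(item_id, token)
```

In this toy setup, items with nearby embeddings end up with the same token, so whatever the model learns for that token is shared across the cluster, while items in different clusters stay distinguishable.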

Abstract

Characterizing users and items through vector representations is crucial for various tasks in recommender systems. Recent approaches attempt to apply Large Language Models (LLMs) in recommendation through a question and answer format, where real users and items (e.g., Item No.2024) are represented with in-vocabulary tokens (e.g., "item", "20", "24"). However, since LLMs are typically pretrained on natural language tasks, these in-vocabulary tokens lack the expressive power for distinctive users and items, thereby weakening the recommendation ability even after fine-tuning on recommendation tasks. In this paper, we explore how to effectively tokenize users and items in LLM-based recommender systems. We emphasize the role of out-of-vocabulary (OOV) tokens in addition to the in-vocabulary ones and claim the memorization of OOV tokens that capture correlations of users/items as well as…
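The contrast the abstract draws between in-vocabulary and out-of-vocabulary tokens can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions, not the paper's tokenizer: the two-digit number splitting, the `<item_N>` token format, and the helper names `in_vocab_tokenize`, `oov_token`, and `extend_vocab` are all hypothetical.

```python
import re


def in_vocab_tokenize(text):
    """Toy in-vocabulary tokenizer: lowercase words, numbers cut into 2-digit chunks."""
    tokens = []
    for piece in re.findall(r"[A-Za-z]+|\d+", text):
        if piece.isdigit():
            tokens.extend(piece[i:i + 2] for i in range(0, len(piece), 2))
        else:
            tokens.append(piece.lower())
    return tokens


def oov_token(kind, entity_id):
    """Build one dedicated token string for a user or item (hypothetical format)."""
    return f"<{kind}_{entity_id}>"


def extend_vocab(vocab, new_tokens):
    """Return a copy of vocab with unseen tokens appended at the next free index.

    Assumes the existing indices are contiguous, starting at 0.
    """
    extended = dict(vocab)
    for tok in new_tokens:
        if tok not in extended:
            extended[tok] = len(extended)
    return extended


a = in_vocab_tokenize("item 2024")
b = in_vocab_tokenize("item 2025")
print(a, b)                      # the two items overlap in most of their pieces
print(sorted(set(a) & set(b)))   # pieces shared by two unrelated items

base_vocab = {"item": 0, "20": 1, "24": 2, "25": 3}
vocab = extend_vocab(base_vocab, [oov_token("item", 2024), oov_token("item", 2025)])
print(vocab)
```

With in-vocabulary pieces, two distinct items share most of their tokens purely because their ID numbers look alike. A dedicated OOV token gives each item its own vocabulary entry, and therefore its own embedding row to be learned.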

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies