Enhancing Item Tokenization for Generative Recommendation through Self-Improvement

Runjin Chen; Mingxuan Ju; Ngoc Bui; Dimosthenis Antypas; Stanley Cai; Xiaopeng Wu; Leonardo Neves; Zhangyang Wang; Neil Shah; Tong Zhao

arXiv:2412.17171·cs.LG·December 24, 2024

Open Access

TL;DR

This paper introduces a self-improving item tokenization method for generative recommendation systems using LLMs, which refines token representations during training to improve recommendation accuracy.

Contribution

We propose a novel self-improvement approach that allows LLMs to refine item tokenizations during training, aligning them better with the model's internal understanding.

Findings

01. Achieved an average improvement of 8% in recommendation performance.

02. Effective across multiple datasets and initial tokenization strategies.

03. Simple plug-and-play integration into existing systems.

Abstract

Generative recommendation systems, driven by large language models (LLMs), present an innovative approach to predicting user preferences by modeling items as token sequences and generating recommendations in a generative manner. A critical challenge in this approach is the effective tokenization of items, ensuring that they are represented in a form compatible with LLMs. Current item tokenization methods include using text descriptions, numerical strings, or sequences of discrete tokens. While text-based representations integrate seamlessly with LLM tokenization, they are often too lengthy, leading to inefficiencies and complicating accurate generation. Numerical strings, while concise, lack semantic depth and fail to capture meaningful item relationships. Tokenizing items as sequences of newly defined tokens has gained traction, but it often requires external models or algorithms for…
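
To make the three tokenization styles named in the abstract concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's method: the `Item` class, the example item, the whitespace split (standing in for a real LLM tokenizer), the `<a_3>`-style token format, and the hand-picked codes are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical catalog item used only for illustration."""
    item_id: int
    title: str


def text_tokens(item: Item) -> list[str]:
    # Text-based: reuse the item's description. A whitespace split stands in
    # for an LLM tokenizer, which would usually yield even more tokens.
    return item.title.split()


def numeric_tokens(item: Item) -> list[str]:
    # Numerical string: the digits of the ID. Short, but two items with
    # nearby IDs need not be related in any meaningful way.
    return list(str(item.item_id))


def discrete_tokens(codes: tuple[int, ...]) -> list[str]:
    # Newly defined tokens: one new vocabulary entry per level, e.g. <a_3>.
    # Here the codes are hand-picked; in practice they must be assigned somehow.
    levels = "abcdefgh"
    return [f"<{levels[i]}_{c}>" for i, c in enumerate(codes)]


item = Item(item_id=1042, title="Stainless steel insulated water bottle 500 ml")
print(text_tokens(item))           # long, but natively readable by an LLM
print(numeric_tokens(item))        # concise, no semantic content
print(discrete_tokens((3, 1, 7)))  # concise, meaning depends on code assignment
```

The sketch shows the trade-off the abstract describes: the text form is the longest sequence, the numeric form is short but carries no item semantics, and the discrete-token form is short while its usefulness depends entirely on how the codes are assigned.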

Taxonomy

Topics: Recommender Systems and Techniques · Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning

Methods: ALIGN