Enhancing Item Tokenization for Generative Recommendation through Self-Improvement
Runjin Chen, Mingxuan Ju, Ngoc Bui, Dimosthenis Antypas, Stanley Cai, Xiaopeng Wu, Leonardo Neves, Zhangyang Wang, Neil Shah, Tong Zhao

TL;DR
This paper introduces a self-improving item tokenization method for generative recommendation systems using LLMs, which refines token representations during training to improve recommendation accuracy.
Contribution
We propose a novel self-improvement approach that allows LLMs to refine item tokenizations during training, aligning them better with the model's internal understanding.
Findings
Achieves an average improvement of 8% in recommendation performance.
Effective across multiple datasets and initial tokenization strategies.
Simple plug-and-play integration into existing systems.
Abstract
Generative recommendation systems, driven by large language models (LLMs), present an innovative approach to predicting user preferences by modeling items as token sequences and generating recommendations as such sequences. A critical challenge in this approach is the effective tokenization of items, ensuring that they are represented in a form compatible with LLMs. Current item tokenization methods include using text descriptions, numerical strings, or sequences of discrete tokens. While text-based representations integrate seamlessly with LLM tokenization, they are often too lengthy, leading to inefficiencies and complicating accurate generation. Numerical strings, while concise, lack semantic depth and fail to capture meaningful item relationships. Tokenizing items as sequences of newly defined tokens has gained traction, but it often requires external models or algorithms for…
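The three tokenization families named in the abstract can be contrasted with a minimal Python sketch. Everything here is an illustrative assumption rather than the paper's method: the `Item` class, the function names, the whitespace split standing in for an LLM subword tokenizer, and especially the ID-derived codes in `discrete_tokenization`, which are only a placeholder for however such codes are actually assigned.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical catalog entry used only for illustration."""
    item_id: int
    title: str


def text_tokenization(item: Item) -> list[str]:
    # Text description: reuse the item's text. A whitespace split stands in
    # for a real subword tokenizer (assumption); sequences grow with title length.
    return item.title.split()


def numeric_tokenization(item: Item) -> list[str]:
    # Numerical string: the item ID spelled out digit by digit.
    # Concise, but the digits carry no item semantics.
    return list(str(item.item_id))


def discrete_tokenization(item: Item, codebook_sizes=(4, 4, 4)) -> list[str]:
    # Sequence of newly defined tokens: one new vocabulary entry per level.
    # Placeholder assignment (assumption): mixed-radix digits of the item ID.
    tokens = []
    remainder = item.item_id
    for level, size in enumerate(codebook_sizes):
        remainder, code = divmod(remainder, size)
        tokens.append(f"<L{level}_{code}>")
    return tokens


item = Item(item_id=27, title="Wireless Noise Cancelling Headphones Black")
text_tokens = text_tokenization(item)
numeric_tokens = numeric_tokenization(item)
discrete_tokens = discrete_tokenization(item)
```

The sketch only makes the trade-off concrete: the text form is the longest, the numeric form is short but opaque, and the discrete form has a fixed length at the cost of adding new tokens to the vocabulary.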
Taxonomy
Topics: Recommender Systems and Techniques · Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning
Methods: ALIGN
