A Simple Contrastive Framework Of Item Tokenization For Generative Recommendation
Penglong Zhai, Yifang Yuan, Fanyi Di, Jie Li, Yue Liu, Chen Li, Jie Huang, Sicong Wang, Yao Xu, Xin Li

TL;DR
SimCIT introduces a contrastive learning-based item tokenization method that effectively integrates multi-modal item information to improve large-scale generative recommendation systems, overcoming limitations of reconstruction-based approaches.
Contribution
The paper proposes a novel unsupervised contrastive quantization framework, SimCIT, that aligns multi-modal item features with semantic tokens for enhanced generative recommendation.
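As a rough illustration of what contrastive alignment between two views of the same items can look like, the sketch below computes a generic InfoNCE loss with NumPy. It is not the SimCIT objective: the function name `info_nce`, the temperature value, and the random "text" and "image" features are all assumptions made for this example.

```python
import numpy as np

def info_nce(a, b, temperature=0.1):
    """Generic InfoNCE loss: row i of `a` should match row i of `b`."""
    # L2-normalise so the dot product is a cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; other items in the batch act as negatives.
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
text_feats = rng.normal(size=(4, 8))   # hypothetical text-view features
image_feats = text_feats.copy()        # hypothetical perfectly aligned image view

aligned_loss = info_nce(text_feats, image_feats)
# Reversing the rows pairs every item with a different item.
mismatched_loss = info_nce(text_feats, image_feats[::-1])
```

The loss is lower when each item is paired with its own second view than when the pairing is scrambled, which is the signal that pushes representations of different items apart.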
Findings
Outperforms existing methods on public datasets.
Effective integration of multi-modal knowledge.
Scalable to industrial large-scale datasets.
Abstract
Generative retrieval-based recommendation has emerged as a promising paradigm that aims to directly generate the identifiers of target candidates. However, in large-scale recommendation systems, this approach becomes increasingly cumbersome due to the redundancy and sheer scale of the token space. To overcome these limitations, recent research has explored semantic tokens as an alternative to ID tokens, typically leveraging reconstruction-based strategies such as RQ-VAE to quantize content embeddings and significantly reduce the embedding size. However, reconstructive quantization aims to precisely reconstruct each item embedding independently, which conflicts with the goal of generative retrieval, which focuses more on differentiating among items. Moreover, multi-modal side information of items, such as descriptive text and images, geographical knowledge in…
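To make the reconstruction-based baseline concrete, the sketch below shows greedy residual quantization, the codebook lookup step that RQ-VAE-style tokenizers build on. It is a simplified illustration only: the codebooks are random rather than learned, there is no encoder or decoder, and the names `residual_quantize`, `codebooks`, and `item_embedding` are assumptions for this example.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Greedy residual quantization: one token per codebook level."""
    residual = x.copy()
    recon = np.zeros_like(x)
    tokens = []
    for codebook in codebooks:  # each codebook has shape (K, d)
        # Pick the codeword closest to what is still unexplained.
        dists = np.sum((codebook - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        tokens.append(idx)
        recon += codebook[idx]
        residual = residual - codebook[idx]
    return tokens, recon

rng = np.random.default_rng(0)
dim, levels, codebook_size = 8, 3, 16
# Hypothetical random codebooks; later levels are scaled down to model finer residuals.
codebooks = [rng.normal(size=(codebook_size, dim)) / (2 ** level)
             for level in range(levels)]
item_embedding = rng.normal(size=dim)

tokens, recon = residual_quantize(item_embedding, codebooks)
```

Each item is represented by a short tuple of codebook indices (its semantic tokens). Note that the lookup treats every embedding in isolation, which is the property the abstract contrasts with the need to differentiate among items.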
Taxonomy
Topics: Recommender Systems and Techniques · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
Methods: Contrastive Learning · ALIGN
