Learning ID-free Item Representation with Token Crossing for Multimodal Recommendation
Kangning Zhang, Jiarui Jin, Yingjie Qin, Ruilong Su, Jianghao Lin, Yong Yu, Weinan Zhang

TL;DR
This paper introduces MOTOR, an ID-free multimodal item representation scheme that uses token crossing and product quantization to improve recommendation performance and reduce space requirements in multimodal recommender systems.
Contribution
MOTOR replaces ID embeddings with learnable multimodal tokens and a token crossing network, enabling ID-free recommendation and better information exchange among items.
Findings
Significant performance improvements on nine models.
Reduced space requirements for item representations.
Enhanced information interaction among related items.
Abstract
Current multimodal recommendation models have extensively explored the effective utilization of multimodal information; however, their reliance on ID embeddings remains a performance bottleneck. Even with the assistance of multimodal information, optimizing ID embeddings remains challenging for ID-based multimodal recommenders when interaction data is sparse. Furthermore, because each ID embedding is unique to its item, it hinders information exchange among related items, and the storage required for ID embeddings grows with the number of items. To address these limitations, we propose an ID-free MultimOdal TOken Representation scheme named MOTOR that represents each item using learnable multimodal tokens and connects items through shared tokens. Specifically, we first employ product quantization to discretize each item's multimodal features (e.g., images, text) into discrete token IDs.…
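As a rough illustration of the product quantization step mentioned in the abstract, the sketch below splits each item's feature vector into subvectors, runs a small k-means per subspace, and uses the index of the nearest centroid as that subspace's token ID. This is a generic product quantization example, not the authors' implementation: the function names (`kmeans`, `pq_encode`), the sizes (4 subspaces, 8 codes each), and the random stand-in features are all assumptions made for illustration.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) array of centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid: shape (n, k).
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids


def pq_encode(features, n_subspaces=4, n_codes=8, seed=0):
    """Hypothetical helper: map (n_items, dim) features to (n_items, n_subspaces) token IDs."""
    n, dim = features.shape
    assert dim % n_subspaces == 0, "dim must be divisible by n_subspaces"
    sub_dim = dim // n_subspaces
    token_ids = np.empty((n, n_subspaces), dtype=np.int64)
    codebooks = []
    for m in range(n_subspaces):
        sub = features[:, m * sub_dim:(m + 1) * sub_dim]
        codebook = kmeans(sub, n_codes, seed=seed + m)
        dist = ((sub[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        token_ids[:, m] = dist.argmin(1)  # nearest centroid index = token ID
        codebooks.append(codebook)
    return token_ids, np.stack(codebooks)


# Random vectors stand in for real image or text features (assumption).
rng = np.random.default_rng(0)
item_features = rng.normal(size=(100, 16)).astype(np.float32)
token_ids, codebooks = pq_encode(item_features)
print(token_ids[:3])  # each row is one item's sequence of token IDs
```

In this sketch, two items that receive the same token ID in a subspace would look up the same token, which is one way to picture how shared tokens can connect related items. With 4 subspaces of 8 codes each, only 32 codebook entries are needed regardless of how many items exist.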
Taxonomy
Topics: Recommender Systems and Techniques · Topic Modeling · Text and Document Classification Technologies
