Learning ID-free Item Representation with Token Crossing for Multimodal Recommendation

Kangning Zhang; Jiarui Jin; Yingjie Qin; Ruilong Su; Jianghao Lin; Yong Yu; Weinan Zhang

arXiv:2410.19276·cs.IR·October 28, 2024

Open Access

TL;DR

This paper introduces MOTOR, an ID-free multimodal item representation scheme that uses token crossing and product quantization to improve recommendation performance and reduce space requirements in multimodal recommender systems.

Contribution

MOTOR replaces ID embeddings with learnable multimodal tokens and a token crossing network, enabling ID-free recommendation and better information exchange among items.

Findings

01

Significant performance improvements on nine models.

02

Reduced space requirements for item representations.

03

Enhanced information interaction among related items.
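The space and sharing findings can be illustrated with a rough sketch. Everything below is an assumption made for illustration: the sizes (`num_items`, `dim`, `num_subspaces`, `codebook_size`) are hypothetical and not taken from the paper, and mean pooling is a simple stand-in, not MOTOR's token crossing network. The sketch only compares the parameter count of a per-item ID table with that of shared token tables, and shows two items reusing the same token rows.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration (not from the paper).
num_items = 100_000      # catalogue size
dim = 64                 # embedding width
num_subspaces = 4        # tokens per item
codebook_size = 256      # distinct tokens per subspace

# ID-based: one learnable row per item, so parameters grow with the catalogue.
id_table_params = num_items * dim

# Token-based: rows are shared across items, so parameters are independent
# of the catalogue size (each item still stores num_subspaces integer IDs).
token_table_params = num_subspaces * codebook_size * dim

rng = np.random.default_rng(0)
token_table = rng.normal(scale=0.1, size=(num_subspaces, codebook_size, dim))

def item_embedding(token_ids):
    """Mean-pool one token row per subspace (a stand-in, not token crossing)."""
    rows = token_table[np.arange(num_subspaces), token_ids]  # (num_subspaces, dim)
    return rows.mean(axis=0)

# Two hypothetical items that agree on tokens in subspaces 0 and 2.
item_a = np.array([3, 17, 200, 42])
item_b = np.array([3, 99, 200, 7])
shared_tokens = int((item_a == item_b).sum())

print(id_table_params, token_table_params, shared_tokens)
```

Under these assumed sizes the shared token tables hold far fewer parameters than the ID table, and any gradient update to a shared token row affects every item that uses that token.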

Abstract

Current multimodal recommendation models have extensively explored how to use multimodal information effectively; however, their reliance on ID embeddings remains a performance bottleneck. Even with the help of multimodal information, ID embeddings remain difficult to optimize in ID-based multimodal recommenders when interaction data is sparse. Furthermore, because each ID embedding is specific to a single item, it hinders information exchange among related items, and the space required for ID embeddings grows with the number of items. To address these limitations, we propose an ID-free MultimOdal TOken Representation scheme named MOTOR that represents each item using learnable multimodal tokens and connects them through shared tokens. Specifically, we first employ product quantization to discretize each item's multimodal features (e.g., images, text) into discrete token IDs.…
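The product quantization step mentioned in the abstract can be sketched as follows. This is a minimal, generic version under stated assumptions, not the authors' implementation: the helper names (`kmeans`, `product_quantize`), the plain Lloyd's k-means, the random stand-in features, and all sizes are hypothetical. Each feature vector is split into equal sub-vectors, each subspace gets its own codebook, and the index of the nearest centroid in each subspace serves as a discrete token ID.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid: shape (n, k).
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, dists.argmin(1)

def product_quantize(features, num_subspaces, codebook_size):
    """Split features into sub-vectors and quantize each subspace separately."""
    n, dim = features.shape
    assert dim % num_subspaces == 0, "dim must divide evenly into subspaces"
    sub = features.reshape(n, num_subspaces, dim // num_subspaces)
    codebooks, codes = [], []
    for m in range(num_subspaces):
        centroids, assign = kmeans(sub[:, m, :], codebook_size, seed=m)
        codebooks.append(centroids)
        codes.append(assign)
    # codebooks: (num_subspaces, codebook_size, dim // num_subspaces)
    # codes:     (n, num_subspaces), one token ID per subspace per item
    return np.stack(codebooks), np.stack(codes, axis=1)

# Random vectors standing in for pre-extracted image features of 200 items.
rng = np.random.default_rng(0)
image_features = rng.normal(size=(200, 16)).astype(np.float32)
codebooks, token_ids = product_quantize(image_features, num_subspaces=4, codebook_size=8)
print(token_ids[:3])
```

Items whose sub-vectors fall in the same cluster receive the same token ID in that subspace, which is what allows related items to share tokens.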

Taxonomy

Topics: Recommender Systems and Techniques · Topic Modeling · Text and Document Classification Technologies