CAMMSR: Category-Guided Attentive Mixture of Experts for Multimodal Sequential Recommendation
Jinfeng Xu, Zheyu Chen, Shuo Yang, Jinze Li, Hewei Wang, Yijie Li, Jianheng Tang, Yunhuai Liu, Edith C. H. Ngai

TL;DR
CAMMSR is a novel multimodal recommendation model that adaptively fuses diverse item information using category guidance and contrastive learning, improving personalization and understanding of user preferences.
Contribution
It introduces a category-guided attentive mixture of experts and a modality swap contrastive learning task for dynamic, synergistic multimodal recommendation.
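The idea of letting an item's category steer how modality signals are mixed can be illustrated with a small sketch. This is not the paper's architecture, which the summary here does not specify. It is a minimal, hypothetical example that assumes two modalities (text and image), linear experts over their concatenation, and a softmax gate driven only by a category embedding. Every name in it (`category_guided_fusion`, `experts`, `gate`) is invented for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def category_guided_fusion(text_emb, image_emb, category_emb, experts, gate):
    """Fuse two modality embeddings with category-conditioned expert weights.

    Assumed shapes (hypothetical, not taken from the paper):
      text_emb, image_emb: length d
      category_emb:        length c
      experts:             list of (d x 2d) matrices, one linear expert each
      gate:                (n_experts x c) matrix
    """
    joint = text_emb + image_emb  # list concatenation, length 2d
    expert_outputs = [matvec(expert, joint) for expert in experts]
    # The gate sees only the category, so items in the same category
    # share the same mixing weights over experts.
    weights = softmax(matvec(gate, category_emb))
    dim = len(expert_outputs[0])
    fused = [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]
    return fused, weights


rng = random.Random(0)
d, c, n_experts = 4, 3, 2


def rand_vector(n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def rand_matrix(rows, cols):
    return [rand_vector(cols) for _ in range(rows)]


text_emb = rand_vector(d)
image_emb = rand_vector(d)
category_emb = rand_vector(c)
experts = [rand_matrix(d, 2 * d) for _ in range(n_experts)]
gate = rand_matrix(n_experts, c)

fused, weights = category_guided_fusion(
    text_emb, image_emb, category_emb, experts, gate
)
```

In this sketch the gate weights sum to one and depend only on the category embedding, which is the simplest way to express "category guidance"; a fuller model would presumably also condition on the user's interaction sequence.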
Findings
Outperforms state-of-the-art baselines on four datasets.
Effectively models inter-modal synergies and user preferences.
Enhances cross-modal representation alignment.
Abstract
The explosion of multimedia data in information-rich environments has intensified the challenges of personalized content discovery, positioning recommendation systems as an essential form of passive data management. Multimodal sequential recommendation, which leverages diverse item information such as text and images, has shown great promise in enriching item representations and deepening the understanding of user interests. However, most existing models rely on heuristic fusion strategies that fail to capture the dynamic and context-sensitive nature of user-modal interactions. In real-world scenarios, user preferences for modalities vary not only across individuals but also within the same user across different items or categories. Moreover, the synergistic effects between modalities (where combined signals trigger user interest in ways isolated modalities cannot) remain largely…
Taxonomy
Topics: Multimodal Machine Learning Applications · Recommender Systems and Techniques · Mobile Crowdsensing and Crowdsourcing
