M^2VAE: Multi-Modal Multi-View Variational Autoencoder for Cold-start Item Recommendation
Chuan He, Yongchao Liu, Qiang Li, Wenliang Zhong, Chuntao Hong, Xinwei Yao

TL;DR
M^2VAE is a novel generative model that leverages multi-modal and multi-view features to improve cold-start item recommendation by disentangling shared and unique information and modeling user preferences adaptively.
Contribution
It introduces a multi-view variational autoencoder that explicitly models common and modality-specific features, incorporating contrastive learning and a preference-guided mixture of experts for enhanced cold-start recommendations.
Findings
Outperforms existing methods on real-world datasets.
Effectively disentangles shared and unique feature representations.
Uses contrastive learning to eliminate the need for pretraining.
Abstract
Cold-start item recommendation is a significant challenge in recommendation systems, particularly when new items are introduced without any historical interaction data. While existing methods leverage multi-modal content to alleviate the cold-start issue, they often neglect the inherent multi-view structure of modalities, that is, the distinction between shared and modality-specific features. In this paper, we propose the Multi-Modal Multi-View Variational AutoEncoder (M^2VAE), a generative model that addresses the challenges of modeling common and unique views in attribute and multi-modal features, as well as user preferences over single-typed item features. Specifically, we generate type-specific latent variables for item IDs, categorical attributes, and image features, and use Product-of-Experts (PoE) to derive a common representation. A disentangled contrastive loss decouples the common view…
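To illustrate the Product-of-Experts step mentioned in the abstract, here is a minimal sketch, assuming each type-specific encoder outputs a diagonal Gaussian (a mean and a variance per latent dimension). The function name `product_of_gaussian_experts`, the optional standard-normal prior expert, and the toy numbers are assumptions made for illustration, not details taken from the paper. For Gaussian experts, the product is again Gaussian: its precision is the sum of the expert precisions, and its mean is the precision-weighted average of the expert means.

```python
def product_of_gaussian_experts(means, variances, include_prior=True):
    """Fuse diagonal Gaussian experts into a single diagonal Gaussian.

    means, variances: one list of per-dimension values for each expert.
    include_prior: if True, also multiply in a standard normal expert
    (mean 0, variance 1). This is an assumption of the sketch.
    """
    dim = len(means[0])
    # A standard normal prior contributes precision 1 and mean 0.
    precision_sum = [1.0 if include_prior else 0.0] * dim
    weighted_mean_sum = [0.0] * dim

    for mu, var in zip(means, variances):
        for d in range(dim):
            precision = 1.0 / var[d]
            precision_sum[d] += precision
            weighted_mean_sum[d] += precision * mu[d]

    joint_var = [1.0 / p for p in precision_sum]
    joint_mean = [w * v for w, v in zip(weighted_mean_sum, joint_var)]
    return joint_mean, joint_var


# Hypothetical 2-dimensional latents for the three feature types
# named in the abstract: item ID, categorical attributes, image.
id_mu, id_var = [0.5, -1.0], [1.0, 0.5]
attr_mu, attr_var = [0.2, -0.8], [0.5, 0.5]
img_mu, img_var = [1.0, 0.0], [2.0, 1.0]

common_mu, common_var = product_of_gaussian_experts(
    [id_mu, attr_mu, img_mu], [id_var, attr_var, img_var]
)
```

Because precisions add, a confident expert (small variance) pulls the common representation toward its own mean, and an uninformative expert has little effect. Under this sketch, a cold-start item with no interaction history could drop the ID expert from the input lists and still obtain a common representation from the remaining experts.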
Taxonomy
Topics: Recommender Systems and Techniques · Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications
