TL;DR
This paper introduces a multi-view GCN approach for multimedia recommendation that effectively reduces modality noise, models user preferences adaptively, and enhances feature discriminability through separate views and a self-supervised auxiliary task.
Contribution
It proposes a novel multi-view GCN model with modality purification, behavior-aware feature fusion, and a self-supervised task to improve multimedia recommendation accuracy.
Findings
Outperforms existing methods on three public datasets.
Effectively reduces modality noise contamination.
Enhances user preference modeling through adaptive fusion.
Abstract
Multimedia recommendation has received much attention in recent years. It models user preferences based on both behavior information and item multimodal information. Though current GCN-based methods achieve notable success, they suffer from two limitations: (1) Modality noise contamination of the item representations. Existing methods often mix modality features and behavior features in a single view (e.g., user-item view) for propagation, so the noise in the modality features may be amplified and coupled with behavior features, which ultimately leads to poor feature discriminability; (2) Incomplete user preference modeling caused by equal treatment of modality features. Users often exhibit distinct modality preferences when purchasing different items. Fusing all modality features equally ignores the relative importance of the different modalities, leading to the suboptimal user preference…
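To make the second limitation concrete, the sketch below contrasts equal fusion of modality features with a behavior-conditioned weighted fusion. It is an illustration only, not the paper's model: the function names (`equal_fusion`, `behavior_aware_fusion`), the dot-product scoring, the softmax weighting, and the toy vectors are all assumptions chosen for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def equal_fusion(modality_feats):
    """Baseline: average all modality vectors with identical weights."""
    n = len(modality_feats)
    dim = len(modality_feats[0])
    return [sum(f[i] for f in modality_feats) / n for i in range(dim)]


def behavior_aware_fusion(behavior_feat, modality_feats):
    """Weight each modality by its agreement with the behavior feature.

    Assumption: agreement is a plain dot product turned into weights by a
    softmax. The paper's actual fusion module may differ.
    """
    weights = softmax([dot(behavior_feat, f) for f in modality_feats])
    dim = len(modality_feats[0])
    fused = [
        sum(w * f[i] for w, f in zip(weights, modality_feats))
        for i in range(dim)
    ]
    return fused, weights


# Toy example (hypothetical values): the behavior feature aligns with the
# visual modality, so the visual vector should receive the larger weight.
behavior = [1.0, 0.0]
visual = [2.0, 0.0]
textual = [0.0, 2.0]

baseline = equal_fusion([visual, textual])
fused, weights = behavior_aware_fusion(behavior, [visual, textual])
```

With equal fusion both modalities contribute the same amount regardless of the user's behavior, whereas the weighted variant shifts the fused vector toward the modality that agrees with the behavior signal.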
