Mining Stable Preferences: Adaptive Modality Decorrelation for Multimedia Recommendation
Jinghao Zhang, Qiang Liu, Shu Wu, Liang Wang

TL;DR
This paper introduces MODEST, a framework that decorrelates multiple modalities in multimedia data to learn stable user preferences, improving recommendation robustness across data shifts.
Contribution
We propose a novel decorrelation-based learning framework, MODEST, which enhances multimedia recommendation models by reducing spurious correlations among modalities.
Findings
Significant performance improvements on four datasets.
Effective decorrelation of modalities using the Hilbert-Schmidt Independence Criterion (HSIC).
Compatibility with various recommendation backbones.
Abstract
Multimedia content predominates in the modern Web era. In real scenarios, multiple modalities reveal different aspects of item attributes and usually carry different importance for user purchase decisions. However, it is difficult for models to identify users' true preferences towards different modalities, because the modalities are strongly statistically correlated. Even worse, this strong correlation might mislead models into learning spurious preferences towards inconsequential modalities. As a result, when the distribution of the data (modal features) shifts, the learned spurious preferences are not guaranteed to be as effective on the inference set as on the training set. We propose a novel MOdality DEcorrelating STable learning framework, MODEST for brevity, to learn users' stable preferences. Inspired by sample re-weighting techniques, the proposed method aims to…
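To make the dependence measure named in the findings concrete, here is a minimal sketch of the standard biased empirical HSIC estimator between two sets of modality features. It is not the paper's implementation: the Gaussian kernel, the bandwidth `sigma`, the function names, and the synthetic "visual" and "textual" features are all assumptions chosen for illustration, and the sample re-weighting step that the abstract mentions is not shown.

```python
import numpy as np


def gaussian_kernel(x, sigma=2.0):
    """Gaussian (RBF) kernel matrix for rows of x, shape (n, d). Kernel choice is an assumption."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(x, y, sigma=2.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2, with H the centering matrix."""
    n = x.shape[0]
    K = gaussian_kernel(x, sigma)
    L = gaussian_kernel(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


# Synthetic stand-ins for two modalities of the same 200 items (hypothetical data).
rng = np.random.default_rng(0)
visual = rng.normal(size=(200, 4))
text_correlated = visual + 0.1 * rng.normal(size=(200, 4))  # strongly tied to visual
text_independent = rng.normal(size=(200, 4))                # unrelated to visual

hsic_correlated = hsic(visual, text_correlated)
hsic_independent = hsic(visual, text_independent)
print(hsic_correlated, hsic_independent)
```

A larger HSIC value indicates stronger statistical dependence between the two feature sets, so the correlated pair should score higher than the independent pair. A decorrelation objective in this spirit would try to drive the score between modalities towards zero.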
Taxonomy
Topics: Recommender Systems and Techniques · Image and Video Quality Assessment · Sentiment Analysis and Opinion Mining
