TL;DR
This paper introduces GTC, a user-aware generative framework for multi-modal recommendation that filters content features based on user preferences and models cross-modal dependencies to improve recommendation accuracy.
Contribution
GTC employs a user-guided diffusion model for personalized content filtering and optimizes total correlation to capture comprehensive cross-modal dependencies, advancing multi-modal recommendation methods.
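For reference, total correlation has a standard information-theoretic definition, which the summary above does not spell out. The symbols below are assumptions for illustration: X_1, ..., X_M stand for M modality representations (e.g., visual, textual, interaction-based). How GTC estimates or bounds this quantity, and how it conditions on the user, is not stated in this summary.

```latex
\mathrm{TC}(X_1, \dots, X_M)
  = \sum_{i=1}^{M} H(X_i) - H(X_1, \dots, X_M)
  = D_{\mathrm{KL}}\!\left( p(x_1, \dots, x_M) \,\Big\|\, \prod_{i=1}^{M} p(x_i) \right)
```

For M = 2 this reduces to the mutual information I(X_1; X_2). For M > 2 it measures the dependence shared across all modalities jointly, which is why it can capture dependencies that a sum of separate pairwise objectives does not target directly.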
Findings
GTC outperforms state-of-the-art methods, with up to a 28.30% improvement in NDCG@5.
Ablation studies confirm the effectiveness of user-aware filtering and total correlation optimization.
Experiments demonstrate GTC's ability to model user-conditional relationships in multi-modal recommendation (MMR).
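As a reminder of what the NDCG@5 figure above measures, here is a minimal sketch of NDCG@k for a single user. It assumes binary relevance (an item is either in the user's held-out set or not) and the common log2 discount; the paper's exact evaluation protocol is not given in this summary, and the function and variable names are hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions (rank 0 is the top)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_items, relevant_items, k=5):
    """NDCG@k for one user, assuming binary relevance (an illustrative assumption)."""
    # Gain of 1 for each recommended item that the user actually interacted with.
    gains = [1.0 if item in relevant_items else 0.0 for item in ranked_items]
    # Ideal ranking: all relevant items placed first, truncated at k.
    ideal_gains = [1.0] * min(len(relevant_items), k)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical example: one relevant item, ranked second by the recommender.
score = ndcg_at_k(["item_x", "item_a", "item_y"], {"item_a"}, k=5)
```

A dataset-level NDCG@5 is typically the mean of this per-user score over all test users.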
Abstract
Multi-modal recommendation (MMR) enriches item representations by introducing item content, e.g., visual and textual descriptions, to improve upon interaction-only recommenders. The success of MMR hinges on aligning these content modalities with user preferences derived from interaction data, yet the dominant practices, which disentangle modality-invariant, preference-driving signals from modality-specific, preference-irrelevant noise, are flawed. First, they assume a one-size-fits-all relevance of item content to user preferences for all users, which contradicts the user-conditional nature of preferences. Second, they optimize pairwise contrastive losses separately toward cross-modal alignment, systematically ignoring the higher-order dependencies that arise when multiple content modalities jointly influence user choices. In this paper, we introduce GTC, a conditional Generative Total…
