CLEAR: Null-Space Projection for Cross-Modal De-Redundancy in Multimodal Recommendation
Hao Zhan, Yihui Wang, Yonghui Yang, Danyang Yue, Yu Wang, Pengyang Shao, Fei Shen, Fei Liu, Le Wu

TL;DR
CLEAR reduces cross-modal redundancy in multimodal recommendation by projecting modality features onto the null space of the redundant shared subspace, improving recommendation accuracy without modifying existing model architectures.
Contribution
The paper introduces a lightweight, plug-and-play approach called CLEAR that explicitly characterizes and suppresses redundant shared components across modalities in recommendation systems.
Findings
Consistently improves recommendation performance across benchmarks.
Effectively reduces cross-modal redundancy without modifying existing architectures.
Enhances utilization of complementary multimodal information.
Abstract
Multimodal recommendation has emerged as an effective paradigm for enhancing collaborative filtering by incorporating heterogeneous content modalities. Existing multimodal recommenders predominantly focus on reinforcing cross-modal consistency to facilitate multimodal fusion. However, we observe that multimodal representations often exhibit substantial cross-modal redundancy, where dominant shared components overlap across modalities. Such redundancy can limit the effective utilization of complementary information, explaining why incorporating additional modalities does not always yield performance improvements. In this work, we propose CLEAR, a lightweight and plug-and-play cross-modal de-redundancy approach for multimodal recommendation. Rather than enforcing stronger cross-modal alignment, CLEAR explicitly characterizes the redundant shared subspace across modalities by modeling…
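To make the idea of null-space projection concrete, here is a minimal NumPy sketch. The abstract is cut off before it says how the shared subspace is modeled, so everything specific here is an assumption for illustration rather than the authors' procedure: the shared directions are estimated from the SVD of the visual-textual cross-covariance, the top `k` directions are treated as redundant, and the names `cross_cov`, `null_space_projectors`, and `k` are hypothetical.

```python
import numpy as np


def cross_cov(a, b):
    """Cross-covariance between two feature matrices with the same number of rows."""
    a_c = a - a.mean(axis=0, keepdims=True)
    b_c = b - b.mean(axis=0, keepdims=True)
    return a_c.T @ b_c / (a.shape[0] - 1)


def null_space_projectors(visual, textual, k):
    """Build projectors that remove the top-k shared directions.

    Assumption: the redundant subspace is spanned by the leading singular
    vectors of the cross-covariance. The source does not confirm this choice.
    """
    cov = cross_cov(visual, textual)              # shape (d_v, d_t)
    u, _, wt = np.linalg.svd(cov, full_matrices=False)
    u_k = u[:, :k]                                # shared directions, visual side
    w_k = wt[:k].T                                # shared directions, textual side
    # I - Q Q^T projects onto the orthogonal complement (null space) of span(Q).
    p_v = np.eye(visual.shape[1]) - u_k @ u_k.T
    p_t = np.eye(textual.shape[1]) - w_k @ w_k.T
    return p_v, p_t
```

A possible usage, again with hypothetical shapes: given item features `visual` of shape `(n_items, d_v)` and `textual` of shape `(n_items, d_t)`, compute `p_v, p_t = null_space_projectors(visual, textual, k=2)` and pass `visual @ p_v` and `textual @ p_t` to the unchanged downstream recommender. Because the projection is applied to the features only, the recommender itself needs no modification, which is consistent with the plug-and-play claim.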
Taxonomy
Topics: Recommender Systems and Techniques · Multimodal Machine Learning Applications · Emotion and Mood Recognition
