EGRA: Toward Enhanced Behavior Graphs and Representation Alignment for Multimodal Recommendation
Xiaoxiong Zhang, Xin Zhou, Zhiwei Zeng, Yongjie Wang, Dusit Niyato, Zhiqi Shen

TL;DR
EGRA improves multimodal recommendation by building a more robust behavior graph from pretrained model representations and by introducing a dynamic alignment mechanism that adapts during training; it outperforms recent methods on five datasets.
Contribution
EGRA makes two contributions: a graph construction method that uses pretrained model representations, and a bi-level dynamic alignment weighting mechanism that improves modality-behavior alignment.
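To make the graph construction idea concrete, here is a minimal sketch of one common way to derive item-item links from item representations: connect each item to its k most cosine-similar neighbours. This is a generic illustration, not the authors' procedure. The function names `cosine` and `knn_item_graph`, the choice of cosine similarity, and the parameter `k` are all assumptions, and the embeddings are taken to come from some pretrained model whose details are not given here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def knn_item_graph(item_emb, k):
    """Map each item index to the indices of its k most similar other items.

    item_emb: list of embedding vectors, assumed to come from a pretrained model.
    k: number of neighbours kept per item (hypothetical hyperparameter).
    """
    neighbors = {}
    for i, u in enumerate(item_emb):
        # Score every other item; self-links are excluded.
        scored = [(cosine(u, v), j) for j, v in enumerate(item_emb) if j != i]
        # Highest similarity first; ties broken by the smaller item index.
        scored.sort(key=lambda s: (-s[0], s[1]))
        neighbors[i] = [j for _, j in scored[:k]]
    return neighbors
```

Calling `knn_item_graph(embeddings, k=10)` on a list of item vectors returns an adjacency list that could then be merged with the user-item behavior graph. How EGRA actually builds, weights, or prunes these links is not specified in this summary.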
Findings
EGRA outperforms recent methods on five datasets.
The dynamic alignment mechanism improves modality-behavior alignment.
Graph robustness is enhanced by using pretrained model representations.
Abstract
MultiModal Recommendation (MMR) systems have emerged as a promising solution for improving recommendation quality by leveraging rich item-side modality information, prompting a surge of diverse methods. Despite these advances, existing methods still face two critical limitations. First, they use raw modality features to construct item-item links for enriching the behavior graph, while giving limited attention to balancing collaborative and modality-aware semantics or mitigating modality noise in the process. Second, they use a uniform alignment weight across all entities and also maintain a fixed alignment strength throughout training, limiting the effectiveness of modality-behavior alignment. To address these challenges, we propose EGRA. First, instead of relying on raw modality features, it alleviates sparsity by incorporating into the behavior graph an item-item graph built from…
