Attention-guided Multi-step Fusion: A Hierarchical Fusion Network for Multimodal Recommendation
Yan Zhou, Jie Guo, Hao Sun, Bin Song, and Fei Richard Yu

TL;DR
This paper introduces TMFUN, a hierarchical multimodal recommendation model that leverages attention-guided multi-step fusion and contrastive learning to better utilize semantic relations in multimodal data, improving recommendation accuracy.
Contribution
The paper proposes a novel hierarchical fusion network with attention-guided multi-step fusion and graph-based modeling to effectively incorporate semantic relations in multimodal recommendation.
Findings
TMFUN outperforms state-of-the-art models on three real-world datasets.
The model effectively captures semantic relations via graph construction.
Attention-guided fusion improves the integration of multimodal features.
Abstract
The main idea of multimodal recommendation is the rational utilization of the item's multimodal information to improve the recommendation performance. Previous works directly integrate item multimodal features with item ID embeddings, ignoring the inherent semantic relations contained in the multimodal features. In this paper, we propose a novel and effective aTtention-guided Multi-step FUsion Network for multimodal recommendation, named TMFUN. Specifically, our model first constructs a modality feature graph and an item feature graph to model the latent item-item semantic structures. Then, we use the attention module to identify inherent connections between user-item interaction data and multimodal data, evaluate the impact of multimodal data on different interactions, and achieve early-step fusion of item features. Furthermore, our model optimizes item representation through the…
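The two steps the abstract names, building an item-item graph from modality features and attention-weighted fusion of item features, can be illustrated with a small sketch. The abstract does not say how the graphs are built or what form the attention takes, so everything here is an assumption made for illustration: cosine-similarity k-nearest-neighbour edges, dot-product scores passed through a softmax, and the helper names `knn_graph` and `attention_fuse`. It is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def knn_graph(features, k):
    """Link each item to its k most similar other items.

    Assumption: kNN over cosine similarity stands in for the paper's
    (unspecified here) graph construction.
    """
    n = len(features)
    graph = {}
    for i in range(n):
        sims = [(cosine(features[i], features[j]), j) for j in range(n) if j != i]
        # Highest similarity first; ties broken by item index.
        sims.sort(key=lambda t: (-t[0], t[1]))
        graph[i] = [j for _, j in sims[:k]]
    return graph


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(id_emb, modality_embs):
    """Fuse modality embeddings, weighted by their affinity to the ID embedding.

    Assumption: dot-product attention is a stand-in for the paper's
    attention module.
    """
    scores = [sum(a * b for a, b in zip(id_emb, m)) for m in modality_embs]
    weights = softmax(scores)
    fused = [
        sum(w * m[d] for w, m in zip(weights, modality_embs))
        for d in range(len(id_emb))
    ]
    return fused, weights


# Toy data: three items with 2-d "visual" features.
visual = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
graph = knn_graph(visual, k=1)

# One item with an ID embedding and two modality embeddings.
fused, weights = attention_fuse([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy run, items 0 and 1 link to each other because their features nearly coincide, and the modality embedding aligned with the ID embedding receives the larger attention weight.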
Taxonomy
Topics: Recommender Systems and Techniques · Topic Modeling · Advanced Graph Neural Networks
Methods: Contrastive Learning
