Structurally Refined Graph Transformer for Multimodal Recommendation
Ke Shi, Yan Zhang, Miao Zhang, Lifan Chen, Jiali Yi, Kui Xiao, Xiaoju Hou, and Zhifei Li

TL;DR
This paper introduces SRGFormer, a novel multimodal recommendation model that leverages hypergraph structures and self-supervised learning to better capture user preferences and improve recommendation accuracy.
Contribution
The paper proposes a structurally refined transformer model that integrates multimodal data into hypergraphs and employs self-supervised tasks to enhance recommendation performance.
Findings
SRGFormer outperforms previous models by 4.47% on the Sports dataset.
Embedding multimodal information into hypergraphs improves local structure learning.
Self-supervised tasks enhance the integration of user-item collaborative signals.
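For readers unfamiliar with hypergraphs, the sketch below illustrates the general idea of embedding item modality features into a hypergraph and aggregating over it. It is not SRGFormer's implementation: the k-nearest-neighbour construction, the mean aggregation, and the names knn_hypergraph and hypergraph_conv are all assumptions made for illustration.

```python
import numpy as np

def knn_hypergraph(features, k):
    """Build an incidence matrix H of shape (n_items, n_items).

    Hypothetical construction: hyperedge j contains item j plus its k
    nearest neighbours by cosine similarity of the modality features.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self when ranking neighbours
    n = features.shape[0]
    H = np.zeros((n, n))
    for j in range(n):
        neighbours = np.argsort(-sim[j])[:k]
        H[neighbours, j] = 1.0
        H[j, j] = 1.0  # every item belongs to its own hyperedge
    return H

def hypergraph_conv(H, X):
    """One parameter-free propagation step: items -> hyperedges -> items."""
    edge_deg = H.sum(axis=0)  # number of items in each hyperedge
    node_deg = H.sum(axis=1)  # number of hyperedges containing each item
    edge_msg = (H.T @ X) / edge_deg[:, None]   # mean over each hyperedge's members
    return (H @ edge_msg) / node_deg[:, None]  # mean over each item's hyperedges

# Toy usage: 6 items with 4-dimensional (e.g. image or text) features.
rng = np.random.default_rng(0)
item_features = rng.normal(size=(6, 4))
H = knn_hypergraph(item_features, k=2)
smoothed = hypergraph_conv(H, item_features)
```

Because both steps are plain averages, each item's output mixes its own features with those of items that share a hyperedge with it, which is one simple way a model can pick up local structure from modality similarity. A trained model would add learnable weights and nonlinearities on top of this.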
Abstract
Multimodal recommendation systems utilize various types of information, including images and text, to enhance the effectiveness of recommendations. The key challenge is predicting user purchasing behavior from the available data. Current recommendation models prioritize extracting multimodal information while neglecting the distinction between redundant and valuable data. They also rely heavily on a single semantic framework (e.g., local or global semantics), resulting in an incomplete or biased representation of user preferences, particularly those less expressed in prior interactions. Furthermore, these approaches fail to capture the complex interactions between users and items, limiting the model's ability to meet the needs of diverse users. To address these challenges, we present SRGFormer, a structurally optimized multimodal recommendation model. By modifying the transformer for better…
Taxonomy
Topics: Recommender Systems and Techniques · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
