Attention-based Multimodal Feature Representation Model for Micro-video Recommendation
Mohan Hasama, Jing Li

TL;DR
This paper introduces an attention-based multimodal feature representation model for micro-video recommendation that leverages self-attention to capture feature correlations and importance, improving recommendation accuracy.
Contribution
It proposes a novel model that combines multi-head self-attention with external cross-representation learning to better capture feature relationships in micro-video recommendation.
Findings
Enhanced feature correlation modeling improves recommendation accuracy
Self-attention mechanism effectively captures internal feature importance
External cross-representation enriches feature descriptions
Abstract
Most recommender-system models combine embedding layers with a multilayer feedforward neural network. The embedding layer reduces the dimensionality of the high-dimensional sparse input features, and the result is fed into a fully connected network to obtain predictions. These methods have an obvious problem: the input features are treated as independent of one another, whereas in fact features are internally correlated, and different features carry different importance for the recommendation. To address this, the paper adopts a self-attention mechanism to mine the internal correlations between features as well as their relative importance. In recent years, the self-attention mechanism, a special form of attention mechanism, has been favored by many researchers. The self-attention mechanism captures the internal correlation of…
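To make the idea in the abstract concrete, here is a minimal pure-Python sketch of multi-head scaled dot-product self-attention applied to a set of feature-field embeddings. It is a generic illustration, not the authors' model: the abstract does not give the architecture, so the dimensions, the random projection matrices (stand-ins for learned weights), the five example fields, and all function names are assumptions. The external cross-representation component is not covered because the summary does not describe it.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Random projection matrix; a stand-in for learned weights (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def self_attention_head(fields, w_q, w_k, w_v):
    """One head of scaled dot-product self-attention over field embeddings.

    Returns (outputs, weights), where weights[i][j] is how strongly
    field i attends to field j. Each row of weights sums to 1.
    """
    queries = [matvec(w_q, f) for f in fields]
    keys = [matvec(w_k, f) for f in fields]
    values = [matvec(w_v, f) for f in fields]
    scale = math.sqrt(len(queries[0]))
    head_dim = len(values[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted sum of value vectors: the field's context-aware representation.
        outputs.append(
            [sum(attn[j] * values[j][d] for j in range(len(values)))
             for d in range(head_dim)]
        )
    return outputs, weights


def multi_head_self_attention(fields, heads):
    """Run every head and concatenate the per-field outputs."""
    per_head = [self_attention_head(fields, *h)[0] for h in heads]
    return [
        sum((head_out[i] for head_out in per_head), [])
        for i in range(len(fields))
    ]


rng = random.Random(0)
embed_dim, head_dim, num_heads = 8, 4, 2  # illustrative sizes only

# Hypothetical field embeddings, e.g. user ID, video ID, visual, audio, text.
fields = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(5)]

# Each head holds its own (W_q, W_k, W_v) projection matrices.
heads = [
    tuple(random_matrix(head_dim, embed_dim, rng) for _ in range(3))
    for _ in range(num_heads)
]

attended = multi_head_self_attention(fields, heads)
print(len(attended), len(attended[0]))
```

In a full model of the kind the abstract describes, the attended field vectors would typically be flattened and passed to the fully connected network in place of the raw embeddings. The attention weights returned by `self_attention_head` can be read as a rough indication of which fields each field relies on.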
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Technologies in Various Fields · Image Retrieval and Classification Techniques
