Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks
Jun Xiao, Hao Ye, Xiangnan He, Hanwang Zhang, Fei Wu, Tat-Seng Chua

TL;DR
This paper introduces Attentional Factorization Machines (AFM), a model that learns the importance of feature interactions using attention networks, improving prediction accuracy over traditional FMs and deep learning methods.
Contribution
The paper proposes AFM, a model that discriminates the importance of feature interactions via neural attention, improving on FM while keeping a simpler structure and fewer parameters than deep models such as Wide&Deep and DeepCross.
Findings
On a regression task, AFM outperforms FM with an 8.6% relative improvement.
AFM surpasses state-of-the-art deep models like Wide&Deep and DeepCross.
AFM achieves these results with fewer parameters and a simpler architecture than those deep models.
Abstract
Factorization Machines (FMs) are a supervised learning approach that enhances the linear regression model by incorporating second-order feature interactions. Despite its effectiveness, FM can be hindered by modelling all feature interactions with the same weight, as not all feature interactions are equally useful and predictive. For example, interactions with useless features may even introduce noise and degrade performance. In this work, we improve FM by discriminating the importance of different feature interactions. We propose a novel model named Attentional Factorization Machine (AFM), which learns the importance of each feature interaction from data via a neural attention network. Extensive experiments on two real-world datasets demonstrate the effectiveness of AFM. Empirically, it is shown that on a regression task AFM betters FM with a relative…
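The idea in the abstract, an FM whose pairwise interactions are reweighted by a small attention network, can be illustrated with a minimal NumPy sketch of a single forward pass. The abstract does not spell out the architecture, so the details here are assumptions for illustration: interactions are taken as element-wise products of embedding vectors, attention is a one-hidden-layer ReLU network followed by a softmax over the pairs of non-zero features, and the names `afm_predict`, `fm_predict`, `W`, `b`, `h`, and `p` are hypothetical. It is not the authors' implementation and includes no training loop.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Plain second-order FM: every pairwise interaction has the same weight."""
    XV = V * x[:, None]                       # (n, k): embeddings scaled by feature values
    # sum_{i<j} <v_i, v_j> x_i x_j, computed with the usual FM identity
    pair = 0.5 * np.sum(XV.sum(axis=0) ** 2 - (XV ** 2).sum(axis=0))
    return w0 + w @ x + pair

def afm_predict(x, w0, w, V, W, b, h, p):
    """Attention-weighted FM forward pass (assumed architecture, see lead-in).

    x: (n,) features, V: (n, k) embeddings,
    W: (k, t), b: (t,), h: (t,) attention network, p: (k,) output projection.
    Returns the prediction and the attention weight of each pair.
    """
    linear = w0 + w @ x
    nz = np.flatnonzero(x)                    # only non-zero features interact
    if nz.size < 2:
        return linear, np.zeros(0)
    XV = V[nz] * x[nz, None]
    i, j = np.triu_indices(nz.size, k=1)      # all pairs i < j
    inter = XV[i] * XV[j]                     # (pairs, k): (v_i * v_j) x_i x_j
    scores = np.maximum(inter @ W + b, 0.0) @ h   # one ReLU layer, then projection
    scores = scores - scores.max()            # stabilise the softmax
    att = np.exp(scores)
    att = att / att.sum()                     # importance of each interaction
    pooled = (att[:, None] * inter).sum(axis=0)   # attention-weighted sum, shape (k,)
    return linear + p @ pooled, att

rng = np.random.default_rng(0)
n, k, t = 6, 4, 3
x = np.array([1.0, 0.0, 2.0, 0.0, 0.5, 1.0])
w0, w, V = 0.1, rng.normal(size=n), rng.normal(size=(n, k))
W, b, h, p = rng.normal(size=(k, t)), rng.normal(size=t), rng.normal(size=t), rng.normal(size=k)
y_hat, att = afm_predict(x, w0, w, V, W, b, h, p)
```

In this sketch, setting `h` to zeros makes the attention uniform, and choosing `p` as a vector of ones scaled by the number of pairs recovers the plain FM prediction, which makes the relationship between the two models concrete: AFM differs from FM only in letting the data decide how much each interaction counts.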
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Wide&Deep · Linear Regression
