Sequential Recommendation via Adaptive Robust Attention with Multi-dimensional Embeddings
Linsey Pang, Amir Hossein Raffiee, Wei Liu, Keld Lundgaard

TL;DR
This paper introduces ADRRec, a sequential recommendation model that aims to improve robustness and accuracy by combining adaptive robust attention, multi-dimensional embeddings, and layer-wise noise injection (LNI). The authors report that it outperforms existing self-attention architectures.
Contribution
It proposes a novel adaptive robust attention mechanism with multi-dimensional embeddings and layer-wise noise injection for improved sequential recommendation.
Findings
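Outperforms existing self-attention sequential recommendation models in next-item prediction accuracy.
Improves robustness and generalization through a mix-attention mechanism with layer-wise noise injection (LNI) regularization.
Supports these results with what the authors describe as extensive experiments.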
Abstract
Sequential recommendation models have achieved state-of-the-art performance using self-attention mechanisms. It has since been found that moving beyond item ID and positional embeddings alone leads to a significant accuracy boost when predicting the next item. Recent literature reports that a multi-dimensional kernel embedding with temporal contextual kernels, which captures users' diverse behavioral patterns, yields a substantial performance improvement. In this study, we further improve the sequential recommender model's robustness and generalization by introducing a mix-attention mechanism with layer-wise noise injection (LNI) regularization. We refer to our proposed model as the adaptive robust sequential recommendation framework (ADRRec), and demonstrate through extensive experiments that it outperforms existing self-attention architectures.
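The abstract names layer-wise noise injection but does not say how the noise is applied. The sketch below illustrates one plausible reading, under stated assumptions: zero-mean Gaussian noise is added to the input of every self-attention layer during training only. The function names (`causal_self_attention`, `encode`), the `noise_std` parameter, the single-head attention, and the residual connection are all hypothetical choices for illustration. This is not the authors' ADRRec implementation, and the mix-attention mechanism is not shown.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over x of shape (seq_len, d)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Causal mask: position t may only attend to positions <= t.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def encode(x, layers, noise_std=0.1, training=False, rng=None):
    """Run x through a stack of attention layers.

    Assumption: "layer-wise noise injection" is modeled here as Gaussian
    noise added to each layer's input, and only when training is True.
    """
    rng = rng if rng is not None else np.random.default_rng()
    h = x
    for w_q, w_k, w_v in layers:
        if training and noise_std > 0:
            h = h + rng.normal(0.0, noise_std, size=h.shape)
        # Residual connection around the attention block (illustrative choice).
        h = h + causal_self_attention(h, w_q, w_k, w_v)
    return h

# Usage: two layers over a toy sequence of 5 item embeddings of width 8.
rng = np.random.default_rng(0)
d, seq_len = 8, 5
layers = [tuple(rng.normal(0.0, 0.1, (d, d)) for _ in range(3)) for _ in range(2)]
x = rng.normal(size=(seq_len, d))
clean = encode(x, layers, training=False)
noisy = encode(x, layers, noise_std=0.1, training=True, rng=np.random.default_rng(1))
```

In this sketch the noise acts as a regularizer in the same spirit as dropout: the network sees perturbed hidden states at every depth during training, while inference stays deterministic.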
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques
