Attention Mixtures for Time-Aware Sequential Recommendation
Viet-Anh Tran, Guillaume Salha-Galvan, Bruno Sguerra, Romain Hennequin

TL;DR
This paper introduces MOJITO, a Transformer-based recommender system that uses Gaussian mixture attention to better model user preferences and temporal context, leading to improved prediction accuracy.
Contribution
The paper presents MOJITO, a novel Transformer architecture employing Gaussian mixture attention for enhanced time-aware sequential recommendation.
Findings
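MOJITO outperforms existing Transformer models for sequential recommendation on several real-world datasets
It models the complex dependencies between user preferences and temporal context that existing architectures often overlook
These gains are demonstrated empirically, as improved accuracy in predicting the next items to recommend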
Abstract
Transformers have emerged as powerful methods for sequential recommendation. However, existing architectures often overlook the complex dependencies between user preferences and the temporal context. In this short paper, we introduce MOJITO, an improved Transformer sequential recommender system that addresses this limitation. MOJITO leverages Gaussian mixtures of attention-based temporal context and item embedding representations for sequential modeling. This approach makes it possible to accurately predict which items should be recommended next to users, depending on their past actions and the temporal context. We demonstrate the relevance of our approach by empirically outperforming existing Transformers for sequential recommendation on several real-world datasets.
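To make the idea of mixing item-based and context-based attention more concrete, here is a minimal sketch in plain Python. It is not the authors' formulation: the abstract does not give the equations, so the combination rule below (a weighted sum of Gaussian-perturbed score matrices, followed by a causal softmax) is an assumption chosen for illustration. The names `mixture_attention`, `p_item`, and `sigma`, as well as the toy embeddings, are hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax; entries equal to -inf get weight 0."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot_scores(queries, keys):
    """Scaled dot-product scores between every query and every key."""
    d = len(queries[0])
    return [
        [sum(q * k for q, k in zip(qi, kj)) / math.sqrt(d) for kj in keys]
        for qi in queries
    ]


def mixture_attention(item_emb, ctx_emb, p_item=0.5, sigma=0.1, rng=None):
    """Illustrative mixture of two attention score matrices (an assumption,
    not the paper's exact method).

    Each score is treated as the mean of a Gaussian with standard deviation
    `sigma`; one sample is drawn per source and the two samples are combined
    with weights `p_item` and `1 - p_item`.
    """
    rng = rng or random.Random(0)
    n = len(item_emb)
    item_scores = dot_scores(item_emb, item_emb)  # item-to-item affinity
    ctx_scores = dot_scores(ctx_emb, ctx_emb)     # temporal-context affinity
    weights = []
    for i in range(n):
        row = []
        for j in range(n):
            if j > i:
                # Causal mask: position i must not attend to future items.
                row.append(float("-inf"))
                continue
            s_item = rng.gauss(item_scores[i][j], sigma)
            s_ctx = rng.gauss(ctx_scores[i][j], sigma)
            row.append(p_item * s_item + (1.0 - p_item) * s_ctx)
        weights.append(softmax(row))
    return weights


def attend(weights, values):
    """Weighted average of value vectors, one output vector per position."""
    dim = len(values[0])
    return [
        [sum(w * v[c] for w, v in zip(row, values)) for c in range(dim)]
        for row in weights
    ]


# Toy sequence of three interactions with 2-dimensional embeddings.
item_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ctx_emb = [[0.5, 0.5], [0.5, -0.5], [0.4, 0.6]]  # e.g. time-of-day features

weights = mixture_attention(item_emb, ctx_emb, p_item=0.7, sigma=0.05)
outputs = attend(weights, item_emb)
```

In this sketch, `p_item` controls how much the attention pattern follows item similarity versus temporal-context similarity, and `sigma` controls how much randomness is injected into the scores. Setting `sigma=0` reduces it to a deterministic weighted sum of the two score matrices.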
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Advanced Graph Neural Networks
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Dropout · Residual Connection · Softmax · Byte Pair Encoding
