Lightweight Self-Attentive Sequential Recommendation
Yang Li, Tong Chen, Peng-Fei Zhang, Hongzhi Yin

TL;DR
This paper proposes LSAN, a lightweight self-attentive network for sequential recommendation. Compositional embeddings shrink the memory footprint of the item embedding matrix, and a twin-attention mechanism targets the redundancy of multi-head self-attention, so the model improves recommendation accuracy while using less memory.
Contribution
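The paper introduces a lightweight self-attentive network that combines compositional embeddings with twin-attention. The former addresses the memory cost of the item embedding matrix in sequential recommenders; the latter addresses the redundant attention units of multi-head self-attention.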
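As a rough illustration of how a compositional embedding can cut memory, the sketch below builds each item vector from two small base tables rather than storing one row per item. It uses a generic quotient-remainder composition chosen only for illustration; this summary does not specify LSAN's actual composition scheme, so the class name, the table sizes, and the element-wise product used to combine base vectors are all assumptions.

```python
import numpy as np


class QuotientRemainderEmbedding:
    """Hypothetical compositional embedding (not the paper's exact method).

    Each item id is split into a quotient and a remainder, and the item
    vector is the element-wise product of one row from each base table.
    """

    def __init__(self, num_items, dim, num_buckets, seed=0):
        rng = np.random.default_rng(seed)
        self.num_items = num_items
        self.num_buckets = num_buckets
        # Ceiling division: number of distinct quotients needed to cover all ids.
        num_quotients = -(-num_items // num_buckets)
        self.remainder_table = rng.normal(0.0, 0.1, size=(num_buckets, dim))
        self.quotient_table = rng.normal(0.0, 0.1, size=(num_quotients, dim))

    def lookup(self, item_ids):
        """Return one composed vector per item id, shape (len(item_ids), dim)."""
        item_ids = np.asarray(item_ids)
        remainders = item_ids % self.num_buckets
        quotients = item_ids // self.num_buckets
        return self.remainder_table[remainders] * self.quotient_table[quotients]

    def num_parameters(self):
        """Total number of stored floats across both base tables."""
        return self.remainder_table.size + self.quotient_table.size


# Illustrative sizes only: 10,000 items, 32-dimensional vectors, 100 buckets.
emb = QuotientRemainderEmbedding(num_items=10_000, dim=32, num_buckets=100)
vectors = emb.lookup([0, 1, 9_999])
full_table_parameters = 10_000 * 32
```

With these illustrative sizes the two base tables hold 6,400 values, against 320,000 for a full one-row-per-item table. Every item id still maps to a distinct (quotient, remainder) pair, so items receive distinct composed vectors, at the cost of sharing parameters across items.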
Findings
LSAN outperforms existing models in recommendation accuracy.
LSAN significantly reduces memory consumption.
The twin-attention mechanism effectively captures item dependencies.
Abstract
Modern deep neural networks (DNNs) have greatly facilitated the development of sequential recommender systems by achieving state-of-the-art recommendation performance on various sequential recommendation tasks. Given a sequence of interacted items, existing DNN-based sequential recommenders commonly embed each item into a unique vector to support subsequent computations of the user interest. However, due to the potentially large number of items, the over-parameterised item embedding matrix of a sequential recommender has become a memory bottleneck for efficient deployment in resource-constrained environments, e.g., smartphones and other edge devices. Furthermore, we observe that the widely-used multi-head self-attention, though being effective in modelling sequential dependencies among items, heavily relies on redundant attention units to fully capture both global and local item-item…
