Efficient On-Device Session-Based Recommendation
Xin Xia, Junliang Yu, Qinyong Wang, Chaoqun Yang, Quoc Viet Hung Nguyen, Hongzhi Yin

TL;DR
This paper introduces a compositional encoding approach for on-device session-based recommendation systems, significantly improving inference speed and model compression while maintaining high recommendation accuracy.
Contribution
It proposes a novel compositional encoding method combined with self-supervised knowledge distillation to enhance efficiency and performance of resource-constrained on-device recommenders.
Findings
8x inference speedup over existing methods
Large model compression with maintained accuracy
Superior recommendation performance on benchmark datasets
Abstract
On-device session-based recommendation systems have been attracting increasing attention on account of their low energy/resource consumption and privacy protection while providing promising recommendation performance. To fit powerful neural session-based recommendation models into resource-constrained mobile devices, tensor-train decomposition and its variants have been widely applied to reduce the memory footprint by decomposing the embedding table into smaller tensors, showing great potential in compressing recommendation models. However, these model compression techniques significantly increase the local inference time because forming each item embedding requires a complex process of generating index lists followed by a series of tensor multiplications, and the resultant on-device recommender fails to provide real-time response and recommendation. To improve the online recommendation efficiency, we propose…
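The lookup cost the abstract attributes to tensor-train (TT) compression can be illustrated with a minimal sketch. It is not the paper's proposed method and not the authors' code; it is a generic TT embedding lookup written under stated assumptions: three cores, a hypothetical catalogue of 10 x 10 x 10 = 1000 items, an embedding size of 4 x 4 x 4 = 64, and a TT rank of 8. The function names (`tt_index_list`, `make_cores`, `tt_embedding`) are invented for illustration.

```python
import numpy as np

def tt_index_list(item_id, item_factors):
    """Split a flat item id into one index per core (mixed-radix digits)."""
    idx = []
    for n in reversed(item_factors):
        idx.append(item_id % n)
        item_id //= n
    return idx[::-1]  # most significant digit first

def make_cores(item_factors, dim_factors, rank, seed=0):
    """Random TT cores of shape (r_prev, n_k, d_k, r_next); boundary ranks are 1."""
    rng = np.random.default_rng(seed)
    ranks = [1] + [rank] * (len(item_factors) - 1) + [1]
    return [
        rng.normal(size=(ranks[k], n, d, ranks[k + 1]))
        for k, (n, d) in enumerate(zip(item_factors, dim_factors))
    ]

def tt_embedding(item_id, cores, item_factors):
    """Rebuild one item embedding with a chain of small matrix products."""
    idx = tt_index_list(item_id, item_factors)
    out = np.ones((1, 1))  # shape: (accumulated_dim, current_rank)
    for core, i in zip(cores, idx):
        piece = core[:, i, :, :]              # (r_prev, d_k, r_next)
        r_prev, d_k, r_next = piece.shape
        out = out @ piece.reshape(r_prev, d_k * r_next)
        out = out.reshape(-1, r_next)         # fold d_k into the accumulated dim
    return out.reshape(-1)

# Assumed toy configuration (not taken from the paper).
item_factors, dim_factors = (10, 10, 10), (4, 4, 4)
cores = make_cores(item_factors, dim_factors, rank=8)

full_table_params = int(np.prod(item_factors) * np.prod(dim_factors))
tt_params = sum(core.size for core in cores)
embedding = tt_embedding(123, cores, item_factors)
print(full_table_params, tt_params, embedding.shape)
```

In this toy setting the cores hold far fewer parameters than the full table, but every lookup pays for an index computation and one matrix product per core, which is the trade-off the abstract describes.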
Taxonomy
Topics: Recommender Systems and Techniques · Tensor decomposition and applications · Caching and Content Delivery
Methods: Knowledge Distillation
