Efficient On-Device Session-Based Recommendation

Xin Xia; Junliang Yu; Qinyong Wang; Chaoqun Yang; Quoc Viet Hung Nguyen; Hongzhi Yin

arXiv:2209.13422 · cs.IR · January 9, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a compositional encoding approach for on-device session-based recommendation systems, significantly improving inference speed and model compression while maintaining high recommendation accuracy.

Contribution

The paper proposes a compositional encoding method, combined with self-supervised knowledge distillation, to improve the efficiency and accuracy of resource-constrained on-device recommenders.
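As a rough illustration of what a compositional encoding can look like, the sketch below builds an item embedding by summing one vector from each of a few small codebooks. This is one common, generic form of the idea and not necessarily the scheme used in the paper, whose details are not given in this summary. The function names, the sizes, and the `placeholder_codes` assignment (base-K digits of the item id) are all assumptions made for illustration; in practice the codes would normally be learned.

```python
import random


def make_codebooks(num_codebooks, codebook_size, dim, seed=0):
    """Create small random codebooks: [num_codebooks][codebook_size][dim]."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(codebook_size)]
        for _ in range(num_codebooks)
    ]


def placeholder_codes(item_id, num_codebooks, codebook_size):
    """Hypothetical code assignment: base-K digits of the item id.

    A real system would typically learn these discrete codes.
    """
    codes = []
    for _ in range(num_codebooks):
        codes.append(item_id % codebook_size)
        item_id //= codebook_size
    return codes


def compose_embedding(codes, codebooks):
    """Sum one codebook vector per code; only table reads and additions."""
    dim = len(codebooks[0][0])
    embedding = [0.0] * dim
    for book, code in zip(codebooks, codes):
        embedding = [e + v for e, v in zip(embedding, book[code])]
    return embedding


# Assumed toy sizes: 1000 items, 3 codebooks of 10 vectors, 8 dimensions.
codebooks = make_codebooks(num_codebooks=3, codebook_size=10, dim=8)
codes = placeholder_codes(123, num_codebooks=3, codebook_size=10)
embedding = compose_embedding(codes, codebooks)

# Codebook weights stored vs. a full 1000 x 8 embedding table.
codebook_params = 3 * 10 * 8
full_table_params = 1000 * 8
```

With these toy sizes the codebooks hold 240 weights against 8000 for a full table, and a lookup needs three reads and additions per item, plus storage for the per-item codes.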

Findings

01. 8x inference speedup over existing methods

02. Large model compression with maintained accuracy

03. Superior recommendation performance on benchmark datasets

Abstract

On-device session-based recommendation systems have been attracting increasing attention on account of their low energy/resource consumption and privacy protection while providing promising recommendation performance. To fit powerful neural session-based recommendation models into resource-constrained mobile devices, tensor-train decomposition and its variants have been widely applied to reduce the memory footprint by decomposing the embedding table into smaller tensors, showing great potential in compressing recommendation models. However, these model compression techniques significantly increase the local inference time because of the complex process of generating index lists and the series of tensor multiplications needed to form item embeddings, so the resulting on-device recommender fails to provide real-time responses and recommendations. To improve the online recommendation efficiency, we propose…
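The lookup cost that the abstract attributes to tensor-train (TT) compression can be seen in a minimal sketch of a TT-factorized embedding table. It is a generic illustration under assumed settings (three cores, 1000 = 10 × 10 × 10 items, 64 = 4 × 4 × 4 dimensions, TT ranks of 8, random weights) and is not taken from the paper's code. All function names are hypothetical.

```python
import random


def item_to_index_list(item_id, factors):
    """Mixed-radix decomposition: one index per TT core for a given item id."""
    indices = []
    for n in reversed(factors):
        indices.append(item_id % n)
        item_id //= n
    return indices[::-1]


def make_cores(factors, dims, ranks, seed=0):
    """Random TT cores; core k is laid out as [n_k][r_in][d_k][r_out]."""
    rng = random.Random(seed)
    cores = []
    for k, (n, d) in enumerate(zip(factors, dims)):
        r_in, r_out = ranks[k], ranks[k + 1]
        cores.append([
            [[[rng.gauss(0.0, 0.1) for _ in range(r_out)] for _ in range(d)]
             for _ in range(r_in)]
            for _ in range(n)
        ])
    return cores


def tt_embedding(item_id, cores, factors):
    """Rebuild one item embedding by chaining small tensor contractions."""
    state = [[1.0]]  # shape (1, r_0) with r_0 = 1
    for core, idx in zip(cores, item_to_index_list(item_id, factors)):
        core_slice = core[idx]  # shape [r_in][d][r_out]
        d, r_out = len(core_slice[0]), len(core_slice[0][0])
        new_state = []
        for row in state:  # each row has length r_in
            for j in range(d):
                new_state.append([
                    sum(row[a] * core_slice[a][j][b] for a in range(len(row)))
                    for b in range(r_out)
                ])
        state = new_state  # shape (d_1 * ... * d_k, r_out)
    return [row[0] for row in state]  # final rank is 1


def count_params(cores):
    """Total number of weights stored across all cores."""
    return sum(len(c) * len(c[0]) * len(c[0][0]) * len(c[0][0][0]) for c in cores)


factors, dims, ranks = [10, 10, 10], [4, 4, 4], [1, 8, 8, 1]
cores = make_cores(factors, dims, ranks)
vector = tt_embedding(123, cores, factors)
```

In this toy setting the cores store 3200 weights instead of the 64000 in a full 1000 × 64 table, but every lookup has to compute an index list and run one contraction per core, which is the extra inference work the abstract points to.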

Code & Models

Repositories

xiaxin1998/eodrec (PyTorch, official)

Taxonomy

Topics: Recommender Systems and Techniques · Tensor decomposition and applications · Caching and Content Delivery

Methods: Knowledge Distillation