TL;DR
This paper introduces a Dual Correction strategy for ranking distillation in recommender systems. It makes knowledge transfer more efficient by focusing on the student model's prediction errors, and it distills both user-side and item-side ranking information, which improves recommendation performance.
Contribution
It proposes the Dual Correction strategy for Distillation (DCD), which improves ranking distillation by using the student model's prediction errors to guide training and by addressing sparse implicit feedback in recommender systems.
Findings
DCD outperforms state-of-the-art baselines in the paper's experiments.
Effective correction of student model errors improves ranking accuracy.
Incorporating user and item-side information addresses data sparsity.
Abstract
Knowledge Distillation (KD), which transfers the knowledge of a well-trained large model (teacher) to a small model (student), has become an important area of research for practical deployment of recommender systems. Recently, Relaxed Ranking Distillation (RRD) has shown that distilling the ranking information in the recommendation list significantly improves the performance. However, the method still has limitations in that 1) it does not fully utilize the prediction errors of the student model, which makes the training not fully efficient, and 2) it only distills the user-side ranking information, which provides an insufficient view under the sparse implicit feedback. This paper presents Dual Correction strategy for Distillation (DCD), which transfers the ranking information from the teacher model to the student model in a more efficient manner. Most importantly, DCD uses the…
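To make the idea of "using the student's prediction errors" concrete, the following is a minimal sketch, not the paper's actual loss (the abstract is cut off before describing it). It assumes a rank-discrepancy rule: compare where the teacher and the student rank each item for one user, pick the items the student most underestimates and most overestimates, and score a teacher-preferred ordering with a Plackett-Luce style listwise loss. The function names, the parameter `k`, and the choice of loss are all hypothetical and introduced only for illustration.

```python
import numpy as np

def ranks_from_scores(scores):
    """Return the 0-based rank of each item (0 = highest score)."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(len(scores))
    return ranks

def select_corrections(teacher_scores, student_scores, k):
    """Pick the k most underestimated and k most overestimated items.

    Assumption (not from the source): discrepancy is measured as
    student rank minus teacher rank. A positive value means the
    student ranks the item lower than the teacher does.
    """
    d = ranks_from_scores(student_scores) - ranks_from_scores(teacher_scores)
    under = np.argsort(-d, kind="stable")[:k]  # student ranks these too low
    over = np.argsort(d, kind="stable")[:k]    # student ranks these too high
    return under, over

def listwise_nll(student_scores, ordered_items):
    """Plackett-Luce negative log-likelihood of an item ordering
    under the student's scores (hypothetical choice of loss)."""
    s = np.asarray(student_scores, dtype=float)[np.asarray(ordered_items)]
    loss = 0.0
    for i in range(len(s)):
        tail = s[i:]
        m = tail.max()
        log_sum_exp = m + np.log(np.exp(tail - m).sum())  # numerically stable
        loss += log_sum_exp - s[i]
    return float(loss)
```

In this sketch, a training step would place the underestimated items ahead of the overestimated ones and minimize `listwise_nll` for that ordering, so the gradient concentrates on items the student currently gets wrong. The same functions could be applied to an item's ranked list of users to obtain an item-side view, although how the paper combines the two sides is not stated in the text above.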
