DE-RRD: A Knowledge Distillation Framework for Recommender System

SeongKu Kang; Junyoung Hwang; Wonbin Kweon; Hwanjo Yu

arXiv:2012.04357·cs.LG·December 9, 2020

TL;DR

DE-RRD is a knowledge distillation framework for recommender systems that transfers both latent knowledge and prediction-based knowledge from a teacher model to a compact student model, so the student recommends more accurately while keeping its lower inference latency.

Contribution

The paper introduces DE-RRD, combining latent knowledge transfer via experts and relaxed ranking distillation, advancing model compression techniques for recommender systems.
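As a rough illustration of the two kinds of knowledge named here, the sketch below computes one loss on latent representations and one on rankings. It is a generic example, not the paper's DE or RRD formulation: the linear projection `proj`, the mean-squared-error objective, the Plackett-Luce style list-wise loss, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def latent_distillation_loss(student_emb, teacher_emb, proj):
    """MSE between projected student embeddings and teacher embeddings.

    Assumption: a single linear map `proj` of shape (d_student, d_teacher)
    stands in for whatever mapping a real framework would learn.
    """
    reconstructed = student_emb @ proj
    return float(np.mean((reconstructed - teacher_emb) ** 2))


def ranking_distillation_loss(student_scores, teacher_scores, top_k):
    """Negative log-likelihood of the teacher's top-k order under the
    student's scores (a generic Plackett-Luce list-wise loss, used here
    only as a stand-in for prediction-based distillation).
    """
    order = np.argsort(-teacher_scores)   # teacher ranking, best item first
    s = student_scores[order]             # student scores in teacher order
    loss = 0.0
    for i in range(top_k):
        tail = s[i:]                      # items not yet placed
        m = tail.max()                    # stabilise the log-sum-exp
        log_norm = m + np.log(np.sum(np.exp(tail - m)))
        loss -= s[i] - log_norm
    return float(loss)


def total_loss(base_loss, latent_loss, ranking_loss, lam_latent, lam_rank):
    """Weighted sum of the student's own loss and both distillation terms.
    The weights are hypothetical hyperparameters."""
    return base_loss + lam_latent * latent_loss + lam_rank * ranking_loss
```

In this sketch the latent term is zero when the projected student embedding reproduces the teacher's exactly, and the ranking term shrinks as the student orders the teacher's top-k items the same way the teacher does.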

Findings

01 Outperforms state-of-the-art distillation methods

02 Achieves performance comparable to or better than the teacher models

03 Reduces inference latency significantly

Abstract

Recent recommender systems have started to employ knowledge distillation, which is a model compression technique distilling knowledge from a cumbersome model (teacher) to a compact model (student), to reduce inference latency while maintaining performance. The state-of-the-art methods have only focused on making the student model accurately imitate the predictions of the teacher model. They have a limitation in that the prediction results incompletely reveal the teacher's knowledge. In this paper, we propose a novel knowledge distillation framework for recommender system, called DE-RRD, which enables the student model to learn from the latent knowledge encoded in the teacher model as well as from the teacher's predictions. Concretely, DE-RRD consists of two methods: 1) Distillation Experts (DE) that directly transfers the latent knowledge from the teacher model. DE exploits "experts"…

Taxonomy

Methods: Knowledge Distillation