Exploring Feature-based Knowledge Distillation for Recommender System: A Frequency Perspective
Zhangchi Zhu, Wei Zhang

TL;DR
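This paper analyzes feature-based knowledge distillation in recommender systems from a frequency perspective. It shows that standard distillation weights the loss on every frequency component equally, which causes important knowledge to be overlooked, and proposes a lightweight reweighting method, FreqD, that improves recommendation performance.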
Contribution
The paper introduces a frequency-based analysis of feature-based knowledge distillation, shows that equal loss weighting causes important knowledge to be overlooked, and proposes FreqD, a lightweight knowledge reweighting method that improves recommendation accuracy.
Findings
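FreqD consistently outperforms state-of-the-art knowledge distillation methods for recommender systems
Under the frequency perspective, equal loss weighting causes important knowledge to be overlooked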
Reweighting improves recommendation performance
Abstract
In this paper, we analyze the feature-based knowledge distillation for recommendation from the frequency perspective. By defining knowledge as different frequency components of the features, we theoretically demonstrate that regular feature-based knowledge distillation is equivalent to equally minimizing losses on all knowledge and further analyze how this equal loss weight allocation method leads to important knowledge being overlooked. In light of this, we propose to emphasize important knowledge by redistributing knowledge weights. Furthermore, we propose FreqD, a lightweight knowledge reweighting method, to avoid the computational cost of calculating losses on each knowledge. Extensive experiments demonstrate that FreqD consistently and significantly outperforms state-of-the-art knowledge distillation methods for recommender systems. Our code is available at…
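The equivalence claimed in the abstract can be illustrated with a small numerical sketch. It is not the authors' FreqD implementation: the random graph, the use of graph-Laplacian eigenvectors as the frequency basis, the function names, and the low-pass weights are all assumptions chosen for illustration. The sketch only shows that, for any orthonormal basis, the plain squared-error feature loss equals the sum of per-component losses with equal weights, and that changing those weights changes which components the student is pushed to match.

```python
import numpy as np

def frequency_losses(teacher, student, basis):
    """Squared error on each component after projecting onto an orthonormal basis."""
    diff_hat = basis.T @ (teacher - student)  # shape: (n_components, feature_dim)
    return np.sum(diff_hat ** 2, axis=1)

def reweighted_loss(teacher, student, basis, weights):
    """Weighted sum of the per-component losses (hypothetical helper)."""
    return float(np.dot(weights, frequency_losses(teacher, student, basis)))

rng = np.random.default_rng(0)
n, d = 8, 4  # 8 nodes, 4-dimensional features (toy sizes)

# Assumption: a random undirected graph stands in for the interaction graph.
upper = np.triu((rng.random((n, n)) < 0.4).astype(float), 1)
adjacency = upper + upper.T
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency

# eigh returns ascending eigenvalues and orthonormal eigenvectors (as columns).
eigvals, basis = np.linalg.eigh(laplacian)

teacher = rng.normal(size=(n, d))
student = rng.normal(size=(n, d))

# Plain feature-based distillation loss: squared Frobenius norm of the difference.
plain = float(np.sum((teacher - student) ** 2))

# The same loss written as an equally weighted sum over frequency components.
equal = reweighted_loss(teacher, student, basis, np.ones(n))

# Assumption: weights that emphasize low-frequency components.
low_pass_weights = 1.0 / (1.0 + eigvals)
weighted = reweighted_loss(teacher, student, basis, low_pass_weights)

print(np.isclose(plain, equal))
```

Because the basis is orthonormal, `plain` and `equal` agree up to floating-point error, which is the "equal weights on all knowledge" view of regular feature distillation. Note that this sketch computes every per-component loss explicitly; the abstract states that FreqD is designed to avoid that cost.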
Taxonomy
Topics: Recommender Systems and Techniques
Methods: Knowledge Distillation
