Ensemble Knowledge Distillation for CTR Prediction
Jieming Zhu, Jinyang Liu, Weiqi Li, Jincai Lai, Xiuqiang He, Liang Chen, Zibin Zheng

TL;DR
This paper introduces an ensemble knowledge distillation approach for CTR prediction in which a simple student model (a vanilla DNN) learns from teacher models, keeping online inference fast while improving accuracy. It uses techniques such as teacher gating and early stopping and is validated through extensive experiments.
Contribution
It proposes a new ensemble knowledge distillation method with techniques like teacher gating and early stopping to enhance CTR prediction accuracy and efficiency.
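As an illustration of what teacher gating could look like, the sketch below combines the predictions of several teachers for one sample using softmax-normalized weights. The names `softmax` and `gated_teacher_prediction`, and the assumption that the gate produces one raw score per teacher per sample, are hypothetical; this summary does not specify the paper's exact gating design.

```python
import math


def softmax(scores):
    """Turn raw gate scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def gated_teacher_prediction(teacher_probs, gate_scores):
    """Weighted average of teacher click probabilities for one sample.

    teacher_probs: one predicted click probability per teacher.
    gate_scores:   one raw score per teacher (assumed to come from a small
                   gating network; hypothetical, not the paper's exact design).
    """
    weights = softmax(gate_scores)
    return sum(w * p for w, p in zip(weights, teacher_probs))


# Equal scores reduce the gate to a plain average of the teachers.
uniform = gated_teacher_prediction([0.2, 0.4, 0.9], [0.0, 0.0, 0.0])

# A dominant score makes the soft target follow that teacher closely.
peaked = gated_teacher_prediction([0.2, 0.4, 0.9], [0.0, 0.0, 20.0])
```

With equal gate scores the result is the simple ensemble mean; unequal scores let the soft target lean on whichever teacher the gate favors for that sample.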
Findings
Ensemble KD improves CTR prediction accuracy over individual models.
The proposed methods outperform 12 existing models in experiments.
Online and offline tests confirm the effectiveness of the approach.
Abstract
Recently, deep learning-based models have been widely studied for click-through rate (CTR) prediction and have led to improved prediction accuracy in many industrial applications. However, current research focuses primarily on building complex network architectures to better capture sophisticated feature interactions and dynamic user behaviors. The increased model complexity may slow down online inference and hinder its adoption in real-time applications. Instead, our work targets a new model training strategy based on knowledge distillation (KD). KD is a teacher-student learning framework that transfers knowledge learned from a teacher model to a student model. The KD strategy not only allows us to simplify the student model to a vanilla DNN model but also achieves significant accuracy improvements over the state-of-the-art teacher models. The benefits thus motivate us to further explore…
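The teacher-student idea in the abstract can be made concrete with a minimal sketch of a distillation objective for binary click prediction. It assumes the common formulation in which the student is trained on a mix of the true click labels and the teacher's predicted probabilities; the function names and the mixing weight `alpha` are illustrative choices, not details confirmed by this summary.

```python
import math


def bce(target, pred, eps=1e-7):
    """Binary cross-entropy between a target probability and a prediction."""
    pred = min(max(pred, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def distillation_loss(labels, student_probs, teacher_probs, alpha=0.5):
    """Mean over samples of (1 - alpha) * hard loss + alpha * soft loss.

    labels:        true click labels (0 or 1).
    student_probs: student's predicted click probabilities.
    teacher_probs: teacher's predicted click probabilities (soft targets).
    alpha:         assumed mixing weight between hard and soft targets.
    """
    total = 0.0
    for y, s, t in zip(labels, student_probs, teacher_probs):
        hard = bce(y, s)  # fit the observed clicks
        soft = bce(t, s)  # imitate the teacher's probabilities
        total += (1.0 - alpha) * hard + alpha * soft
    return total / len(labels)


labels = [1, 0, 1]
teacher = [0.8, 0.1, 0.6]

# A student that tracks the teacher gets a lower soft loss than one that does not.
close = distillation_loss(labels, [0.8, 0.1, 0.6], teacher, alpha=1.0)
far = distillation_loss(labels, [0.3, 0.7, 0.2], teacher, alpha=1.0)
```

Setting `alpha=0` recovers ordinary supervised training on click labels, and `alpha=1` trains the student purely to imitate the teacher.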
Taxonomy
Topics: Image and Video Quality Assessment · Recommender Systems and Techniques · Green IT and Sustainability
Methods: Knowledge Distillation · Early Stopping
