Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation
Zhen Tian, Ting Bai, Zibin Zhang, Zhiyuan Xu, Kangyi Lin, Ji-Rong Wen, and Wayne Xin Zhao

TL;DR
This paper introduces KD-DAGFM, a lightweight model that efficiently learns high-order feature interactions for CTR prediction via knowledge distillation, achieving near-lossless performance with significantly reduced computational cost on industrial-scale data.
Contribution
The paper proposes a novel Directed Acyclic Graph Factorization Machine (DAGFM) that effectively distills high-order feature interactions from complex models, balancing efficiency and accuracy.
Findings
KD-DAGFM requires less than 21.5% of the FLOPs of state-of-the-art methods.
It maintains near-lossless performance in CTR prediction tasks.
It demonstrates superior efficiency on large-scale industrial datasets.
Abstract
With the growth of high-dimensional sparse data in web-scale recommender systems, the computational cost of learning high-order feature interactions in the CTR prediction task increases substantially, which limits the use of high-order interaction models in real industrial applications. Some recent knowledge-distillation-based methods transfer knowledge from complex teacher models to shallow student models to accelerate online model inference. However, they suffer from degraded model accuracy during the knowledge distillation process, and it is challenging to balance the efficiency and effectiveness of the shallow student models. To address this problem, we propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn high-order feature interactions from existing complex interaction models for CTR prediction via knowledge distillation. The proposed lightweight student model…
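The teacher-to-student transfer described in the abstract can be illustrated with a generic distillation objective. The sketch below is not taken from the paper: the function name `distillation_loss`, the weight `alpha`, and the choice of a squared-error term on logits are all assumptions made for illustration. It blends the usual CTR log-loss on click labels with a penalty that pulls the student's logits toward the teacher's.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Hypothetical blended objective (an assumption, not the paper's loss).

    alpha * mean log-loss against the click labels
    + (1 - alpha) * mean squared error between student and teacher logits.
    """
    n = len(labels)
    eps = 1e-12  # keeps log() away from 0
    log_loss = 0.0
    sq_err = 0.0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        p = min(max(sigmoid(s), eps), 1.0 - eps)
        log_loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        sq_err += (s - t) ** 2
    return alpha * log_loss / n + (1.0 - alpha) * sq_err / n


# Toy batch of three impressions (made-up numbers).
student = [0.2, -1.0, 1.5]
teacher = [0.4, -1.3, 1.7]
clicks = [1, 0, 1]
print(distillation_loss(student, teacher, clicks, alpha=0.5))
```

With `alpha=1.0` the student is trained on labels alone; with `alpha=0.0` it only imitates the teacher. The tension the abstract describes, where a shallow student loses accuracy during distillation, corresponds to the student being unable to drive the second term to zero.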
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Graph Neural Networks · Topic Modeling
Methods: Knowledge Distillation
