Privileged Features Distillation at Taobao Recommendations
Chen Xu, Quan Li, Junfeng Ge, Jinyang Gao, Xiaoyong Yang, Changhua Pei, Fei Sun, Jian Wu, Hanxiao Sun, and Wenwu Ou

TL;DR
This paper introduces privileged features distillation (PFD), a method that improves recommendation prediction accuracy by transferring knowledge from models using privileged features during training, without relying on them during inference.
Contribution
The paper proposes a novel distillation technique that leverages privileged features during training to enhance model performance in e-commerce recommendations.
Findings
+5.0% click-through rate improvement in CTR task
+2.3% conversion rate improvement in CVR task
Achieves better accuracy without increasing training time
Abstract
Features play an important role in the prediction tasks of e-commerce recommendations. To guarantee the consistency of off-line training and on-line serving, we usually utilize the same features that are both available. However, the consistency in turn neglects some discriminative features. For example, when estimating the conversion rate (CVR), i.e., the probability that a user would purchase the item if she clicked it, features like dwell time on the item detailed page are informative. However, CVR prediction should be conducted for on-line ranking before the click happens. Thus we cannot get such post-event features during serving. We define the features that are discriminative but only available during training as the privileged features. Inspired by the distillation techniques which bridge the gap between training and inference, in this work, we propose privileged features…
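The abstract is cut off before it states the training objective, so the following is only a minimal sketch of how a privileged-features distillation loss could be set up. It assumes a teacher that sees both regular and privileged features, a student that sees only regular features, and a combined loss of student cross-entropy, a distillation term weighted by a hypothetical `lam`, and teacher cross-entropy. The function names (`pfd_loss`, `bce`) and the weights `w_s` and `w_t` are illustrative and not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(target, prob, eps=1e-7):
    """Mean binary cross-entropy; target may be hard labels or soft probabilities."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob)))

def pfd_loss(y, student_logit, teacher_logit, lam=0.5):
    """Assumed objective: student loss + lam * distillation loss + teacher loss."""
    p_s = sigmoid(student_logit)   # student sees regular features only
    p_t = sigmoid(teacher_logit)   # teacher also sees privileged features
    student_term = bce(y, p_s)     # fit the hard labels
    distill_term = bce(p_t, p_s)   # match the teacher's soft predictions
    teacher_term = bce(y, p_t)     # teacher is trained on the same labels
    return student_term + lam * distill_term + teacher_term

# Toy usage: x holds regular features, x_star a privileged one (e.g. dwell time).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))
x_star = rng.normal(size=(8, 1))
y = rng.integers(0, 2, size=8).astype(float)

w_s = rng.normal(size=3)           # hypothetical student weights
w_t = rng.normal(size=4)           # hypothetical teacher weights

student_logit = x @ w_s
teacher_logit = np.hstack([x, x_star]) @ w_t
loss = pfd_loss(y, student_logit, teacher_logit, lam=0.5)
```

At serving time only the student path (`x @ w_s` here) would be evaluated, which is why the privileged column `x_star` never needs to be available online.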
Taxonomy
Topics: Recommender Systems and Techniques · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
