TL;DR
This paper introduces an enhanced doubly robust estimator for post-click conversion rate estimation that reduces variance and improves debiasing in recommender systems by employing a novel double learning approach.
Contribution
It proposes a more robust doubly robust (MRDR) estimator with lower variance than the standard DR estimator, together with a double learning scheme that better approximates the gradient for CVR estimation.
Findings
The proposed method outperforms state-of-the-art approaches on real datasets.
The double learning scheme effectively reduces variance in error imputation.
Extensive experiments validate the superiority of the approach.
Abstract
Post-click conversion is a strong signal of user preference and is valuable for building recommender systems. However, accurately estimating the post-click conversion rate (CVR) is challenging because of selection bias: observed click events usually occur on items the user already prefers. Most existing methods use counterfactual learning to debias recommender systems. Among them, the doubly robust (DR) estimator has achieved competitive performance by combining the error-imputation-based (EIB) estimator and the inverse propensity score (IPS) estimator in a doubly robust way. However, inaccurate error imputation may give the DR estimator a higher variance than the IPS estimator. Worse still, existing methods typically use simple model-agnostic methods to estimate the imputation error, which are not sufficient to approximate the dynamically changing model-correlated…
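To make the "doubly robust" combination in the abstract concrete, here is a minimal sketch of the EIB, IPS, and DR estimators in their commonly used textbook forms. It is not the paper's MRDR estimator or its double learning procedure. The function names, the toy error values, and the propensities are all assumptions chosen for illustration. The sketch enumerates every possible click pattern to compute each estimator's expectation exactly, so the double robustness property can be checked without sampling noise.

```python
from itertools import product


def eib(errors, imputed, clicked):
    """EIB: true error where a click was observed, imputed error elsewhere."""
    n = len(errors)
    return sum(o * e + (1 - o) * eh
               for e, eh, o in zip(errors, imputed, clicked)) / n


def ips(errors, propensities, clicked):
    """IPS: reweight observed errors by the inverse click propensity."""
    n = len(errors)
    return sum(o * e / p
               for e, p, o in zip(errors, propensities, clicked)) / n


def dr(errors, imputed, propensities, clicked):
    """DR: imputed error plus an IPS-weighted correction on clicked pairs."""
    n = len(errors)
    return sum(eh + o * (e - eh) / p
               for e, eh, p, o in zip(errors, imputed, propensities, clicked)) / n


def expected_value(estimator, true_props):
    """Exact expectation over independent Bernoulli click indicators."""
    total = 0.0
    for pattern in product([0, 1], repeat=len(true_props)):
        prob = 1.0
        for o, p in zip(pattern, true_props):
            prob *= p if o else 1 - p
        total += prob * estimator(pattern)
    return total


# Hypothetical toy data: three user-item pairs. In practice the error of an
# unclicked pair is unobserved; here all errors are known only so the ideal
# (unbiased) target can be computed. Each estimator multiplies the true error
# of an unclicked pair by zero, so it never actually uses it.
errors = [0.2, 0.8, 0.5]          # true prediction errors
true_props = [0.9, 0.3, 0.6]      # true click propensities
wrong_props = [0.5, 0.5, 0.5]     # misspecified propensities
wrong_imputed = [0.4, 0.4, 0.4]   # misspecified imputed errors

ideal = sum(errors) / len(errors)

# DR with correct propensities but wrong imputation.
dr_good_props = expected_value(
    lambda o: dr(errors, wrong_imputed, true_props, o), true_props)
# DR with correct imputation but wrong propensities.
dr_good_imputation = expected_value(
    lambda o: dr(errors, errors, wrong_props, o), true_props)
# EIB alone with wrong imputation, and IPS alone with wrong propensities.
eib_wrong = expected_value(
    lambda o: eib(errors, wrong_imputed, o), true_props)
ips_wrong = expected_value(
    lambda o: ips(errors, wrong_props, o), true_props)

print("ideal:", ideal)
print("DR, correct propensities only:", dr_good_props)
print("DR, correct imputation only:", dr_good_imputation)
print("EIB, wrong imputation:", eib_wrong)
print("IPS, wrong propensities:", ips_wrong)
```

In this toy setup, the DR estimate matches the ideal value in expectation when either the propensities or the imputed errors are correct, while EIB and IPS on their own are biased when their single component is misspecified. The sketch says nothing about variance, which is the quantity the paper's MRDR estimator targets.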
