Deep Deconfounded Content-based Tag Recommendation for UGC with Causal Intervention
Yaochen Zhu, Xubin Ren, Jing Yi, Zhenzhong Chen

TL;DR
This paper introduces DecTag, a deep deconfounded tag recommender system that uses causal intervention and Monte Carlo estimation to eliminate uploader bias, improving the accuracy of content-based tag recommendations for user-generated content.
Contribution
The paper proposes a novel causal graph and a Monte Carlo-based estimator for deconfounding in tag recommendation, addressing biases from uploader preferences.
Findings
DecTag outperforms existing methods in robustness to confounding bias.
The proposed estimator achieves asymptotic unbiasedness under certain assumptions.
A new dataset, YT-8M-Causal, is created for evaluating causal tag recommenders.
Abstract
Traditional content-based tag recommender systems directly learn the association between user-generated content (UGC) and tags based on collected UGC-tag pairs. However, since a UGC uploader simultaneously creates the UGC and selects the corresponding tags, her personal preference inevitably biases the tag selections, which prevents these recommenders from learning the causal influence of UGCs' content features on tags. In this paper, we propose a deep deconfounded content-based tag recommender system, namely, DecTag, to address the above issues. We first establish a causal graph to represent the relations among uploader, UGC, and tag, where the uploaders are identified as confounders that spuriously correlate UGC and tag selections. Specifically, to eliminate the confounding bias, causal intervention is conducted on the UGC node in the graph via backdoor adjustment, where uploaders'…
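The backdoor adjustment mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It only shows the general idea of estimating P(tag | do(UGC)) = Σ_u P(u) · P(tag | UGC, u) by Monte Carlo sampling of uploaders from their prior P(u), rather than from the confounded P(u | UGC). The function names (`backdoor_adjusted_probs`, `toy_tag_probs`), the scalar content feature `x`, and the two-uploader toy model are all hypothetical and chosen for illustration.

```python
import random


def backdoor_adjusted_probs(x, uploaders, tag_probs, n_samples=1000, seed=0):
    """Monte Carlo estimate of P(tag | do(x)) = sum_u P(u) * P(tag | x, u).

    `uploaders` is a list of observed uploader ids; sampling uniformly from it
    approximates the empirical prior P(u). `tag_probs(x, u)` is an assumed
    conditional model returning a list of tag probabilities.
    """
    rng = random.Random(seed)
    totals = None
    for _ in range(n_samples):
        u = rng.choice(uploaders)      # draw an uploader from the prior P(u)
        probs = tag_probs(x, u)        # P(tag | x, u) for the sampled uploader
        if totals is None:
            totals = [0.0] * len(probs)
        for i, p in enumerate(probs):
            totals[i] += p
    return [t / n_samples for t in totals]


def toy_tag_probs(x, u):
    """Hypothetical two-tag model: uploader "a" favors tag 0, "b" favors tag 1."""
    bias = {"a": 0.2, "b": -0.2}[u]
    p0 = min(max(0.5 + 0.1 * x + bias, 0.0), 1.0)
    return [p0, 1.0 - p0]


# Empirical prior: P(a) = 0.75, P(b) = 0.25.
uploaders = ["a", "a", "a", "b"]
estimate = backdoor_adjusted_probs(1.0, uploaders, toy_tag_probs, n_samples=20000)

# Closed-form value for the toy model, for comparison with the estimate.
exact = [0.75 * 0.8 + 0.25 * 0.4, 0.75 * 0.2 + 0.25 * 0.6]
```

In this toy setup the sampled average approaches the closed-form mixture as `n_samples` grows, which is the sense in which a Monte Carlo estimator of the adjusted distribution can be asymptotically unbiased. How DecTag models uploader preference and the conditional tag distribution is described in the paper itself and is not reproduced here.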
Taxonomy
Topics: Recommender Systems and Techniques · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
