Relative Advantage Debiasing for Watch-Time Prediction in Short-Video Recommendation

Emily Liu; Kuan Han; Minfeng Zhan; Bocheng Zhao; Guanyu Mu; Yang Song

arXiv:2508.11086·cs.LG·November 26, 2025

TL;DR

This paper introduces a relative advantage debiasing framework for watch-time prediction in short-video recommendation. It corrects raw watch time for confounding factors such as video duration, popularity, and individual user behavior by comparing it to reference distributions, using a two-stage architecture and distributional embeddings.

Contribution

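The paper proposes a debiasing method that converts raw watch time into a quantile-based preference signal by comparing it to empirically derived reference distributions conditioned on user and item groups, with the aim of improving recommendation quality and robustness.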

Findings

01 Significant improvement in recommendation accuracy.

02 Enhanced robustness against confounding factors.

03 Effective distributional embeddings for quantile parameterization.

Abstract

Watch time is widely used as a proxy for user satisfaction in video recommendation platforms. However, raw watch times are influenced by confounding factors such as video duration, popularity, and individual user behaviors, potentially distorting preference signals and resulting in biased recommendation models. We propose a novel relative advantage debiasing framework that corrects watch time by comparing it to empirically derived reference distributions conditioned on user and item groups. This approach yields a quantile-based preference signal and introduces a two-stage architecture that explicitly separates distribution estimation from preference learning. Additionally, we present distributional embeddings to efficiently parameterize watch-time quantiles without requiring online sampling or storage of historical data. Both offline and online experiments demonstrate significant…
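
The quantile-based signal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single hypothetical grouping key (a video-duration bucket) and uses a plain empirical CDF over stored samples, whereas the paper parameterizes quantiles with distributional embeddings precisely to avoid storing historical data. The names `build_reference` and `relative_advantage` and the toy logs are invented for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def build_reference(logs):
    """Collect historical watch times per group and sort them.

    `logs` is an iterable of (group_key, watch_time_seconds) pairs.
    The group key is a hypothetical stand-in for the user and item
    groups mentioned in the abstract.
    """
    ref = defaultdict(list)
    for group, watch_time in logs:
        ref[group].append(watch_time)
    return {group: sorted(times) for group, times in ref.items()}


def relative_advantage(ref, group, watch_time):
    """Return the empirical CDF value of `watch_time` within its group.

    This is the fraction of reference samples in the group whose watch
    time is less than or equal to the observed one, a value in (0, 1].
    """
    sample = ref[group]
    return bisect_right(sample, watch_time) / len(sample)


# Toy reference data (made up): the same 10 seconds of watching means
# something very different for a short clip than for a long one.
logs = [
    ("short", 3.0), ("short", 5.0), ("short", 8.0), ("short", 10.0),
    ("long", 10.0), ("long", 30.0), ("long", 45.0), ("long", 60.0),
]
ref = build_reference(logs)
q_short = relative_advantage(ref, "short", 10.0)
q_long = relative_advantage(ref, "long", 10.0)
print(q_short, q_long)
```

In this toy setup, a 10-second view ranks at the top of the "short" group but near the bottom of the "long" group, which is the kind of duration confounding that a group-conditioned quantile is meant to remove from the raw signal.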
