Dual-Rerank: Fusing Causality and Utility for Industrial Generative Reranking
Chao Zhang, Shuai Lin, ChengLei Dai, Ye Qian, Fan Mingyang, Yi Zhang, Yi Wang, Jingwei Zhuo

TL;DR
Dual-Rerank introduces a unified reranking framework combining causality and utility modeling, achieving state-of-the-art results in industrial search with improved efficiency and user engagement.
Contribution
It proposes a novel framework that bridges the structural and optimization gaps in generative reranking through knowledge distillation and list-wise RL optimization.
Findings
Significant improvements in user satisfaction and watch time.
Drastic reduction in inference latency.
State-of-the-art performance in production traffic.
Abstract
Kuaishou serves over 400 million daily active users, processing hundreds of millions of search queries daily against a repository of tens of billions of short videos. As the final decision layer, the reranking stage determines user experience by optimizing whole-page utility. While traditional score-and-sort methods fail to capture combinatorial dependencies, Generative Reranking offers a superior paradigm by directly modeling the permutation probability. However, deploying Generative Reranking in such a high-stakes environment faces a fundamental dual dilemma: 1) the structural trade-off, where Autoregressive (AR) models offer superior sequential modeling but suffer from prohibitive latency, whereas Non-Autoregressive (NAR) models are efficient but cannot capture dependencies between items; 2) the optimization gap where Supervised Learning faces challenges in directly optimizing whole-page…
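To make the contrast between score-and-sort and permutation modeling concrete, here is a minimal Python sketch. It is not the paper's model: the scorer `toy_score`, its repeated-category penalty, and the `relevance` and `category` inputs are hypothetical, chosen only to show how an autoregressive factorization, P(perm) = prod_t P(item_t | items before t), lets each choice depend on what has already been placed, while score-and-sort ranks every item in isolation.

```python
import math
from itertools import permutations


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_score(item, prefix, relevance, category):
    """Hypothetical context-aware score (an assumption for illustration):
    base relevance minus a penalty when the item repeats the category
    of the item placed immediately before it."""
    repeats = bool(prefix) and category[prefix[-1]] == category[item]
    return relevance[item] - (1.0 if repeats else 0.0)


def ar_permutation_prob(perm, relevance, category):
    """Autoregressive factorization of a permutation's probability:
    at each step, a softmax over the remaining items given the prefix."""
    prob = 1.0
    prefix = []
    remaining = list(range(len(relevance)))
    for item in perm:
        scores = [toy_score(c, prefix, relevance, category) for c in remaining]
        prob *= softmax(scores)[remaining.index(item)]
        prefix.append(item)
        remaining.remove(item)
    return prob


def score_and_sort(relevance):
    """Point-wise baseline: rank items by their individual scores only."""
    return sorted(range(len(relevance)), key=lambda i: -relevance[i])


relevance = [2.0, 1.9, 1.0]   # toy per-item relevance scores
category = ["a", "a", "b"]    # items 0 and 1 share a category

all_perms = list(permutations(range(3)))
best_ar = max(all_perms, key=lambda p: ar_permutation_prob(p, relevance, category))
print("score-and-sort:", score_and_sort(relevance))
print("most probable AR permutation:", list(best_ar))
```

In this toy setting the two approaches disagree: score-and-sort places the two same-category items next to each other, while the most probable permutation under the context-aware factorization separates them. The sequential loop in `ar_permutation_prob` also shows where AR latency comes from, since each step depends on the previous choice and the steps cannot be scored in one parallel pass.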
