SaFRO: Satisfaction-Aware Fusion via Dual-Relative Policy Optimization for Short-Video Search
Renzhe Zhou, Songyang Li, Feiran Zhu, Chenglei Dai, Yi Zhang, Yi Wang, Jingwei Zhuo

TL;DR
SaFRO is a novel framework that optimizes user satisfaction in short-video search by combining a satisfaction-aware reward model with a dual-relative policy optimization method, improving both ranking quality and user retention.
Contribution
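The paper introduces a satisfaction-aware reward model built on query-level behavioral proxies, together with Dual-Relative Policy Optimization (DRPO), a policy optimization approach tailored to short-video search, where data sparsity and intent constraints make direct application of reinforcement learning difficult.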
Findings
Significant improvements in ranking quality and user retention.
Outperforms state-of-the-art baselines in offline and online evaluations.
Effective modeling of inter-objective dependencies enhances context-sensitive fusion.
Abstract
Multi-Task Fusion plays a pivotal role in industrial short-video search systems by aggregating heterogeneous prediction signals into a unified ranking score. However, existing approaches predominantly optimize for immediate engagement metrics, which often fail to align with long-term user satisfaction. While Reinforcement Learning (RL) offers a promising avenue for user satisfaction optimization, its direct application to search scenarios is non-trivial due to the inherent data sparsity and intent constraints compared to recommendation feeds. To this end, we propose SaFRO, a novel framework designed to optimize user satisfaction in short-video search. We first construct a satisfaction-aware reward model that utilizes query-level behavioral proxies to capture holistic user satisfaction beyond item-level interactions. Then we introduce Dual-Relative Policy Optimization (DRPO), an…
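To make the Multi-Task Fusion step in the abstract concrete, the following is a minimal sketch of aggregating several prediction signals into one ranking score. It is not the SaFRO method: the weighted-sum form, the function names `fuse_scores` and `rank_candidates`, and the signal names `click` and `long_view` are all assumptions chosen for illustration.

```python
from typing import Dict, List


def fuse_scores(preds: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-task predictions into a single ranking score.

    Assumption: a plain weighted sum. Production systems may use other
    fusion forms; the paper's actual fusion policy is not shown here.
    """
    return sum(weights[name] * preds[name] for name in weights)


def rank_candidates(candidates: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Sort candidates by fused score, highest first."""
    return sorted(
        candidates,
        key=lambda c: fuse_scores(c["preds"], weights),
        reverse=True,
    )


# Hypothetical candidates with two illustrative prediction signals.
candidates = [
    {"id": "a", "preds": {"click": 0.9, "long_view": 0.1}},
    {"id": "b", "preds": {"click": 0.5, "long_view": 0.8}},
]

# Weighting only the click signal favors "a"; weighting both favors "b".
click_only = rank_candidates(candidates, {"click": 1.0, "long_view": 0.0})
balanced = rank_candidates(candidates, {"click": 1.0, "long_view": 1.0})
```

The two weightings produce different orderings of the same candidates, which illustrates why the choice of fusion weights determines whether the ranking tracks immediate engagement or some broader objective.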
Taxonomy
Topics: Visual Attention and Saliency Detection · Multimodal Machine Learning Applications · Information Retrieval and Search Behavior
