Optimizing Generative Ranking Relevance via Reinforcement Learning in Xiaohongshu Search

Ziyang Zeng; Heming Jing; Jindong Chen; Xiangli Li; Hongyu Liu; Yixuan He; Zhengyu Li; Yige Sun; Zheyong Xie; Yuqing Yang; Shaosheng Cao; Jun Fan; Yi Wu; Yao Hu

arXiv:2512.00968 · cs.IR · December 30, 2025

TL;DR

This paper introduces a reinforcement learning framework to improve generative relevance models in Xiaohongshu search by grounding reasoning in business-specific criteria, leading to better relevance and business outcomes.

Contribution

The paper proposes an RL-based training method with Stepwise Advantage Masking for relevance modeling, aimed at improving interpretability and performance in industrial search systems.

Findings

01 Significant improvements in relevance metrics

02 Enhanced robustness and interpretability

03 Effective model distillation for deployment

Abstract

Ranking relevance is a fundamental task in search engines, aiming to identify the items most relevant to a given user query. Traditional relevance models typically produce scalar scores or directly predict relevance labels, limiting both interpretability and the modeling of complex relevance signals. Inspired by recent advances in Chain-of-Thought (CoT) reasoning for complex tasks, we investigate whether explicit reasoning can enhance both interpretability and performance in relevance modeling. However, existing reasoning-based Generative Relevance Models (GRMs) primarily rely on supervised fine-tuning on large amounts of human-annotated or synthetic CoT data, which often leads to limited generalization. Moreover, domain-agnostic, free-form reasoning tends to be overly generic and insufficiently grounded, limiting its potential to handle the diverse and ambiguous cases prevalent in…

Taxonomy

Topics: Information Retrieval and Search Behavior · Advanced Text Analysis Techniques · Expert finding and Q&A systems