Reinforcement Learning to Rank Using Coarse-grained Rewards

Yiteng Tu; Zhichao Xu; Tao Yang; Weihang Su; Yujia Zhou; Yiqun Liu; Fen Lin; Qin Liu; Qingyao Ai

arXiv:2208.07563 · cs.IR · August 21, 2025 · 1 citation

Open Access

TL;DR

This paper explores reinforcement learning for ranking tasks using coarse-grained feedback signals, demonstrating that RL can outperform traditional supervised methods even with less detailed rewards.

Contribution

It introduces new RL-based methods tailored for coarse-grained rewards in ranking and systematically compares them with supervised approaches on large-scale benchmarks.

Findings

01. RL methods outperform supervised baselines with coarse rewards

02. RL can effectively optimize ranking without fine-grained labels

03. Large-scale experiments validate RL's potential in IR tasks

Abstract

Learning to rank (LTR) plays a crucial role in various Information Retrieval (IR) tasks. Although supervised LTR methods based on fine-grained relevance labels (e.g., document-level annotations) have achieved significant success, their reliance on costly and potentially biased annotations limits scalability and alignment with realistic goals. In contrast, coarse-grained feedback signals, such as duration time and session-level engagement, are more accessible and affordable. Reinforcement Learning (RL) offers a promising framework to directly optimize these objectives using reward signals, but most existing Reinforcement Learning to Rank (RLTR) approaches suffer from high variance and low sample efficiency. Motivated by recent advances in large language models (LLMs), we re-examine the problem of RLTR with coarse-grained rewards and propose new RLTR methods based on widely used RL…
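To make the setting in the abstract concrete, here is a minimal sketch of a policy-gradient ranker trained from a single list-level reward. It is not the paper's method: the abstract is cut off before the proposed algorithms are named, so this uses plain REINFORCE with a Plackett-Luce sampling policy and a batch-mean baseline (a common way to reduce the variance the abstract mentions). The function names, the toy reward, and the hyperparameters are all assumptions chosen for illustration.

```python
import math
import random


def sample_ranking(scores, rng):
    """Sample a permutation from a Plackett-Luce distribution over scores."""
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        m = max(scores[i] for i in remaining)  # subtract max for stability
        weights = [math.exp(scores[i] - m) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def grad_log_prob(scores, ranking):
    """Gradient of log P(ranking) with respect to each document score."""
    grad = [0.0] * len(scores)
    for pos, chosen in enumerate(ranking):
        tail = ranking[pos:]  # documents still unplaced at this position
        m = max(scores[i] for i in tail)
        exps = {i: math.exp(scores[i] - m) for i in tail}
        z = sum(exps.values())
        grad[chosen] += 1.0
        for i in tail:
            grad[i] -= exps[i] / z
    return grad


def reinforce_step(scores, reward_fn, rng, n_samples=16, lr=0.5):
    """One REINFORCE update using a batch-mean baseline (an assumption)."""
    samples = [sample_ranking(scores, rng) for _ in range(n_samples)]
    rewards = [reward_fn(r) for r in samples]
    baseline = sum(rewards) / n_samples
    update = [0.0] * len(scores)
    for ranking, reward in zip(samples, rewards):
        g = grad_log_prob(scores, ranking)
        for i, gi in enumerate(g):
            update[i] += (reward - baseline) * gi / n_samples
    return [s + lr * u for s, u in zip(scores, update)]


def train(n_docs=4, good_doc=2, steps=200, seed=0):
    """Toy loop: the only feedback is one scalar per displayed list."""
    rng = random.Random(seed)

    # Hypothetical coarse-grained reward: 1.0 if the (hidden) good document
    # was shown first, else 0.0. No per-document labels are ever observed.
    def session_reward(ranking):
        return 1.0 if ranking[0] == good_doc else 0.0

    scores = [0.0] * n_docs
    for _ in range(steps):
        scores = reinforce_step(scores, session_reward, rng)
    return scores
```

Calling `train()` returns the learned per-document scores; sorting documents by score gives the deterministic ranking. In a real system the scores would come from a parameterized model of query-document features, and the reward would be a logged signal such as session-level engagement rather than this toy function.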

Taxonomy

Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Mobile Crowdsensing and Crowdsourcing