Policy-Gradient Training of Language Models for Ranking

Ge Gao; Jonathan D. Chang; Claire Cardie; Kianté Brantley; Thorsten Joachims

arXiv:2310.04407·cs.CL·November 26, 2024

TL;DR

This paper introduces Neural PG-RANK, a policy gradient-based training method for language model retrievers that directly optimizes ranking quality, reducing reliance on heuristics and improving performance in text retrieval tasks.

Contribution

Neural PG-RANK is a novel end-to-end training algorithm that models ranking as a Plackett-Luce policy, aligning training objectives with downstream decision quality.
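As a rough illustration of what a Plackett-Luce ranking policy means, the sketch below samples a ranking by repeatedly drawing one remaining document with probability proportional to the exponential of its score, and computes the log-probability of a given ranking. It is a minimal sketch, not the paper's implementation: the scores are plain floats here (in the paper they would come from the language model), and the function names `sample_ranking` and `log_prob` are hypothetical.

```python
import math
import random


def sample_ranking(scores, rng):
    """Sample a ranking (list of document indices) from a Plackett-Luce
    distribution: at each position, pick one of the remaining documents
    with probability proportional to exp(score)."""
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        # Subtract the max score for numerical stability.
        m = max(scores[i] for i in remaining)
        weights = [math.exp(scores[i] - m) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def log_prob(scores, ranking):
    """Log-probability of `ranking` under the Plackett-Luce model:
    a sum of log-softmax terms, each over the documents not yet placed."""
    remaining = list(ranking)
    total = 0.0
    for doc in ranking:
        m = max(scores[i] for i in remaining)
        log_z = m + math.log(sum(math.exp(scores[i] - m) for i in remaining))
        total += scores[doc] - log_z
        remaining.remove(doc)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [2.0, 0.5, -1.0]  # illustrative relevance scores for 3 documents
    ranking = sample_ranking(scores, rng)
    print(ranking, log_prob(scores, ranking))
```

With three equal scores, every ranking has probability 1/6, which gives a quick sanity check for `log_prob`.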

Findings

01 Significant in-domain performance improvements.

02 Enhanced out-of-domain generalization.

03 Effective unification of training and decision metrics.

Abstract

Text retrieval plays a crucial role in incorporating factual knowledge for decision making into language processing pipelines, ranging from chat-based web search to question answering systems. Current state-of-the-art text retrieval models leverage pre-trained large language models (LLMs) to achieve competitive performance, but training LLM-based retrievers via typical contrastive losses requires intricate heuristics, including selecting hard negatives and using additional supervision as learning signals. This reliance on heuristics stems from the fact that the contrastive loss itself is heuristic and does not directly optimize the downstream metrics of decision quality at the end of the processing pipeline. To address this issue, we introduce Neural PG-RANK, a novel training algorithm that learns to rank by instantiating an LLM as a Plackett-Luce ranking policy. Neural PG-RANK provides…
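To make the idea of optimizing a ranking metric directly more concrete, here is a hedged sketch of a score-function (REINFORCE-style) gradient estimate for a Plackett-Luce policy. Several things are assumptions for illustration and are not taken from the text above: the utility is DCG, the baseline is a leave-one-out average over sampled rankings, the gradient is taken with respect to raw per-document scores rather than language-model parameters, and all function names are hypothetical.

```python
import math
import random


def sample_ranking(scores, rng):
    """Sample a ranking from a Plackett-Luce distribution over `scores`."""
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        m = max(scores[i] for i in remaining)
        weights = [math.exp(scores[i] - m) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def grad_log_prob(scores, ranking):
    """Gradient of log P(ranking | scores) with respect to each score."""
    grad = [0.0] * len(scores)
    remaining = list(ranking)
    for doc in ranking:
        m = max(scores[i] for i in remaining)
        exps = {i: math.exp(scores[i] - m) for i in remaining}
        z = sum(exps.values())
        grad[doc] += 1.0               # the chosen document
        for i in remaining:
            grad[i] -= exps[i] / z     # minus softmax over remaining documents
        remaining.remove(doc)
    return grad


def dcg(ranking, relevance):
    """Discounted cumulative gain of a ranking (assumed utility)."""
    return sum(relevance[d] / math.log2(pos + 2) for pos, d in enumerate(ranking))


def policy_gradient(scores, relevance, num_samples, rng):
    """Monte Carlo estimate of d E[DCG] / d scores.
    Uses a leave-one-out baseline, so num_samples must be at least 2."""
    rankings = [sample_ranking(scores, rng) for _ in range(num_samples)]
    utils = [dcg(r, relevance) for r in rankings]
    total = sum(utils)
    grad = [0.0] * len(scores)
    for r, u in zip(rankings, utils):
        baseline = (total - u) / (num_samples - 1)
        g = grad_log_prob(scores, r)
        for i in range(len(scores)):
            grad[i] += (u - baseline) * g[i] / num_samples
    return grad


if __name__ == "__main__":
    rng = random.Random(0)
    # Document 0 is the only relevant one; all scores start equal.
    print(policy_gradient([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], 500, rng))
```

In this toy setup the estimated gradient should push the score of the relevant document up. In an actual retriever the same quantity would be backpropagated into the model that produces the scores.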

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Multimodal Machine Learning Applications