ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking

Xianming Li; Aamir Shakir; Rui Huang; Tsz-fung Andrew Lee; Julius Lipp; Benjamin Clavié; Jing Li

arXiv:2506.03487·cs.IR·April 17, 2026

TL;DR

ProRank is a two-stage training method that combines reinforcement learning with fine-grained score learning to improve small language models at document reranking, reaching performance comparable to or better than that of large models.

Contribution

ProRank pairs reinforcement learning, used to improve a small language model's understanding of the task prompt, with fine-grained score learning, used to improve its reranking quality.

Findings

01 ProRank outperforms state-of-the-art open-source and proprietary rerankers.

02 A 0.5B ProRank model surpasses large LLM rerankers on the BEIR benchmark.

03 Proper training enables small models to achieve high reranking performance.

Abstract

Reranking is fundamental to information retrieval and retrieval-augmented generation, and recent Large Language Models (LLMs) have significantly advanced reranking quality. Most current work relies on large-scale LLMs (>7B parameters), which incur high computational costs. Small Language Models (SLMs) offer a promising alternative because of their computational efficiency. However, our preliminary quantitative analysis reveals two key limitations of SLMs: their representation space is narrow, which reduces expressiveness, and they struggle to understand task prompts without fine-tuning. To address these issues, we introduce ProRank, a novel two-stage training approach for SLM-based document reranking. We propose using reinforcement learning to improve the understanding of task prompts. Additionally, we introduce fine-grained score learning to enhance representation expressiveness and…
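
To make the reranking setting in the abstract concrete, the following is a minimal sketch, not the ProRank implementation. It assumes a reranker is any function that assigns a relevance score to a (query, document) pair, and it shows how candidates would be reordered by that score. The names `rerank`, `toy_overlap_score`, and `binary_label_reward` are hypothetical. The word-overlap scorer stands in for a trained model, and the reward function is only one plausible shape for a reinforcement learning signal that checks whether a model followed a prompt asking for a 0/1 relevance label; the abstract does not specify the reward actually used.

```python
from typing import Callable, List, Optional, Tuple


def rerank(
    query: str,
    docs: List[str],
    score_fn: Callable[[str, str], float],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score each document against the query and sort by descending score."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # Python's sort is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


def toy_overlap_score(query: str, doc: str) -> float:
    """Placeholder scorer (assumption): fraction of query words found in the doc.

    A real reranker would replace this with a model-produced relevance score.
    """
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def binary_label_reward(model_output: str, gold_label: int) -> float:
    """Hypothetical reward (assumption, not taken from the paper).

    Returns 1.0 only if the output is exactly "0" or "1" (after stripping
    whitespace) and matches the gold relevance label; otherwise 0.0.
    """
    text = model_output.strip()
    if text not in {"0", "1"}:
        return 0.0  # the output did not follow the requested format
    return 1.0 if int(text) == gold_label else 0.0


if __name__ == "__main__":
    candidates = [
        "cats sleep a lot",
        "small language models for reranking",
        "reranking with language models",
    ]
    for doc, score in rerank("small language models reranking", candidates, toy_overlap_score):
        print(f"{score:.2f}  {doc}")
```

Keeping the scorer behind a plain callable means the sorting logic stays the same whether the score comes from a word-overlap heuristic, a small language model, or a large one.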