Stop Overthinking: Unlocking Efficient Listwise Reranking with Minimal Reasoning

Danyang Liu; Kan Li

arXiv:2605.14450·cs.IR·May 15, 2026

TL;DR

This paper introduces a Length-Regularized Self-Distillation method that reduces reasoning tokens in listwise reranking with LLMs, maintaining effectiveness while improving efficiency for real-time applications.

Contribution

It proposes a novel framework that synthesizes high-quality, minimal reasoning traces to train models that are both accurate and computationally efficient.

Findings

01. Reduces inference token usage by 34%-37% across benchmarks.

02. Maintains ranking effectiveness comparable to larger models.

03. Addresses overthinking by pruning redundant reasoning in LLM-based rerankers.

Abstract

Listwise reranking utilizing Large Language Models (LLMs) has achieved state-of-the-art retrieval effectiveness. Recently, reasoning-enhanced models have further pushed these boundaries by employing Chain-of-Thought (CoT) to perform deep comparative analysis of candidate documents. However, this performance gain comes at a prohibitive computational cost, as models often generate thousands of reasoning tokens before producing a final ranking. In this work, we investigate the relationship between reasoning length and ranking quality, revealing an overthinking phenomenon where extended reasoning yields diminishing returns. To address this, we propose a Length-Regularized Self-Distillation framework. We synthesize a dataset by sampling diverse reasoning traces from a teacher model (Rank-K) and applying a Pareto-inspired filter to select traces that achieve high ranking performance with…
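The abstract is cut off before it states the filter's exact criteria, so the following is only a minimal sketch of what a Pareto-style selection over sampled reasoning traces could look like. It assumes each trace is scored on two axes, reasoning length in tokens (lower is better) and a ranking metric such as nDCG (higher is better). The `Trace` class, the `pareto_front` function, and all numbers are hypothetical illustrations, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """One sampled reasoning trace (hypothetical structure)."""
    trace_id: str
    num_tokens: int   # length of the reasoning trace; lower is better
    ndcg: float       # ranking quality of the final list; higher is better


def pareto_front(traces):
    """Keep traces that no other trace beats on both length and quality."""
    front = []
    for t in traces:
        dominated = any(
            o.num_tokens <= t.num_tokens
            and o.ndcg >= t.ndcg
            and (o.num_tokens < t.num_tokens or o.ndcg > t.ndcg)
            for o in traces
        )
        if not dominated:
            front.append(t)
    # Shortest traces first, so minimal-reasoning candidates are easy to pick.
    return sorted(front, key=lambda t: t.num_tokens)


# Made-up scores for five traces sampled for the same query.
traces = [
    Trace("a", 2400, 0.71),
    Trace("b", 900, 0.70),
    Trace("c", 1500, 0.66),
    Trace("d", 300, 0.58),
    Trace("e", 900, 0.64),
]

selected = pareto_front(traces)
print([t.trace_id for t in selected])  # ['d', 'b', 'a']
```

In this toy data, trace "b" is nearly as effective as "a" at well under half the length, which is the kind of trade-off a length-regularized selection would favor. A real pipeline would likely add a minimum quality threshold before choosing among the surviving traces.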
