JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking
Tong Niu, Shafiq Joty, Ye Liu, Caiming Xiong, Yingbo Zhou, Semih Yavuz

TL;DR
JudgeRank is a novel agentic reranker leveraging large language models to emulate human reasoning, significantly improving document relevance assessment in reasoning-intensive retrieval tasks and demonstrating strong zero-shot generalization.
Contribution
It introduces JudgeRank, a new agentic reranking approach that mimics human reasoning, enhancing relevance judgment in reasoning-intensive retrieval scenarios.
Findings
Outperforms existing rerankers on the BRIGHT benchmark.
Achieves comparable results to fine-tuned models on BEIR.
Ensembling multiple LLMs further improves reranking accuracy.
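The ensembling finding can be illustrated with a minimal sketch. The summary does not say how the model outputs are combined, so the score averaging below is an assumption made for illustration, and the names `ensemble_scores` and `rank_by_score` are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def ensemble_scores(per_model_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each document's relevance score across models.

    Assumption: every model scored the same set of document ids, and a
    plain mean is an acceptable way to combine them.
    """
    if not per_model_scores:
        raise ValueError("need scores from at least one model")
    doc_ids = per_model_scores[0].keys()
    return {d: mean(scores[d] for scores in per_model_scores) for d in doc_ids}


def rank_by_score(scores: Dict[str, float]) -> List[str]:
    """Return document ids sorted by descending score (ties broken by id)."""
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Averaging is only one option; rank-based fusion or majority voting over binary judgments would fit the same interface.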
Abstract
Accurate document retrieval is crucial for the success of retrieval-augmented generation (RAG) applications, including open-domain question answering and code completion. While large language models (LLMs) have been employed as dense encoders or listwise rerankers in RAG systems, they often struggle with reasoning-intensive tasks because they lack nuanced analysis when judging document relevance. To address this limitation, we introduce JudgeRank, a novel agentic reranker that emulates human cognitive processes when assessing document relevance. Our approach consists of three key steps: (1) query analysis to identify the core problem, (2) document analysis to extract a query-aware summary, and (3) relevance judgment to provide a concise assessment of document relevance. We evaluate JudgeRank on the reasoning-intensive BRIGHT benchmark, demonstrating substantial performance improvements…
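The three steps named in the abstract can be sketched as a small pipeline. This is a hedged illustration, not the authors' implementation: the prompt wording, the `LLM` callable type, the binary Yes/No judgment, and the tie-breaking by first-stage rank are all assumptions introduced here.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def analyze_query(llm: LLM, query: str) -> str:
    """Step 1: identify the core problem behind the query."""
    return llm(f"Identify the core problem being asked.\n\nQuery: {query}")


def analyze_document(llm: LLM, query_analysis: str, document: str) -> str:
    """Step 2: extract a query-aware summary of the document."""
    return llm(
        "Summarize the parts of the document that bear on this problem.\n\n"
        f"Problem: {query_analysis}\n\nDocument: {document}"
    )


def judge_relevance(llm: LLM, query_analysis: str, doc_summary: str) -> bool:
    """Step 3: produce a concise relevance judgment (assumed to be Yes/No)."""
    answer = llm(
        "Decide whether the document is relevant to the problem. "
        "Answer Yes or No.\n\n"
        f"Problem: {query_analysis}\n\nSummary: {doc_summary}"
    )
    return answer.strip().lower().startswith("yes")


def rerank(llm: LLM, query: str, documents: List[str]) -> List[str]:
    """Move documents judged relevant ahead of the rest, keeping the
    first-stage order within each group."""
    query_analysis = analyze_query(llm, query)
    judged = []
    for rank, doc in enumerate(documents):
        summary = analyze_document(llm, query_analysis, doc)
        relevant = judge_relevance(llm, query_analysis, summary)
        judged.append((relevant, rank, doc))
    judged.sort(key=lambda item: (not item[0], item[1]))
    return [doc for _, _, doc in judged]
```

The query analysis is computed once and reused for every candidate, so the cost per query is one call plus two calls per document.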
Taxonomy
Topics: Topic Modeling
Methods: Linear Layer · Softmax · Dropout · Dense Connections · Layer Normalization · Linear Warmup With Linear Decay · WordPiece · Adam · Attention Is All You Need
