Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers
Shijie Chen, Bernal Jiménez Gutiérrez, Yu Su

TL;DR
This paper introduces in-context re-ranking (ICR), an efficient method that uses attention patterns in large language models for zero-shot document re-ranking without relying on autoregressive generation.
Contribution
The paper proposes ICR, a new re-ranking approach that uses attention signals in LLMs, reducing computational cost and enabling broader application without specialized training.
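To make the idea of ranking from attention signals concrete, here is a minimal sketch, not the authors' implementation. It assumes the attention weights from query tokens to context tokens have already been extracted from an LLM and averaged over layers and heads; the function name `rank_by_attention`, the `doc_spans` mapping, and the toy numbers are all hypothetical. The actual method may apply normalization or other steps that this summary does not describe.

```python
from typing import Dict, List, Tuple


def rank_by_attention(
    attention: List[List[float]],
    doc_spans: Dict[str, Tuple[int, int]],
) -> List[Tuple[str, float]]:
    """Rank documents by the attention mass query tokens place on them.

    attention[q][t] is the (hypothetical, pre-averaged) attention weight
    from query token q to context token t.
    doc_spans maps a document id to its [start, end) token range in the context.
    """
    scores: Dict[str, float] = {}
    for doc_id, (start, end) in doc_spans.items():
        # Sum, over all query tokens, the attention falling on this document's tokens.
        scores[doc_id] = sum(sum(row[start:end]) for row in attention)
    # Highest attention mass first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy example: 2 query tokens attending over 4 context tokens (made-up values).
attention = [
    [0.1, 0.1, 0.5, 0.3],
    [0.2, 0.0, 0.4, 0.4],
]
doc_spans = {"doc_a": (0, 2), "doc_b": (2, 4)}
ranking = rank_by_attention(attention, doc_spans)
print(ranking)
```

In this sketch a single set of attention weights scores every candidate at once, whereas a generative re-ranker must decode an ordering token by token.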
Findings
ICR outperforms RankGPT on standard IR benchmarks.
ICR reduces latency by over 60% compared to generative methods.
ICR performs especially well on complex re-ranking tasks.
Abstract
Information retrieval (IR) systems have played a vital role in modern digital life and have cemented their continued usefulness in this new era of generative AI via retrieval-augmented generation. With strong language processing capabilities and remarkable versatility, large language models (LLMs) have become popular choices for zero-shot re-ranking in IR systems. So far, LLM-based re-ranking methods rely on strong generative capabilities, which restricts their use to either specialized or powerful proprietary models. Given these restrictions, we ask: is autoregressive generation necessary and optimal for LLMs to perform re-ranking? We hypothesize that there are abundant signals relevant to re-ranking within LLMs that might not be used to their full potential via generation. To more directly leverage such signals, we propose in-context re-ranking (ICR), a novel method that leverages the…
Peer Reviews
Decision: ICLR 2025 Poster
1. The writing of this paper is well-crafted, allowing readers with relevant backgrounds to quickly follow and provide feedback. 2. The proposed method is versatile and applicable, making it suitable for use with open-source LLMs.
1. The claim of O(1) LLM forward passes raises significant questions. Please refer to my Question 1 for a detailed explanation, as this will determine the overall quality of the paper. 2. A limitation of this method is that ordinary users cannot implement it using advanced commercial large language models, especially when compared to RankGPT. However, I believe this is due to commercial factors rather than technical ones. 3. When only comparing against a single baseline, RankGPT, the classic dat
1. The problem is important, and the overall writing is clear. 2. The proposed method is easy to follow and demonstrates effectiveness on two public LLMs.
My primary concern is the insufficient comparison with other methods and a lack of depth in the experimental analysis. 1) Although the paper compares ICR with RankGPT and highlights improvements, it would be strengthened by comparisons with more methods, especially other zero-shot listwise methods. [1] RankVicuna: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models. [2] A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models. 2)
The motivating question, "is autoregressive generation necessary and optimal for LLMs to perform re-ranking?", is quite interesting and worth discussing. The proposed method is technically sound and is effective on various re-ranking datasets. The paper is well-written and easy to follow. The experiments and discussions are thorough.
The paper lacks high-level intuition and explanation, especially about why the proposed method works. What are the advantages and disadvantages of the proposed method? Why does the proposed method work well even though the model's instruction-following ability is poor?
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Softmax · Attention Is All You Need
