TL;DR
This paper introduces a new measure called Bits-over-Random (BoR) to evaluate retrieval selectivity, revealing that high success rates can mask random-level performance, especially at larger retrieval depths.
Contribution
The paper proposes BoR as a chance-corrected measure of retrieval selectivity and demonstrates its effectiveness in exposing when high success rates are misleading.
Findings
High success rates at large retrieval depths are often no better than what random retrieval would achieve.
BoR remains positive in large-scale datasets, confirming baseline predictions.
Selectivity collapse occurs when retrieval depth exceeds certain thresholds.
Abstract
For most of the history of information retrieval (IR), search results were designed for human consumers who could scan, filter, and discard irrelevant information on their own. This shaped retrieval systems to optimize for finding and ranking more relevant documents rather than for keeping results clean and minimal, since the human was the final filter. LLMs have changed that, because they lack this filtering ability. To address this, we introduce Bits-over-Random (BoR), a chance-corrected measure of retrieval selectivity that reveals when high success rates mask random-level performance. We measure selectivity in bits relative to the hypergeometric baseline for the chosen success rule (here, coverage: a minimum number of relevant documents in the top results). On the 20 Newsgroups dataset, BM25 and SPLADE both report % success at…
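The abstract describes BoR as a chance-corrected comparison against a hypergeometric baseline, but this page does not give the formula. The sketch below is one plausible reading, not the paper's implementation. It assumes BoR is the base-2 logarithm of the observed success rate divided by the probability that a uniformly random selection of the same depth satisfies the same success rule. The function names, parameter names, and example numbers are hypothetical and chosen only for illustration.

```python
from math import comb, log2


def random_success_prob(n_docs, n_relevant, depth, min_hits=1):
    """Probability that a uniformly random set of `depth` documents
    contains at least `min_hits` relevant ones (hypergeometric tail)."""
    total = comb(n_docs, depth)
    # Sum the probabilities of drawing fewer than min_hits relevant documents.
    miss = sum(
        comb(n_relevant, i) * comb(n_docs - n_relevant, depth - i)
        for i in range(min(min_hits, depth + 1))
    )
    return 1 - miss / total


def bits_over_random(observed_success, n_docs, n_relevant, depth, min_hits=1):
    """Assumed form of BoR: log2(observed success rate / random baseline)."""
    baseline = random_success_prob(n_docs, n_relevant, depth, min_hits)
    if baseline == 0:
        raise ValueError("success is impossible under random retrieval")
    if observed_success == 0:
        return float("-inf")
    return log2(observed_success / baseline)


if __name__ == "__main__":
    # Illustrative corpus: 1000 documents, 50 of them relevant to the query.
    for depth in (10, 100, 500):
        base = random_success_prob(1000, 50, depth)
        bor = bits_over_random(0.99, 1000, 50, depth)
        print(f"depth={depth:4d}  random baseline={base:.4f}  BoR={bor:+.3f} bits")
```

Under these assumptions, the same observed success rate is worth fewer bits as the depth grows, because the random baseline itself approaches 1. This matches the selectivity collapse described in the findings.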