Best-Case Retrieval Evaluation: Improving the Sensitivity of Reciprocal Rank with Lexicographic Precision
Fernando Diaz

TL;DR
This paper introduces lexicographic precision (lexiprecision), a preference-based evaluation method that is more sensitive and more robust than reciprocal rank when comparing ranking systems, especially high-precision ones.
Contribution
It generalizes reciprocal rank through best-case retrieval and proposes lexicographic precision, improving discrimination between systems in ranking evaluations.
Findings
Lexiprecision preserves the differences that reciprocal rank detects.
It empirically improves sensitivity across a variety of tasks.
It is more robust when evaluating high-precision systems.
Abstract
Across a variety of ranking tasks, researchers use reciprocal rank to measure the effectiveness for users interested in exactly one relevant item. Despite its widespread use, evidence suggests that reciprocal rank is brittle when discriminating between systems. This brittleness, in turn, is compounded in modern evaluation settings where current, high-precision systems may be difficult to distinguish. We address the lack of sensitivity of reciprocal rank by introducing and connecting it to the concept of best-case retrieval, an evaluation method focusing on assessing the quality of a ranking for the most satisfied possible user across possible recall requirements. This perspective allows us to generalize reciprocal rank and define a new preference-based evaluation we call lexicographic precision or lexiprecision. By mathematical construction, we ensure that lexiprecision preserves…
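To make the idea concrete, here is a minimal Python sketch of reciprocal rank and of a lexicographic comparison between two rankings. It is an illustration under stated assumptions, not the paper's reference implementation. The function names (relevant_positions, reciprocal_rank, lexi_preference) are hypothetical. The tie-breaking rule (compare the rank of the first relevant item, then the second, and so on) is one reading of "lexicographic". The handling of rankings that retrieve different numbers of relevant items is also an assumption.

```python
from typing import List, Sequence


def relevant_positions(relevance: Sequence[int]) -> List[int]:
    """Return the 1-based ranks at which relevant items appear, top to bottom."""
    return [i + 1 for i, rel in enumerate(relevance) if rel > 0]


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """Reciprocal of the rank of the first relevant item (0.0 if none is retrieved)."""
    positions = relevant_positions(relevance)
    return 1.0 / positions[0] if positions else 0.0


def lexi_preference(a: Sequence[int], b: Sequence[int]) -> int:
    """Hypothetical lexicographic preference: +1 if a is preferred, -1 if b, 0 if tied.

    Assumption: the rankings are compared on the rank of their first relevant
    item, then the second, and so on, stopping at the first difference.
    """
    pos_a, pos_b = relevant_positions(a), relevant_positions(b)
    for rank_a, rank_b in zip(pos_a, pos_b):
        if rank_a != rank_b:
            return 1 if rank_a < rank_b else -1
    # Assumption: if all shared positions tie, the ranking that retrieves
    # more relevant items is preferred.
    if len(pos_a) != len(pos_b):
        return 1 if len(pos_a) > len(pos_b) else -1
    return 0


# Two rankings with binary relevance labels, listed from rank 1 downward.
ranking_a = [1, 0, 1, 0, 0]
ranking_b = [1, 0, 0, 0, 1]

# Reciprocal rank cannot separate them: both place a relevant item at rank 1.
print(reciprocal_rank(ranking_a), reciprocal_rank(ranking_b))

# The lexicographic comparison breaks the tie on the second relevant item.
print(lexi_preference(ranking_a, ranking_b))
```

In this sketch, whenever two rankings differ in reciprocal rank, the first comparison already decides the preference, so the ordering agrees with reciprocal rank. The later positions only matter when reciprocal rank ties, which is where the added sensitivity would come from.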
Taxonomy
Topics: Multi-Criteria Decision Making · Information Retrieval and Search Behavior · Expert finding and Q&A systems
