TL;DR
This paper investigates the tradeoff between precision and recall when ranking models by performance, establishing the optimal F-score parameter and providing tools to find the best balance for a specific performance distribution.
Contribution
It introduces a framework for understanding and optimizing the tradeoff between precision and recall using Kendall rank correlations and provides a closed-form solution for the optimal F-score parameter.
Findings
F-score-induced rankings are meaningful and define a shortest path between the precision ranking and the recall ranking.
The commonly used F1 score is often not optimal for tradeoffs between precision and recall.
The paper provides a closed-form expression to compute the optimal beta for any performance distribution.
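To make these findings concrete, the sketch below computes the standard F-beta score, F = (1 + beta^2) * P * R / (beta^2 * P + R), ranks a set of models by it, and compares that ranking with the precision and recall rankings using Kendall's tau. It is not the paper's closed-form solution, which this summary does not reproduce. Instead, it uses a plain grid search, and the balancing criterion (equal correlation with both base rankings) is an assumption made for illustration. The function names and the example (precision, recall) pairs are hypothetical.

```python
from itertools import combinations


def f_beta(precision, recall, beta):
    """Standard weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant pairs - discordant pairs) / all pairs."""
    n = len(xs)
    total = 0
    for i, j in combinations(range(n), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            total += 1
        elif sign < 0:
            total -= 1
    return total / (n * (n - 1) / 2)


def balanced_beta_grid(points, betas):
    """Grid search, NOT the paper's closed form.

    Assumed criterion: choose the beta whose induced ranking is as equally
    correlated as possible with the precision and recall rankings.
    """
    precisions = [p for p, _ in points]
    recalls = [r for _, r in points]
    best_beta, best_gap = None, None
    for beta in betas:
        scores = [f_beta(p, r, beta) for p, r in points]
        gap = abs(kendall_tau(scores, precisions) - kendall_tau(scores, recalls))
        if best_gap is None or gap < best_gap:
            best_beta, best_gap = beta, gap
    return best_beta


# Hypothetical (precision, recall) pairs for five models.
models = [(0.90, 0.40), (0.80, 0.55), (0.70, 0.70), (0.60, 0.85), (0.95, 0.20)]
candidate_betas = [0.25 * k for k in range(1, 17)]  # 0.25, 0.50, ..., 4.00

chosen = balanced_beta_grid(models, candidate_betas)
print("beta chosen by grid search:", chosen)
```

With a small beta the induced ranking tracks precision, and with a large beta it tracks recall, so sweeping beta traces a path between the two base rankings. The grid search simply picks a point along that path under the assumed criterion.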
Abstract
Ranking methods or models based on their performance is of prime importance but is tricky because performance is fundamentally multidimensional. In the case of classification, precision and recall are scores with probabilistic interpretations that are both important to consider and complementary. The rankings induced by these two scores are often in partial contradiction. In practice, therefore, it is extremely useful to establish a compromise between the two views to obtain a single, global ranking. Over the last fifty years or so, it has been proposed to take a weighted harmonic mean, known as the F-score, F-measure, or Fβ. Generally speaking, by averaging basic scores, we obtain a score that is intermediate in terms of values. However, there is no guarantee that these scores lead to meaningful rankings and no guarantee that the rankings are good tradeoffs between these base…
