Unachievable Region in Precision-Recall Space and Its Effect on Empirical Evaluation
Kendrick Boyd (University of Wisconsin Madison), Vitor Santos Costa (University of Porto), Jesse Davis (KU Leuven), David Page (University of Wisconsin Madison)

TL;DR
This paper identifies an unachievable region in precision-recall (PR) space whose size depends solely on class skew, which affects how machine learning models should be evaluated with PR curves.
Contribution
It precisely characterizes the unachievable region in PR space and discusses its implications for empirical evaluation methodology.
Findings
Unachievable region size depends only on class skew
PR curves vary with class skew
Implications for evaluation methodology
Abstract
Precision-recall (PR) curves and the areas under them are widely used to summarize machine learning results, especially for data sets exhibiting class skew. They are often used analogously to ROC curves and the area under ROC curves. It is known that PR curves vary as class skew changes. What was not recognized before this paper is that there is a region of PR space that is completely unachievable, and the size of this region depends only on the skew. This paper precisely characterizes the size of that region and discusses its implications for empirical evaluation methodology in machine learning.
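The abstract does not give the boundary of the unachievable region, so the following is only a minimal sketch derived from the standard definitions of precision and recall, not the authors' code. It assumes skew is the fraction of positive examples, written here as `skew`. With P positives and N negatives, a classifier at recall r has rP true positives and at most N false positives, so its precision is at least rP / (rP + N) = skew * r / (1 - skew + skew * r). Integrating that lower bound over r in [0, 1] gives the area of the unachievable region, 1 + (1 - skew) * ln(1 - skew) / skew. The function names are hypothetical and chosen for illustration.

```python
import math


def min_precision(recall: float, skew: float) -> float:
    """Lower bound on precision at a given recall.

    Assumption: skew = positives / (positives + negatives), strictly in (0, 1).
    The bound is reached when every negative is predicted positive.
    """
    if not 0.0 < skew < 1.0:
        raise ValueError("skew must be strictly between 0 and 1")
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    return skew * recall / (1.0 - skew + skew * recall)


def unachievable_area(skew: float) -> float:
    """Closed-form area under the minimum-precision curve for this skew."""
    if not 0.0 < skew < 1.0:
        raise ValueError("skew must be strictly between 0 and 1")
    return 1.0 + (1.0 - skew) * math.log(1.0 - skew) / skew


def numeric_area(skew: float, steps: int = 10_000) -> float:
    """Midpoint-rule integral of min_precision, used to check the closed form."""
    width = 1.0 / steps
    return width * sum(min_precision((i + 0.5) * width, skew) for i in range(steps))


if __name__ == "__main__":
    for skew in (0.01, 0.1, 0.5):
        print(skew, unachievable_area(skew), numeric_area(skew))
```

Under these assumptions, a balanced data set (skew 0.5) has an unachievable area of 1 - ln 2, roughly 0.31, and the area shrinks toward zero as positives become rarer. This illustrates why areas under PR curves measured at different skews are not directly comparable.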
Taxonomy
Topics: AI-based Problem Solving and Planning · Fault Detection and Control Systems
