cutpointr: Improved Estimation and Validation of Optimal Cutpoints in R
Christian Thiele, Gerrit Hirschfeld

TL;DR
The paper introduces the cutpointr R package, which provides robust, flexible methods for estimating and validating optimal cutpoints in binary classification. It addresses two problems of traditional approaches: highly variable cutpoints and overestimated out-of-sample performance.
Contribution
The package implements bootstrapping, smoothing (kernel estimation, generalized additive models, smoothing splines, and local regression), and support for user-defined metrics and estimation methods, for more reliable cutpoint estimation and validation in binary classification tasks.
Findings
Enhanced stability of cutpoint estimates
Less optimistic estimates of out-of-sample performance
Versatile tools for visualization and analysis
Abstract
'Optimal cutpoints' for binary classification tasks are often established by testing which cutpoint yields the best discrimination, for example as measured by the Youden index, in a specific sample. This results in 'optimal' cutpoints that are highly variable and systematically overestimate the out-of-sample performance. To address these concerns, the cutpointr package offers robust methods for estimating optimal cutpoints and the out-of-sample performance. The robust methods include bootstrapping and smoothing based on kernel estimation, generalized additive models, smoothing splines, and local regression. These methods can be applied to a wide range of binary-classification and cost-based metrics. cutpointr also provides mechanisms to utilize user-defined metrics and estimation methods. The package has capabilities for parallelization of the bootstrapping, including reproducible random number…
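To make the abstract's argument concrete, here is a minimal sketch in plain Python. It is not the cutpointr API (cutpointr is an R package), and every function name below is hypothetical, chosen only for illustration. It assumes higher scores indicate the positive class, picks the cutpoint that maximizes the Youden index (sensitivity + specificity - 1) in a sample, and then uses bootstrap resampling to compare the in-bag Youden index with the index obtained on the out-of-bag observations that were not used to choose the cutpoint.

```python
import random


def youden(scores, labels, cut):
    """Youden index J = sensitivity + specificity - 1.

    Assumption: an observation is predicted positive when score >= cut.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn) + tn / (tn + fp) - 1


def best_cutpoint(scores, labels):
    """Return the observed score that maximizes the Youden index.

    Ties are resolved in favor of the smallest candidate.
    """
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: youden(scores, labels, c))


def bootstrap_validate(scores, labels, runs=200, seed=1):
    """Re-estimate the cutpoint on bootstrap samples.

    Returns the cutpoints, their in-bag Youden indices, and the Youden
    indices of the same cutpoints on the out-of-bag observations.
    """
    rng = random.Random(seed)
    n = len(scores)
    cuts, in_bag, out_of_bag = [], [], []
    for _ in range(runs):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        xb, yb = [scores[i] for i in idx], [labels[i] for i in idx]
        xo, yo = [scores[i] for i in oob], [labels[i] for i in oob]
        # Skip replicates in which either part lacks one of the classes.
        if len(set(yb)) < 2 or len(set(yo)) < 2:
            continue
        cut = best_cutpoint(xb, yb)
        cuts.append(cut)
        in_bag.append(youden(xb, yb, cut))
        out_of_bag.append(youden(xo, yo, cut))
    return cuts, in_bag, out_of_bag


def simulate(n_per_class=100, seed=42):
    """Simulated data: negatives ~ N(0, 1), positives ~ N(1, 1)."""
    rng = random.Random(seed)
    scores = [rng.gauss(0, 1) for _ in range(n_per_class)]
    scores += [rng.gauss(1, 1) for _ in range(n_per_class)]
    labels = [0] * n_per_class + [1] * n_per_class
    return scores, labels
```

Calling `bootstrap_validate(*simulate())` and comparing the spread of the returned cutpoints with the gap between the mean in-bag and mean out-of-bag Youden index illustrates the two problems named in the abstract: the cutpoint moves from resample to resample, and the in-sample metric tends to look better than the out-of-sample one. The smoothing-based estimators mentioned in the abstract are not shown here.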
Taxonomy
Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Gaussian Processes and Bayesian Inference
