Correspondence: The use of cost information when defining critical values for prediction of rare events using logistic regression and similar methods
Paul T Seed

TL;DR
This paper discusses how to incorporate cost information into logistic regression to determine optimal cutoff points for predicting rare events, such as diseases, by balancing false positive and false negative costs.
Contribution
It introduces a method to use cost information directly in logistic regression to define critical probability thresholds for rare event prediction.
Findings
Cost-based cutoff points balance the expected cost of a false negative against that of a false positive when predicting rare events.
Standard logistic regression outputs can be adapted using cost considerations.
The method aids in decision-making where misclassification costs vary.
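As an illustration of how a predicted log-odds value might be turned into a cost-based decision, here is a minimal Python sketch. It assumes the cutpoint is the probability at which cost x probability is equal for a false negative and a false positive, as the abstract describes. The function names and the example costs (a false negative taken to be 19 times as costly as a false positive) are hypothetical and are not taken from the paper.

```python
import math


def cost_cutpoint(cost_fp, cost_fn):
    """Probability p at which cost_fn * p equals cost_fp * (1 - p)."""
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def predicted_probability(log_odds):
    """Convert a log-odds value (e.g. a logistic regression linear predictor) to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def is_test_positive(log_odds, cost_fp, cost_fn):
    """A subject tests positive when the predicted probability exceeds the cutpoint."""
    return predicted_probability(log_odds) > cost_cutpoint(cost_fp, cost_fn)


# Hypothetical costs: a missed case is assumed 19 times as costly as a false alarm.
cutpoint = cost_cutpoint(cost_fp=1.0, cost_fn=19.0)

# Hypothetical subjects, given as log-odds corresponding to 10% and 2% risk.
high_risk = is_test_positive(math.log(0.10 / 0.90), cost_fp=1.0, cost_fn=19.0)
low_risk = is_test_positive(math.log(0.02 / 0.98), cost_fp=1.0, cost_fn=19.0)
print(cutpoint, high_risk, low_risk)
```

With these assumed costs the cutpoint falls well below 0.5, which is the typical situation for a rare but serious outcome: a modest predicted probability is already enough to justify a positive result.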
Abstract
Balancing a rare and serious possibility against a more common and less serious one is a familiar problem in many situations, such as the prediction of rare diseases. The relative costs of forecasting errors can be used for any prediction method that gives an estimated probability of a future event. The probability at which the likely cost (defined as cost × probability) of a possible false negative is exactly equal to that of a possible false positive gives the relevant cutpoint, and all subjects with probability of disease greater than this have a positive test result. All standard methods of logistic regression will give the log-odds and hence the predicted probability of a positive outcome for every subject:
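The equal-cost condition described in the abstract can be written out as a short worked derivation. The symbols below are notation assumed here for illustration, not taken from the paper: C_FN and C_FP are the costs of a false negative and a false positive, p is a subject's predicted probability of disease, and p* is the cutpoint. A subject classed as negative is a false negative with probability p, and a subject classed as positive is a false positive with probability 1 - p, so setting the two likely costs equal gives:

```latex
\[
C_{FN}\, p^{*} = C_{FP}\,(1 - p^{*})
\quad\Longrightarrow\quad
p^{*} = \frac{C_{FP}}{C_{FP} + C_{FN}},
\qquad
\operatorname{logit}(p^{*}) = \ln\frac{p^{*}}{1 - p^{*}} = \ln\frac{C_{FP}}{C_{FN}} .
\]
```

Under this reading, the cutpoint on the log-odds scale depends only on the ratio of the two costs. For example, if a false negative is assumed to be 19 times as costly as a false positive, then p* = 1/20 = 0.05.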
Taxonomy
Topics: Genetic Associations and Epidemiology
