Tuning in ridge logistic regression to solve separation
Hana Šinkovec, Angelika Geroldinger, Georg Heinze, Rok Blagus

TL;DR
This paper explores the use of ridge regression with a new bootstrap-based tuning method to effectively address separation in logistic regression, outperforming Firth's correction in small, sparse, and correlated data scenarios.
Contribution
It introduces a bootstrap-based tuning criterion for ridge regression that ensures shrinkage and valid inference in separation problems, offering an alternative to Firth's correction.
Findings
Bootstrap-tuned (B-tuned) ridge regression reduces the MSE of coefficient estimates compared with Firth's correction in small datasets.
The new method provides confidence intervals with near-nominal coverage.
Performance is demonstrated through oncology data and simulation studies.
Abstract
Separation in logistic regression is a common problem that causes the iterative estimation process to fail when finding maximum likelihood estimates. Firth's correction (FC) was proposed as a solution, providing estimates even in the presence of separation. In this paper we evaluate whether ridge regression (RR) could be considered instead, specifically, whether it could reduce the mean squared error (MSE) of coefficient estimates in comparison to FC. In RR the tuning parameter determining the penalty strength is usually obtained by minimizing some measure of the out-of-sample prediction error or an information criterion. However, in the presence of separation, tuning by these measures can yield an optimized value of zero (no shrinkage), and hence cannot provide a universal solution. We derive a new bootstrap-based tuning criterion that always leads to shrinkage. Moreover, we demonstrate how valid…
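To make the separation problem concrete, the following is a minimal sketch, not the paper's method: it uses a hypothetical six-point toy dataset that is perfectly separated at x = 0, and two arbitrarily fixed penalty values rather than the bootstrap-based tuning criterion the abstract describes. The function names `loglik` and `ridge_logistic` are illustrative. The sketch shows that the unpenalized log-likelihood keeps increasing as the slope grows (so no finite maximum likelihood estimate exists), while any positive ridge penalty yields a finite slope.

```python
import numpy as np

def loglik(X, y, beta):
    """Unpenalized logistic log-likelihood, computed stably."""
    eta = X @ beta
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def ridge_logistic(X, y, lam, n_iter=50):
    """Newton-Raphson for logistic regression with an L2 penalty.

    Maximizes loglik(beta) - (lam / 2) * sum(beta[1:] ** 2).
    Column 0 of X is the intercept and is left unpenalized.
    """
    p = X.shape[1]
    beta = np.zeros(p)
    penalty = lam * np.eye(p)
    penalty[0, 0] = 0.0  # do not shrink the intercept
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = mu * (1.0 - mu)                      # IRLS weights
        grad = X.T @ (y - mu) - penalty @ beta   # penalized score
        hess = X.T @ (X * w[:, None]) + penalty  # penalized information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical toy data: y = 1 exactly when x > 0 (complete separation).
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])

# Without a penalty, the log-likelihood approaches 0 as the slope grows.
ll_by_slope = [loglik(X, y, np.array([0.0, s])) for s in (1.0, 10.0, 100.0)]

# With a penalty (values chosen arbitrarily here), the estimate is finite.
beta_weak = ridge_logistic(X, y, lam=0.1)
beta_strong = ridge_logistic(X, y, lam=1.0)
print("log-likelihood at slopes 1, 10, 100:", ll_by_slope)
print("ridge slope, lam=0.1:", beta_weak[1])
print("ridge slope, lam=1.0:", beta_strong[1])
```

A stronger penalty gives a smaller slope. How to choose the penalty is exactly the tuning question the paper addresses, and this sketch does not attempt to answer it.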
Taxonomy
Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Statistical and numerical algorithms
