On the variability of regression shrinkage methods for clinical prediction models: simulation study on predictive performance
Ben Van Calster, Maarten van Smeden, Ewout W. Steyerberg

TL;DR
This simulation study evaluates how various regression shrinkage methods affect predictive performance and calibration variability in clinical risk models, highlighting that shrinkage often improves average calibration but can increase variability across samples.
Contribution
It provides a comprehensive comparison of multiple shrinkage methods, revealing their effects on calibration slope variability and their limitations in small sample scenarios.
Findings
Shrinkage improves average calibration slopes.
Shrinkage methods often increase variability across samples.
Bootstrap-based uniform shrinkage performs well overall.
Abstract
When developing risk prediction models, shrinkage methods are recommended, especially when the sample size is limited. Several earlier studies have shown that the shrinkage of model coefficients can reduce overfitting of the prediction model and subsequently result in better predictive performance on average. In this simulation study, we aimed to investigate the variability of regression shrinkage on predictive performance for a binary outcome, with focus on the calibration slope. The slope indicates whether risk predictions are too extreme (slope < 1) or not extreme enough (slope > 1). We investigated the following shrinkage methods in comparison to standard maximum likelihood estimation: uniform shrinkage (likelihood-based and bootstrap-based), ridge regression, penalized maximum likelihood, LASSO regression, adaptive LASSO, non-negative garrote, and Firth's correction. There were…
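The calibration slope mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definition: the slope coefficient from a logistic regression of the observed binary outcome on the logit of the predicted risk. The function name `calibration_slope`, the Newton-Raphson fitting loop, and the simulated data are illustrative assumptions, not code or settings from the study.

```python
import numpy as np


def calibration_slope(y, p, n_iter=25, tol=1e-10):
    """Slope from a logistic regression of outcome y on logit(p).

    Hypothetical helper for illustration; fitted by Newton-Raphson.
    """
    y = np.asarray(y, dtype=float)
    # Clip predicted risks so the logit stays finite.
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    lp = np.log(p / (1 - p))                     # linear predictor (logit of risk)
    X = np.column_stack([np.ones_like(lp), lp])  # intercept + slope design matrix
    beta = np.zeros(2)
    for _ in range(n_iter):
        mu = 1 / (1 + np.exp(-(X @ beta)))       # current fitted probabilities
        w = mu * (1 - mu)                        # logistic variance weights
        grad = X.T @ (y - mu)                    # score vector
        hess = X.T @ (X * w[:, None])            # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta[1]


# Simulated validation data (assumed setup, not taken from the paper).
rng = np.random.default_rng(0)
true_lp = rng.normal(size=20000)
y = rng.binomial(1, 1 / (1 + np.exp(-true_lp)))

expit = lambda z: 1 / (1 + np.exp(-z))
slope_true = calibration_slope(y, expit(true_lp))           # correctly calibrated risks
slope_extreme = calibration_slope(y, expit(2.0 * true_lp))  # overfitted, too extreme
slope_modest = calibration_slope(y, expit(0.5 * true_lp))   # over-shrunk, too modest
```

In this sketch, predictions that are too extreme yield a slope below 1 and predictions that are not extreme enough yield a slope above 1, matching the interpretation given in the abstract. Repeating the calculation over many simulated development samples would give a distribution of slopes, and the spread of that distribution is the variability the study focuses on.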
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning in Healthcare · Statistical Methods in Epidemiology
