Comprehensive Stepwise Selection for Logistic Regression
Bernd Engelmann

TL;DR
This paper introduces a robust, multi-criteria stepwise selection algorithm for logistic regression that improves variable selection stability and accuracy over traditional methods, demonstrated through simulation.
Contribution
It proposes a comprehensive, multi-criteria stepwise algorithm for logistic regression that enhances model stability and selection accuracy compared to existing methods.
Findings
Simulation shows the method outperforms alternatives
Multiple criteria improve robustness
Selected models may be statistically equivalent
Abstract
Automated variable selection is widely applied in statistical model development. Algorithms like forward, backward or stepwise selection are available in statistical software packages like R and SAS. Many researchers have criticized the use of these algorithms because the models resulting from automated selection algorithms are not based on theory and tend to be unstable. Furthermore, simulation studies have shown that they often select incorrect variables due to random effects, which makes these model building strategies unreliable. In this article, a comprehensive stepwise selection algorithm tailored to logistic regression is proposed. It uses multiple criteria in variable selection instead of relying on one single measure only, like a p-value or Akaike's information criterion, which ensures robustness and soundness of the final outcome. The result of the selection process might not…
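To make the idea of selecting on more than one criterion concrete, here is a minimal Python sketch of forward selection for logistic regression. It is not the algorithm proposed in the paper, whose criteria are not spelled out in the abstract. As an illustrative assumption, a candidate is admitted only if it passes both a likelihood-ratio test (p-value below `alpha`) and a collinearity guard (absolute correlation with every already selected variable at most `max_corr`). The function names, both thresholds, and the simulated data are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def prob(beta, row):
    """Logistic probability for one row; the linear predictor is clamped to avoid overflow."""
    z = max(-30.0, min(30.0, sum(b * v for b, v in zip(beta, row))))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson fit; rows of X already contain the intercept. Returns (beta, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, t in zip(X, y):
            p = prob(beta, row)
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (t - p) * row[i]
                for j in range(k):
                    hess[i][j] += w * row[i] * row[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    loglik = 0.0
    for row, t in zip(X, y):
        p = min(max(prob(beta, row), 1e-12), 1.0 - 1e-12)
        loglik += math.log(p) if t == 1 else math.log(1.0 - p)
    return beta, loglik

def corr(u, v):
    """Pearson correlation of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))

def forward_select(columns, y, alpha=0.05, max_corr=0.8):
    """Forward selection; a candidate must pass BOTH the LR test and the correlation guard."""
    n = len(y)
    design = lambda idx: [[1.0] + [columns[j][i] for j in idx] for i in range(n)]
    selected = []
    _, ll_current = fit_logistic(design(selected), y)
    while True:
        best = None
        for j in range(len(columns)):
            if j in selected:
                continue
            # Criterion 1 (assumed): skip candidates too correlated with selected variables.
            if any(abs(corr(columns[j], columns[s])) > max_corr for s in selected):
                continue
            _, ll_new = fit_logistic(design(selected + [j]), y)
            lr = 2.0 * (ll_new - ll_current)
            # Criterion 2: chi-square(1 df) tail probability of the LR statistic.
            p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
            if p_value < alpha and (best is None or lr > best[1]):
                best = (j, lr, ll_new)
        if best is None:
            return selected
        selected.append(best[0])
        ll_current = best[2]

# Simulated data (hypothetical): x1 is a near-copy of x0, x2 is pure noise.
random.seed(0)
n = 400
x0 = [random.gauss(0, 1) for _ in range(n)]
x1 = [a + 0.1 * random.gauss(0, 1) for a in x0]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [1 if random.random() < 1.0 / (1.0 + math.exp(-(2.0 * a + 1.5 * d))) else 0
     for a, d in zip(x0, x3)]
print(forward_select([x0, x1, x2, x3], y))
```

In this setup the correlation guard keeps `x0` and its near-duplicate `x1` from entering the model together, which a p-value threshold alone would not guarantee. Which of the two is picked depends on the sample, which echoes the instability the abstract describes.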
Taxonomy
Topics: Advanced Statistical Methods and Models
