Split Regression Modeling
Anthony Christidis, Stefan Van Aelst, Ruben Zamar

TL;DR
This paper introduces best split selection, an ensemble method that builds a small number of sparse models by splitting the predictor variables among them, aiming to combine interpretability with high prediction accuracy. A relaxation of the method keeps the computation tractable.
Contribution
The paper generalizes best subset selection to a new ensemble method, best split selection, which aims to deliver both interpretability and accuracy in predictive modeling.
Findings
An ensemble of sparse, diverse models enhances interpretability.
The method achieves high prediction accuracy, comparable to that of blackbox methods.
A computationally tractable relaxation approximates the original method.
Abstract
Sparse methods are the standard approach to obtain interpretable models with high prediction accuracy. Alternatively, algorithmic ensemble methods can achieve higher prediction accuracy at the cost of loss of interpretability. However, the use of blackbox methods has been heavily criticized for high-stakes decisions and it has been argued that there does not have to be a trade-off between accuracy and interpretability. To combine high accuracy with interpretability, we generalize best subset selection to best split selection. Best split selection constructs a small number of sparse models learned jointly from the data which are then combined in an ensemble. Best split selection determines the models by splitting the available predictor variables among the different models when fitting the data. The proposed methodology results in an ensemble of sparse and diverse models that each…
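The splitting idea described in the abstract can be illustrated with a minimal brute-force sketch. It rests on assumptions that the text above does not confirm: that the joint criterion is the sum of the individual models' residual sums of squares, that each variable is used by at most one model, and that the ensemble averages the models' coefficient vectors. The function name `best_split_brute_force` and the parameters `n_models` and `max_size` are hypothetical, and exhaustive search is only feasible for a handful of predictors; it is not the relaxation the paper proposes.

```python
import itertools
import numpy as np

def best_split_brute_force(X, y, n_models=2, max_size=2):
    """Exhaustively search splits of the columns of X among n_models sparse models.

    Assumed criterion (not confirmed by the source): the sum of each model's
    least-squares residual sum of squares, with every variable used by at
    most one model and at most max_size variables per model.
    """
    n, p = X.shape
    best_loss, best_groups = np.inf, None
    # Label 0 means "variable unused"; labels 1..n_models name the owning model.
    for assign in itertools.product(range(n_models + 1), repeat=p):
        groups = [[j for j in range(p) if assign[j] == m]
                  for m in range(1, n_models + 1)]
        # Skip splits with an empty model or a model that is too large.
        if any(len(g) == 0 or len(g) > max_size for g in groups):
            continue
        loss = 0.0
        for g in groups:
            beta, *_ = np.linalg.lstsq(X[:, g], y, rcond=None)
            loss += np.sum((y - X[:, g] @ beta) ** 2)
        if loss < best_loss:
            best_loss, best_groups = loss, groups
    # Refit each model on its own variables and store coefficients in full length p.
    coefs = np.zeros((n_models, p))
    for m, g in enumerate(best_groups):
        coefs[m, g] = np.linalg.lstsq(X[:, g], y, rcond=None)[0]
    # Assumed ensemble rule: average the coefficient vectors of the models.
    return best_groups, coefs, coefs.mean(axis=0)

# Small synthetic example (no intercept, for brevity).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
y = X[:, 0] + X[:, 3] + 0.1 * rng.standard_normal(50)
groups, coefs, ensemble = best_split_brute_force(X, y, n_models=2, max_size=2)
print("variables per model:", groups)
print("ensemble coefficients:", np.round(ensemble, 3))
```

Each row of `coefs` is one sparse model that can be read on its own, while `ensemble` is the combined predictor. The search visits `(n_models + 1) ** p` assignments, which illustrates why an exact search becomes intractable as the number of predictors grows.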
Taxonomy
Topics: Statistical Methods and Inference · Genetic and phenotypic traits in livestock · Advanced Statistical Methods and Models
