Explainable Boosting Machines with Sparsity -- Maintaining Explainability in High-Dimensional Settings
Brandon M. Greenwell, Annika Dahlmann, Saurabh Dhoble

TL;DR
This paper introduces a LASSO-based post-processing method for explainable boosting machines (EBMs) to maintain transparency and efficiency in high-dimensional data by inducing sparsity and reducing complexity.
Contribution
The authors propose a simple LASSO-based approach to sparsify EBMs, preserving explainability and improving scoring speed in high-dimensional settings.
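One way to write down the reweighting idea, as a sketch rather than the paper's own notation: treat each fitted EBM term (a main effect or a pairwise interaction) as a fixed function f_k, and let the LASSO choose one weight per term. The squared-error loss, the symbols K, \lambda, and \tilde{\beta}, and the 1/(2n) scaling are assumptions made here for illustration; a classification model would use a different loss.

```latex
% Fitted EBM: an intercept plus K additive terms, each with implicit weight 1.
\[
  \hat{f}(x) = \beta_0 + \sum_{k=1}^{K} f_k(x)
\]

% LASSO post-processing (assumed squared-error form): reweight the fixed terms.
\[
  (\tilde{\beta}_0, \tilde{\beta})
  = \arg\min_{\beta_0,\, \beta}\;
    \frac{1}{2n} \sum_{i=1}^{n}
      \Bigl( y_i - \beta_0 - \sum_{k=1}^{K} \beta_k f_k(x_i) \Bigr)^{2}
    + \lambda \sum_{k=1}^{K} \lvert \beta_k \rvert
\]

% Sparsified model: terms with \tilde{\beta}_k = 0 drop out entirely.
\[
  \hat{f}_{\text{sparse}}(x)
  = \tilde{\beta}_0 + \sum_{k \,:\, \tilde{\beta}_k \neq 0} \tilde{\beta}_k f_k(x)
\]
```

Larger values of \lambda set more weights to exactly zero, so fewer terms need to be evaluated at scoring time and fewer need to be inspected when explaining a prediction.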
Findings
LASSO post-processing reduces the number of model terms significantly.
Sparsified EBMs maintain comparable accuracy to original models.
Scoring time improves substantially, and the smaller set of terms keeps the model interpretable.
Abstract
Compared to "black-box" models, like random forests and deep neural networks, explainable boosting machines (EBMs) are considered "glass-box" models that can be competitively accurate while also maintaining a higher degree of transparency and explainability. However, EBMs become readily less transparent and harder to interpret in high-dimensional settings with many predictor variables; they also become more difficult to use in production due to increases in scoring time. We propose a simple solution based on the least absolute shrinkage and selection operator (LASSO) that can help introduce sparsity by reweighting the individual model terms and removing the less relevant ones, thereby allowing these models to maintain their transparency and relatively fast scoring times in higher-dimensional settings. In short, post-processing a fitted EBM with many (i.e., possibly hundreds or…
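The reweight-and-remove step described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names `soft_threshold` and `lasso_reweight`, the squared-error loss, the coordinate-descent solver, and the toy term scores are all assumptions chosen for illustration. In practice the per-term scores would come from a fitted EBM and the penalty would be tuned on held-out data.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_reweight(term_scores, y, lam, n_iter=100):
    """Fit one LASSO weight per model term by cyclic coordinate descent.

    term_scores[i][k] is the (hypothetical) contribution of term k to row i.
    Minimizes (1/2n) * sum((y - b0 - scores @ beta)^2) + lam * sum(|beta|).
    Returns (intercept, beta); terms with beta[k] == 0.0 can be dropped.
    """
    n, p = len(y), len(term_scores[0])
    y_mean = sum(y) / n
    col_means = [sum(row[k] for row in term_scores) / n for k in range(p)]
    # Center the columns so the intercept can be recovered after fitting.
    z = [[row[k] - col_means[k] for k in range(p)] for row in term_scores]
    col_sq = [sum(z[i][k] ** 2 for i in range(n)) / n for k in range(p)]
    beta = [0.0] * p
    resid = [yi - y_mean for yi in y]  # residual of the current fit
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue  # a constant term carries no information
            # Covariance of term k with the residual that excludes term k.
            rho = sum(z[i][k] * resid[i] for i in range(n)) / n
            rho += col_sq[k] * beta[k]
            new_beta = soft_threshold(rho, lam) / col_sq[k]
            delta = new_beta - beta[k]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * z[i][k]
                beta[k] = new_beta
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_means))
    return intercept, beta


# Toy data (made up): two strong terms and one weak term, 8 rows.
t0 = [2, 2, 2, 2, -2, -2, -2, -2]
t1 = [1, 1, -1, -1, 1, 1, -1, -1]
t2 = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
term_scores = [list(row) for row in zip(t0, t1, t2)]
y = [0.5 + a + b + c for a, b, c in zip(t0, t1, t2)]

intercept, beta = lasso_reweight(term_scores, y, lam=0.05)
kept = [k for k, b in enumerate(beta) if b != 0.0]
print("intercept:", round(intercept, 4))
print("weights:", [round(b, 4) for b in beta])
print("kept terms:", kept)
```

On this toy input the two strong terms keep weights slightly below 1 (about 0.9875 and 0.95), while the weak third term receives a weight of exactly zero and is removed. Only the kept terms would need to be evaluated when scoring new rows.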
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Neural Networks and Applications
Methods: explainable boosting machine (EBM) · LASSO
