Improved prediction rule ensembling through model-based data generation
Benny Markovitch, Marjolein Fokkema

TL;DR
This paper introduces a novel approach to enhance prediction rule ensembles by using model-based data generation with surrogate models, improving sparsity and stability without sacrificing accuracy.
Contribution
It proposes two surrogate model methods to generate data for Lasso training, leading to more stable and sparse prediction rule ensembles compared to traditional methods.
Findings
Surrogate models improve the sparsity of PREs.
The nested surrogacy approach enhances stability and accuracy.
The method performs well on both simulated and real datasets.
Abstract
Prediction rule ensembles (PRE) provide interpretable prediction models with relatively high accuracy. PRE obtain a large set of decision rules from a (boosted) decision tree ensemble, and achieve sparsity through application of Lasso-penalized regression. This article examines the use of surrogate models to improve performance of PRE, wherein the Lasso regression is trained with the help of a massive dataset generated by the (boosted) decision tree ensemble. This use of model-based data generation may improve the stability and consistency of the Lasso step, thus leading to improved overall performance. We propose two surrogacy approaches, and evaluate them on simulated and existing datasets, in terms of sparsity and predictive accuracy. The results indicate that the use of surrogacy models can substantially improve the sparsity of PRE, while retaining predictive accuracy, especially through…
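The general idea of training the Lasso step on model-generated data can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce either of the two proposed surrogacy approaches. Everything named here is an assumption made for illustration: `teacher_predict` is a hand-written stand-in for a fitted (boosted) tree ensemble, `RULES` is a hypothetical candidate rule set, inputs are drawn uniformly on [0, 1], and the Lasso is fitted with a plain coordinate-descent routine.

```python
import random

# Hypothetical stand-in for a fitted (boosted) tree ensemble: two stumps.
def teacher_predict(x):
    return 2.0 * (x[0] > 0.5) - 1.5 * (x[1] > 0.3)

# Hypothetical candidate rules of the form "x[feature] > threshold".
RULES = [(0, 0.5), (1, 0.3), (2, 0.5), (0, 0.8), (1, 0.7), (2, 0.2)]

def rule_features(x):
    """Encode one input as 0/1 indicators, one per candidate rule."""
    return [1.0 if x[j] > t else 0.0 for j, t in RULES]

def generate_surrogate_data(n, n_inputs=3, seed=0):
    """Sample inputs (assumed uniform) and label them with the teacher."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_inputs)] for _ in range(n)]
    y = [teacher_predict(x) for x in X]
    return X, y

def soft_threshold(v, lam):
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

def lasso_cd(Z, y, lam, sweeps=50):
    """Coordinate descent for (1/2n)*||y - Z b||^2 + lam*||b||_1 on centered data."""
    n, p = len(Z), len(Z[0])
    means = [sum(row[j] for row in Z) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in Z] for j in range(p)]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]  # residual with all coefficients at zero
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            col = cols[j]
            sq = sum(c * c for c in col) / n
            if sq == 0.0:
                continue  # constant rule carries no information
            # Covariance of rule j with the partial residual (rule j added back).
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / sq
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta

# Generate a large teacher-labelled dataset, then fit the sparse rule model on it.
X, y = generate_surrogate_data(2000)
Z = [rule_features(x) for x in X]
beta = lasso_cd(Z, y, lam=0.05)
selected = [RULES[j] for j, b in enumerate(beta) if b != 0.0]
print(selected)
```

Because the labels come from the teacher rather than from noisy observations, the generated sample can be made as large as desired, which is the property the abstract credits with potentially stabilizing the Lasso step. The penalty `lam=0.05` and the sample size are arbitrary choices for this sketch.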
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Forecasting Techniques and Applications · Machine Learning and Data Classification
