RCProb: Probabilistic Rule Extraction for Efficient Simplification of Tree Ensembles
Josue Obregon

TL;DR
RCProb is a probabilistic rule extraction method for tree ensembles that cuts extraction runtime by roughly 22 times relative to RuleCOSI+ while keeping predictive performance competitive and producing more compact, interpretable rule sets on average.
Contribution
The paper introduces a probabilistic reformulation of RuleCOSI+ that avoids repeated scans of the training data, making rule extraction faster without sacrificing predictive performance.
Findings
RCProb reduces runtime by a factor of approximately 22 compared with RuleCOSI+
RCProb maintains competitive predictive performance on benchmark datasets
RCProb produces more compact rule sets on average
Abstract
Tree ensembles are widely used in industrial machine learning due to their strong predictive performance and efficient training procedures. However, as the number of trees in an ensemble grows, the resulting models become increasingly difficult for humans to interpret. To address this limitation, explainable artificial intelligence (XAI) studies methods that generate interpretable models capable of explaining complex predictors. One approach consists of extracting decision rules from tree ensembles while attempting to preserve the predictive performance of the original model. In previous work, we introduced RuleCOSI+, a greedy heuristic algorithm for extracting compact rule-based models from tree ensembles. Although RuleCOSI+ produces accurate and interpretable rule sets, it relies on repeated empirical frequency counting over the training data to estimate rule confidence, which becomes…
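To make the cost mentioned in the abstract concrete, the following is a minimal sketch of what estimating rule confidence by empirical frequency counting can look like. It is not the RuleCOSI+ or RCProb implementation: the rule representation (a list of `(feature_index, operator, threshold)` conditions), the function names `rule_mask` and `empirical_confidence`, and the toy data are all assumptions made for illustration. The point is only that each candidate rule triggers a full pass over the training rows, which is the step a probabilistic reformulation would aim to avoid; the summary above does not give the details of that reformulation, so it is not sketched here.

```python
import numpy as np


def rule_mask(X, conditions):
    """Return a boolean mask of the rows in X that satisfy every condition.

    `conditions` is a hypothetical rule encoding: a list of
    (feature_index, operator, threshold) tuples joined by logical AND.
    """
    mask = np.ones(X.shape[0], dtype=bool)
    for feature, op, threshold in conditions:
        if op == "<=":
            mask &= X[:, feature] <= threshold
        elif op == ">":
            mask &= X[:, feature] > threshold
        else:
            raise ValueError(f"unsupported operator: {op!r}")
    return mask


def empirical_confidence(X, y, conditions, predicted_class):
    """Estimate rule confidence by counting over the training data.

    Confidence is the fraction of covered rows whose label equals the
    rule's predicted class. Every call scans all rows of X once.
    """
    mask = rule_mask(X, conditions)
    covered = int(mask.sum())
    if covered == 0:
        return 0.0, 0  # the rule covers nothing, so confidence is set to 0
    correct = int((y[mask] == predicted_class).sum())
    return correct / covered, covered


# Toy training data (illustrative only): four rows, two features.
X = np.array([[1.0, 5.0], [2.0, 3.0], [3.0, 1.0], [4.0, 4.0]])
y = np.array([0, 0, 1, 1])

# Hypothetical rule: IF x0 > 1.5 AND x1 <= 4.0 THEN class 1.
rule = [(0, ">", 1.5), (1, "<=", 4.0)]
confidence, coverage = empirical_confidence(X, y, rule, predicted_class=1)
print(confidence, coverage)
```

In this toy case the rule covers three rows, two of which carry the predicted class, so the estimated confidence is 2/3. A greedy procedure that evaluates many candidate rules this way repeats the scan for each one, so the total cost grows with both the number of candidates and the number of training rows.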
