Gaining Free or Low-Cost Transparency with Interpretable Partial Substitute
Tong Wang

TL;DR
This paper introduces a hybrid rule set model that provides interpretable approximations of black-box models on specific data subsets, achieving transparency with minimal or no loss in predictive accuracy.
Contribution
The paper proposes a novel Hybrid Rule Sets (HyRS) framework that efficiently finds interpretable substitutes for black-box models on data subspaces, enhancing transparency without sacrificing performance.
Findings
HyRS gains transparency on part of the data at minimal or no cost to predictive accuracy.
The search algorithm iteratively looks for the optimal rule-based model and uses theoretically grounded strategies to reduce computation.
Experiments demonstrate effectiveness on structured and text data.
Abstract
This work addresses the situation where a black-box model with good predictive performance is chosen over its interpretable competitors, and we show interpretability is still achievable in this case. Our solution is to find an interpretable substitute on a subset of data where the black-box model is overkill or nearly overkill while leaving the rest to the black-box. This transparency is obtained at minimal cost or no cost of the predictive performance. Under this framework, we develop a Hybrid Rule Sets (HyRS) model that uses decision rules to capture the subspace of data where the rules are as accurate or almost as accurate as the black-box provided. To train a HyRS, we devise an efficient search algorithm that iteratively finds the optimal model and exploits theoretically grounded strategies to reduce computation. Our framework is agnostic to the black-box during training.…
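The hybrid prediction scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the rule representation, the split into positive and negative rule sets, the feature names, and the helpers `rule_covers`, `hybrid_predict`, and `transparency` are all assumptions made for the example. Rules are assumed to be already learned; the search algorithm itself is not shown.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representations, chosen only for this sketch.
Instance = Dict[str, float]
Condition = Tuple[str, str, float]   # (feature, operator, threshold)
Rule = List[Condition]               # a rule is a conjunction of conditions


def rule_covers(rule: Rule, x: Instance) -> bool:
    """Return True if x satisfies every condition in the rule."""
    for feature, op, threshold in rule:
        value = x[feature]
        if op == "<=":
            if not value <= threshold:
                return False
        elif op == ">":
            if not value > threshold:
                return False
        else:
            raise ValueError(f"unsupported operator: {op}")
    return True


def hybrid_predict(
    x: Instance,
    positive_rules: List[Rule],
    negative_rules: List[Rule],
    black_box: Callable[[Instance], int],
) -> Tuple[int, str]:
    """Predict with the rules when they apply, otherwise defer to the black box.

    Returns the label and which component produced it.
    """
    if any(rule_covers(r, x) for r in positive_rules):
        return 1, "rules"
    if any(rule_covers(r, x) for r in negative_rules):
        return 0, "rules"
    # The black box is only called, never inspected.
    return black_box(x), "black_box"


def transparency(
    X: List[Instance], positive_rules: List[Rule], negative_rules: List[Rule]
) -> float:
    """Fraction of instances whose prediction comes from the interpretable rules."""
    all_rules = positive_rules + negative_rules
    covered = sum(1 for x in X if any(rule_covers(r, x) for r in all_rules))
    return covered / len(X)


# Toy usage with made-up rules and a stand-in black box.
positive_rules = [[("age", ">", 60.0)]]
negative_rules = [[("age", "<=", 30.0), ("income", ">", 50.0)]]
black_box = lambda x: 1 if x["income"] <= 40.0 else 0

X = [
    {"age": 70.0, "income": 10.0},
    {"age": 25.0, "income": 80.0},
    {"age": 45.0, "income": 30.0},
    {"age": 45.0, "income": 90.0},
]
predictions = [hybrid_predict(x, positive_rules, negative_rules, black_box) for x in X]
coverage = transparency(X, positive_rules, negative_rules)
```

In this toy setup the first two instances are decided by rules and the last two fall through to the black box, so half of the predictions are transparent. Because the black box is used only as a callable, the sketch mirrors the abstract's claim that the framework is agnostic to the black-box model.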
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Explainable Artificial Intelligence (XAI)
Methods: Interpretability
