Integrating White and Black Box Techniques for Interpretable Machine Learning

Eric M. Vernon, Naoki Masuyama, and Yusuke Nojima

arXiv:2407.08973·cs.LG·July 15, 2024

Open Access

TL;DR

This paper proposes an ensemble approach that combines interpretable white box models for simple inputs with complex black box models for difficult inputs to balance interpretability and accuracy in machine learning.

Contribution

It introduces a novel ensemble classifier that adaptively uses white and black box models based on input difficulty, enhancing interpretability without sacrificing performance.

Findings

01 Improved interpretability for easy inputs

02 Maintained high accuracy on complex inputs

03 Demonstrated effectiveness on benchmark datasets

Abstract

In machine learning algorithm design, there exists a trade-off between the interpretability and performance of the algorithm. In general, algorithms that are simpler and easier for humans to comprehend tend to show worse performance than more complex, less transparent algorithms. For example, a random forest classifier is likely to be more accurate than a simple decision tree, but at the expense of interpretability. In this paper, we present an ensemble classifier design which classifies easier inputs using a highly interpretable classifier (i.e., a white box model), and more difficult inputs using a more powerful, but less interpretable classifier (i.e., a black box model).
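The routing idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how an input is judged easy or difficult, so the confidence threshold used here is an assumption, and every name (`HybridClassifier`, `make_rule`, `make_nearest_neighbour`) is hypothetical. A single readable rule stands in for the white box model and a 1-nearest-neighbour lookup stands in for the black box model; neither is claimed to be the authors' choice.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


@dataclass
class HybridClassifier:
    """Routes each input to a white box or a black box model.

    Assumption: an input counts as "easy" when the white box model's
    confidence reaches `threshold`. The paper may use a different rule.
    """
    white_box: Callable[[Point], Tuple[int, float]]  # returns (label, confidence)
    black_box: Callable[[Point], int]                # returns a label only
    threshold: float = 0.8

    def predict(self, x: Point) -> Tuple[int, str]:
        label, confidence = self.white_box(x)
        if confidence >= self.threshold:
            return label, "white"         # interpretable path
        return self.black_box(x), "black"  # fall back to the stronger model


def make_rule(cut: float, margin: float) -> Callable[[Point], Tuple[int, float]]:
    """White box stand-in: one readable rule, `x[0] > cut`.

    Confidence grows with distance from the cut and saturates at 1.0.
    """
    def rule(x: Point) -> Tuple[int, float]:
        confidence = min(1.0, abs(x[0] - cut) / margin)
        return int(x[0] > cut), confidence
    return rule


def make_nearest_neighbour(train: List[Tuple[Point, int]]) -> Callable[[Point], int]:
    """Black box stand-in: 1-nearest-neighbour over stored examples."""
    def predict(x: Point) -> int:
        def sq_dist(item: Tuple[Point, int]) -> float:
            return sum((a - b) ** 2 for a, b in zip(item[0], x))
        return min(train, key=sq_dist)[1]
    return predict


# Toy data: two points near the cut contradict the simple rule.
train = [((0.0, 0.0), 0), ((1.0, 1.0), 1), ((0.45, 0.9), 1), ((0.55, 0.1), 0)]
model = HybridClassifier(make_rule(cut=0.5, margin=0.25),
                         make_nearest_neighbour(train), threshold=0.8)

easy = model.predict((0.95, 0.5))  # far from the cut: handled by the rule
hard = model.predict((0.48, 0.8))  # near the cut: handed to the black box
print(easy, hard)
```

Returning the path taken alongside the label makes it possible to report what fraction of inputs received an interpretable prediction, which is the quantity such a design trades against accuracy when the threshold is tuned.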

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)