Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat

Shantanu Ghosh; Ke Yu; Forough Arabshahi; Kayhan Batmanghelich

arXiv:2307.05350·cs.LG·July 13, 2023

TL;DR

This paper introduces a method to decompose a Blackbox model into a mixture of interpretable models and residuals, enabling better explanations, concept discovery, and improved test-time interventions without sacrificing performance.

Contribution

It proposes an iterative approach to carve out interpretable experts from a Blackbox, enhancing interpretability and explanation quality while maintaining predictive accuracy.

Findings

1. Identifies diverse, instance-specific concepts with high completeness.

2. Effectively isolates harder-to-explain samples via residuals.

3. Outperforms interpretable-by-design models in test-time interventions.

Abstract

ML model design either starts with an interpretable model or starts with a Blackbox and explains it post hoc. Blackbox models are flexible but difficult to explain, while interpretable models are inherently explainable. Yet interpretable models require extensive ML knowledge and tend to be less flexible than their Blackbox variants and to underperform them. This paper aims to blur the distinction between explaining a Blackbox post hoc and constructing interpretable models. Beginning with a Blackbox, we iteratively carve out a mixture of interpretable experts (MoIE) and a residual network. Each interpretable model specializes in a subset of samples and explains them using First Order Logic (FOL), providing basic reasoning on concepts from the Blackbox. We route the remaining samples through a flexible residual. We repeat the method on the residual network until all the interpretable models explain…
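The route, interpret, repeat loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `route_interpret_repeat`, `blackbox`, and `experts` are hypothetical, the concepts are hand-written booleans, and the learned selector is replaced by a simple stand-in rule (an expert "covers" a sample when it reproduces the Blackbox prediction on it). The sketch only shows how samples are peeled off iteration by iteration while the rest stay with the residual.

```python
from typing import Callable, Dict, List, Sequence

Concepts = Dict[str, bool]              # hypothetical concept vector for one sample
Predictor = Callable[[Concepts], int]   # maps concepts to a class label


def route_interpret_repeat(
    samples: Sequence[Concepts],
    blackbox: Predictor,
    experts: Sequence[Predictor],
) -> Dict[str, List[int]]:
    """Assign each sample index to the first expert that matches the Blackbox.

    Assumption: agreement with the Blackbox stands in for a learned selector.
    """
    assignment: Dict[str, List[int]] = {}
    remaining = list(range(len(samples)))  # samples still handled by the residual
    for k, expert in enumerate(experts):
        # "Route": pick the remaining samples this expert can explain.
        covered = [i for i in remaining if expert(samples[i]) == blackbox(samples[i])]
        assignment[f"expert_{k}"] = covered
        # "Repeat": everything not covered moves on to the next iteration.
        covered_set = set(covered)
        remaining = [i for i in remaining if i not in covered_set]
    assignment["residual"] = remaining
    return assignment


# Toy data: the Blackbox computes XOR of two concepts.
samples = [
    {"wing": True, "beak": True},
    {"wing": True, "beak": False},
    {"wing": False, "beak": True},
    {"wing": False, "beak": False},
]
blackbox = lambda c: int(c["wing"] != c["beak"])
experts = [
    lambda c: int(c["wing"] and not c["beak"]),  # rule: wing AND NOT beak
    lambda c: int(c["beak"]),                    # rule: beak
]

result = route_interpret_repeat(samples, blackbox, experts)
```

In this toy run the first rule covers samples 0, 1, and 3, the second rule picks up sample 2, and nothing is left for the residual. With only the first rule, sample 2 would remain with the residual, which mirrors how harder-to-explain samples are isolated.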

Code & Models

Repositories

batmanlab/ICML-2023-Route-interpret-repeat (PyTorch, official)

Videos

Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Software Engineering Research

Methods: High-Order Consensuses