On Explaining Random Forests with SAT

Yacine Izza; Joao Marques-Silva

arXiv:2105.10278 · cs.LG · May 24, 2021 · 1 citation

Open Access

TL;DR

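This paper proves that computing one PI-explanation of a Random Forest (RF) is D^P-complete, introduces a propositional encoding that lets a SAT solver compute such explanations, and reports experiments on publicly available datasets in which the approach scales to large RFs and outperforms existing heuristics on most datasets.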

Contribution

It establishes the computational complexity of explaining RFs (computing one PI-explanation is D^P-complete) and proposes a SAT-based approach that outperforms heuristic explainers in practice.

Findings

01

SAT-based explanations scale well to large RFs

02

The approach outperforms existing heuristics on most datasets

03

Computing one PI-explanation of an RF is D^P-complete

Abstract

Random Forests (RFs) are among the most widely used Machine Learning (ML) classifiers. Even though RFs are not interpretable, there are no dedicated non-heuristic approaches for computing explanations of RFs. Moreover, there is recent work on polynomial algorithms for explaining ML models, including naive Bayes classifiers. Hence, one question is whether finding explanations of RFs can be solved in polynomial time. This paper answers this question negatively, by proving that computing one PI-explanation of an RF is D^P-complete. Furthermore, the paper proposes a propositional encoding for computing explanations of RFs, thus enabling finding PI-explanations with a SAT solver. This contrasts with earlier work on explaining boosted trees (BTs) and neural networks (NNs), which requires encodings based on SMT/MILP. Experimental results, obtained on a wide range of publicly available datasets,…
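
To make the notion of a PI-explanation concrete, here is a minimal sketch in Python, not the paper's encoding. A PI-explanation is taken here to be a subset-minimal set of features that, when fixed to the instance's values, forces the forest's prediction no matter how the other features are set. The sketch uses a deletion-based loop, which is one standard way to obtain such a set. Where the paper would query a SAT solver, the sketch substitutes a brute-force enumeration over binary features, so it is only usable on toy inputs. The tree representation, the tie-breaking rule, and all names (`FOREST`, `is_sufficient`, `pi_explanation`) are assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy representation: a tree is either a class label (int leaf)
# or a tuple (feature_index, low_branch, high_branch) over binary features.

def tree_predict(tree, x):
    """Walk one tree: take low_branch when x[feature] == 0, else high_branch."""
    while isinstance(tree, tuple):
        feat, low, high = tree
        tree = high if x[feat] else low
    return tree

def forest_predict(forest, x):
    """Majority vote over the trees; ties go to the smaller class label."""
    votes = [tree_predict(t, x) for t in forest]
    return max(sorted(set(votes)), key=votes.count)

def is_sufficient(forest, x, fixed, target):
    """Check that fixing the features in `fixed` to x's values forces `target`.

    Brute-force stand-in for the SAT oracle: enumerate every assignment
    to the free features (exponential, toy-sized inputs only).
    """
    free = [i for i in range(len(x)) if i not in fixed]
    for values in product((0, 1), repeat=len(free)):
        y = list(x)
        for i, v in zip(free, values):
            y[i] = v
        if forest_predict(forest, y) != target:
            return False
    return True

def pi_explanation(forest, x):
    """Deletion-based search for one subset-minimal sufficient feature set."""
    target = forest_predict(forest, x)
    fixed = set(range(len(x)))
    for i in range(len(x)):
        fixed.discard(i)                      # tentatively free feature i
        if not is_sufficient(forest, x, fixed, target):
            fixed.add(i)                      # feature i is needed, keep it
    return sorted(fixed)

# Three trees over features x0, x1, x2; their majority vote equals x0 AND x1.
FOREST = [
    (0, 0, 1),              # predicts x0
    (1, 0, 1),              # predicts x1
    (0, 0, (1, 0, 1)),      # predicts x0 AND x1
]

print(pi_explanation(FOREST, (1, 1, 0)))  # features needed to force class 1
print(pi_explanation(FOREST, (0, 1, 1)))  # features needed to force class 0
```

The loop makes one sufficiency check per feature. In a SAT-based setting each check would instead ask whether the formula "the fixed features hold and the forest predicts a different class" is unsatisfiable. The result depends on the order in which features are dropped, so a different order may return a different, equally minimal explanation.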

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification