TL;DR
This paper introduces probabilistic sufficient explanations that identify minimal feature subsets providing strong probabilistic guarantees for model behavior, using scalable probabilistic reasoning tools.
Contribution
It proposes a novel explanation framework based on probabilistic guarantees and develops a scalable algorithm leveraging probabilistic circuits for explanation generation.
Findings
The algorithm finds sufficient explanations while keeping their probabilistic guarantees intact.
Compared with Anchors and logical explanations, the method shows improved scalability and explanation quality.
Experimental results validate the approach on real-world classifiers.
Abstract
Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient explanations, which formulate explaining an instance of classification as choosing the "simplest" subset of features such that only observing those features is "sufficient" to explain the classification, that is, sufficient to give us strong probabilistic guarantees that the model will behave similarly when all features are observed under the data distribution. In addition, we leverage tractable probabilistic reasoning tools such as probabilistic circuits and expected predictions to design a scalable algorithm for finding the desired explanations while keeping the guarantees intact. Our experiments demonstrate the effectiveness of our algorithm in…
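To make the idea of a "sufficient" subset concrete, here is a minimal sketch under stated assumptions. It is not the paper's algorithm, which relies on probabilistic circuits and expected predictions for scalability. It brute-forces a toy problem instead. The classifier, the uniform data distribution, the threshold `delta`, and the function names are all hypothetical. The sufficiency criterion used here, that the probability of the model giving the same label when only the chosen features are fixed is at least `1 - delta`, is one plausible reading of the abstract rather than the paper's exact definition.

```python
from itertools import combinations, product


def classifier(x):
    # Hypothetical toy model: predicts 1 when at least two of three binary features are on.
    return int(sum(x) >= 2)


def same_decision_prob(x, subset, dist):
    """P(classifier(X) == classifier(x) | X agrees with x on `subset`) under `dist`."""
    target = classifier(x)
    num = den = 0.0
    for point, p in dist.items():
        if all(point[i] == x[i] for i in subset):
            den += p
            if classifier(point) == target:
                num += p
    return num / den if den > 0 else 0.0


def smallest_sufficient_subset(x, dist, delta):
    """Smallest feature subset whose same-decision probability is at least 1 - delta.

    Ties in size are broken by the highest probability, then by enumeration order.
    Exhaustive search, so only usable for a handful of features.
    """
    n = len(x)
    for size in range(n + 1):
        best = None
        for subset in combinations(range(n), size):
            prob = same_decision_prob(x, subset, dist)
            if prob >= 1 - delta and (best is None or prob > best[1]):
                best = (subset, prob)
        if best is not None:
            return best
    return None


# Assumed data distribution: uniform over all assignments of 3 binary features.
dist = {pt: 1 / 8 for pt in product((0, 1), repeat=3)}
x = (1, 1, 0)

print(smallest_sufficient_subset(x, dist, delta=0.3))
print(smallest_sufficient_subset(x, dist, delta=0.1))
```

With a looser guarantee (`delta=0.3`) a single feature is enough in this toy setup, while a stricter one (`delta=0.1`) needs two. The exhaustive enumeration is exponential in the number of features, which is the scalability problem that tractable probabilistic models are meant to address.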
