Information-theoretic Evolution of Model Agnostic Global Explanations
Sukriti Verma, Nikaash Puri, Piyush Gupta, Balaji Krishnamurthy

TL;DR
This paper introduces a novel, model-agnostic, information-theoretic evolutionary approach to generate global explanations for machine learning models, improving robustness under distributional shifts and outperforming existing methods.
Contribution
The paper presents a new global explanation method that combines local explanations with an evolutionary algorithm, and it introduces a robustness parameter for distributional shifts.
Findings
Outperforms existing explanation methods on various datasets.
Introduces a robustness parameter for distributional shift scenarios.
Enhances explanation quality by incorporating out-of-distribution samples.
Abstract
Explaining the behavior of black-box machine learning models through human-interpretable rules is an important research area. Recent work has focused on explaining model behavior locally, i.e., for specific predictions, as well as globally, across the fields of vision, natural language, reinforcement learning, and data science. We present a novel model-agnostic approach that derives rules to globally explain the behavior of classification models trained on numerical and/or categorical data. Our approach builds on existing local model explanation methods to extract conditions that are important for explaining model behavior on specific instances; an evolutionary algorithm then optimizes an information-theoretic fitness function to construct rules that explain global model behavior. We show that our approach outperforms existing approaches on a variety of datasets. Further, we…
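The two-stage pipeline described in the abstract (candidate conditions first, then an evolutionary search scored by an information-theoretic fitness) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the fitness is the mutual information between "rule fires" and "model predicts the target class", the candidate conditions are hand-written stand-ins for what a local explanation method would produce, and the names `mutual_information`, `rule_fires`, `fitness`, and `evolve_rule` are hypothetical.

```python
import math
import random


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length binary sequences."""
    n = len(xs)
    joint = {}
    for x, y in zip(xs, ys):
        joint[(x, y)] = joint.get((x, y), 0) + 1
    px = {v: sum(c for (x, _), c in joint.items() if x == v) / n for v in (0, 1)}
    py = {v: sum(c for (_, y), c in joint.items() if y == v) / n for v in (0, 1)}
    mi = 0.0
    for (x, y), c in joint.items():
        pxy = c / n
        mi += pxy * math.log2(pxy / (px[x] * py[y]))
    return mi


def rule_fires(rule, row):
    """A rule is a conjunction of (feature_index, threshold) conditions: row[i] > t."""
    return int(all(row[i] > t for i, t in rule))


def fitness(rule, rows, preds, target):
    """Assumed fitness: MI between rule coverage and 'model predicts target'."""
    if not rule:
        return 0.0
    fires = [rule_fires(rule, r) for r in rows]
    is_target = [int(p == target) for p in preds]
    return mutual_information(fires, is_target)


def evolve_rule(conditions, rows, preds, target, generations=30, pop_size=20, seed=0):
    """Elitist evolutionary search over subsets of candidate conditions."""
    rng = random.Random(seed)
    population = [frozenset([rng.randrange(len(conditions))]) for _ in range(pop_size)]

    def score(ind):
        return fitness([conditions[i] for i in ind], rows, preds, target)

    for _ in range(generations):
        # Mutation: toggle one randomly chosen condition in each individual.
        children = [ind ^ frozenset([rng.randrange(len(conditions))]) for ind in population]
        # Selection: keep the highest-scoring distinct individuals.
        merged = set(population) | set(children)
        population = sorted(merged, key=score, reverse=True)[:pop_size]
    best = population[0]
    return [conditions[i] for i in sorted(best)], score(best)


# Toy black box: predicts class 1 only when features 0 and 1 both exceed 0.5.
rows = [(a, b, c) for a in (0.25, 0.75) for b in (0.25, 0.75) for c in (0.25, 0.75)]
preds = [int(r[0] > 0.5 and r[1] > 0.5) for r in rows]

# Stand-in for conditions harvested from local explanations (feature 2 is noise).
conditions = [(0, 0.5), (1, 0.5), (2, 0.5)]

best_rule, best_score = evolve_rule(conditions, rows, preds, target=1)
print(best_rule, round(best_score, 3))
```

On this toy data the only rule that covers exactly the instances predicted as class 1 is the conjunction of the first two conditions, so it receives the highest mutual information and the search settles on it. A real setting would replace the hand-written conditions with ones extracted from local explanations and would likely need crossover, a larger population, and handling of categorical features.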
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Data Stream Mining Techniques
