Global Aggregations of Local Explanations for Black Box models
Ilse van der Linden, Hinda Haned, Evangelos Kanoulas

TL;DR
This paper introduces GALE, a method for aggregating local explanations to better understand the global decision-making process of black box models, addressing limitations of existing approaches like LIME.
Contribution
The paper proposes GALE, a novel aggregation technique that improves the reliability of global insights derived from local explanations of black box models.
Findings
GALE's aggregations provide more accurate global feature importance than the global importance measure introduced with LIME.
Aggregation choice significantly impacts the quality of global explanations.
GALE effectively identifies features that distinguish model behavior.
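The finding that aggregation choice matters can be illustrated with a minimal sketch. It assumes local explanations are available as a matrix of per-instance feature weights, as a LIME-style explainer would produce. The function name, the toy weights, and both aggregation rules are illustrative assumptions and not the paper's definitions: "sqrt_sum" is modeled on the square-root-of-summed-weights importance described for LIME, and "mean_abs" is one plausible alternative that averages over the instances where a feature actually appears.

```python
import numpy as np

def aggregate_local_explanations(weights, how="mean_abs"):
    """Aggregate an (n_instances, n_features) matrix of local explanation
    weights into one global score per feature (hypothetical helper)."""
    w = np.abs(np.asarray(weights, dtype=float))
    if how == "sqrt_sum":
        # Square root of the summed absolute weights per feature.
        return np.sqrt(w.sum(axis=0))
    if how == "mean_abs":
        # Mean absolute weight over instances where the feature has a
        # nonzero weight; features that never appear score 0.
        counts = np.count_nonzero(w, axis=0)
        return np.divide(w.sum(axis=0), counts,
                         out=np.zeros(w.shape[1]), where=counts > 0)
    raise ValueError(f"unknown aggregation: {how!r}")

# Toy local explanations (made-up numbers): 4 instances, 3 features.
local_weights = [
    [0.5, 0.0, -0.1],
    [0.5, 0.0,  0.1],
    [0.0, 0.9, -0.1],
    [0.5, 0.0,  0.1],
]

sqrt_sum_scores = aggregate_local_explanations(local_weights, "sqrt_sum")
mean_abs_scores = aggregate_local_explanations(local_weights, "mean_abs")

# Feature indices ordered from most to least important under each rule.
sqrt_sum_ranking = np.argsort(-sqrt_sum_scores)
mean_abs_ranking = np.argsort(-mean_abs_scores)
print("sqrt_sum ranking:", sqrt_sum_ranking)
print("mean_abs ranking:", mean_abs_ranking)
```

On this toy input the two rules disagree about the top feature: summing favors the feature that appears in many explanations with moderate weight, while averaging favors the feature that appears rarely but with a large weight.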
Abstract
The decision-making process of many state-of-the-art machine learning models is inherently inscrutable to the extent that it is impossible for a human to interpret the model directly: they are black box models. This has led to a call for research on explaining black box models, for which there are two main approaches: global explanations, which aim to explain a model's decision-making process in general, and local explanations, which aim to explain a single prediction. Since it remains challenging to establish fidelity to black box models in globally interpretable approximations, much attention is paid to local explanations. However, whether local explanations are able to reliably represent the black box model and provide useful insights remains an open question. We present Global Aggregations of Local Explanations (GALE) with the objective to provide insights into a model's global decision…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
