The Susceptibility of Example-Based Explainability Methods to Class Outliers
Ikhtiyor Nematov, Dimitris Sacharidis, Tomer Sagi, Katja Hose

TL;DR
This paper investigates how class outliers affect the reliability of example-based explainability methods in machine learning, exposing their vulnerabilities and proposing evaluation metrics that measure how well these methods cope with such outliers.
Contribution
The paper reformulates existing evaluation metrics, such as correctness and relevance, for example-based explanations and introduces a new metric, distinguishability, to better assess how methods perform in the presence of class outliers.
Findings
Current example-based explainability methods are vulnerable to class outliers, including methods designed to suppress them.
Existing metrics may not fully capture explainability quality.
Robust techniques are needed to handle class outliers effectively.
Abstract
This study explores the impact of class outliers on the effectiveness of example-based explainability methods for black-box machine learning models. We reformulate existing explainability evaluation metrics, such as correctness and relevance, specifically for example-based methods, and introduce a new metric, distinguishability. Using these metrics, we highlight the shortcomings of current example-based explainability methods, including those that attempt to suppress class outliers. We conduct experiments on two datasets, a text classification dataset and an image classification dataset, and evaluate the performance of four state-of-the-art explainability methods. Our findings underscore the need for robust techniques to tackle the challenges posed by class outliers.
Taxonomy
Topics: Anomaly Detection Techniques and Applications
