Revisiting Methods for Finding Influential Examples
Karthikeyan K, Anders Søgaard

TL;DR
This paper critically examines existing methods for identifying influential training examples, revealing their instability and proposing a new evaluation metric based on poisoning attack detection, along with a simple baseline for improvement.
Contribution
It highlights the instability of current influence methods, challenges their evaluation standards, and introduces a new, more effective baseline and evaluation approach.
Findings
Existing influence methods are highly sensitive to initialization, training-data ordering, and batch size.
Leave-one-out (LOO) influence and heuristics are poor metrics for explanation quality.
A simple baseline significantly improves influence estimation performance.
Abstract
Several instance-based explainability methods for finding influential training examples for test-time decisions have been proposed recently, including Influence Functions, TraceIn, Representer Point Selection, Grad-Dot, and Grad-Cos. Typically these methods are evaluated using LOO influence (Cook's distance) as a gold standard, or using various heuristics. In this paper, we show that all of the above methods are unstable, i.e., extremely sensitive to initialization, ordering of the training data, and batch size. We suggest that this is a natural consequence of how in the literature, the influence of examples is assumed to be independent of model state and other examples -- and argue it is not. We show that LOO influence and heuristics are, as a result, poor metrics to measure the quality of instance-based explanations, and instead propose to evaluate such explanations by their ability…
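To make two of the methods named in the abstract concrete, here is a minimal sketch of Grad-Dot and Grad-Cos, assuming their usual definitions: the dot product and the cosine similarity between the loss gradient of a training example and that of a test example. The toy logistic-regression model, the data, the two weight vectors, and all function names are illustrative assumptions, not the authors' code or experimental setup.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def loss_grad(w, x, y):
    """Gradient of the logistic loss w.r.t. w for one example (x, y), y in {0, 1}."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def grad_dot(w, train_ex, test_ex):
    """Grad-Dot score: dot product of the train and test loss gradients."""
    return dot(loss_grad(w, *train_ex), loss_grad(w, *test_ex))

def grad_cos(w, train_ex, test_ex):
    """Grad-Cos score: cosine similarity of the train and test loss gradients."""
    g_tr = loss_grad(w, *train_ex)
    g_te = loss_grad(w, *test_ex)
    norm = math.sqrt(dot(g_tr, g_tr)) * math.sqrt(dot(g_te, g_te))
    return dot(g_tr, g_te) / norm if norm > 0 else 0.0

# Hypothetical toy data: (features, label) pairs.
train = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1)]
test = ([2.0, 0.5], 1)

# Two hypothetical model states, e.g. weights reached from different seeds.
for w in ([0.0, 0.0], [3.0, -3.0]):
    scores = [grad_dot(w, ex, test) for ex in train]
    ranking = sorted(range(len(train)), key=lambda i: -scores[i])
    print("weights", w, "Grad-Dot scores", scores, "ranking", ranking)
```

Both scores are functions of the current weights `w`, so the same training example can receive a different score, and possibly a different rank, under a different model state. This toy example only illustrates that dependence; it does not reproduce the paper's experiments.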
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
