Behavior of k-NN as an Instance-Based Explanation Method
Chhavi Yadav, Kamalika Chaudhuri

TL;DR
This paper investigates the behavior of k-nearest neighbors as an instance-based explanation method for neural network predictions, comparing it to influence functions and analyzing stability across datasets.
Contribution
It empirically evaluates k-NN explanations in neural networks, compares them with influence functions, and analyzes their stability and behavior across datasets.
Findings
k-NN in the last neural network layer is effective for explanations
No clear trend between k and prediction change
Explanations are more stable on MNIST than on CIFAR-10
Abstract
Adoption of DL models in critical areas has led to an escalating demand for sound explanation methods. Instance-based explanation methods are a popular type that return selected instances from the training set to explain the prediction for a test sample. One way to connect these explanations with the prediction is to ask the following counterfactual question: how do the loss and prediction for a test sample change when the explanations are removed from the training set? Our paper answers this question for k-NNs, which are natural contenders for an instance-based explanation method. We first demonstrate empirically that the representation space induced by the last layer of a neural network is the best one in which to perform k-NN. Using this layer, we conduct our experiments and compare them to influence functions (IFs) (Koh and Liang, 2017), which try to answer a similar question. Our evaluations…
Taxonomy
Methods: k-Nearest Neighbors
