A Flexible Nadaraya-Watson Head Can Offer Explainable and Calibrated Classification
Alan Q. Wang, Mert R. Sabuncu

TL;DR
This paper investigates a nonparametric Nadaraya-Watson prediction head for neural networks, demonstrating its advantages in interpretability, calibration, and efficiency across computer vision tasks, especially in data-limited scenarios.
Contribution
It introduces a simple, non-learnable NW head for neural networks, offering better calibration and interpretability, along with a clustering-based method for efficient inference.
Findings
Better calibration than parametric heads in data-limited settings
Comparable accuracy with improved interpretability
Support influence function aids model debugging
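One way to make the calibration finding concrete is a binned calibration metric. The sketch below computes expected calibration error (ECE), a common choice; that this summary's calibration claim was measured this way is an assumption, and the function name, bin count, and equal-width binning are illustrative choices rather than the authors' evaluation code.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over confidence bins.

    probs:  (N, C) array of predicted class probabilities.
    labels: (N,) array of integer class labels.
    n_bins: number of equal-width confidence bins (illustrative default).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)        # confidence of the predicted class
    predictions = probs.argmax(axis=1)     # predicted class index
    correct = (predictions == labels).astype(float)

    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap     # weight by fraction of samples in bin
    return ece
```

A perfectly calibrated model scores 0; a model that is 75% confident but only 50% accurate scores 0.25.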
Abstract
In this paper, we empirically analyze a simple, non-learnable, and nonparametric Nadaraya-Watson (NW) prediction head that can be used with any neural network architecture. In the NW head, the prediction is a weighted average of labels from a support set. The weights are computed from distances between the query feature and support features. This is in contrast to the dominant approach of using a learnable classification head (e.g., a fully-connected layer) on the features, which can be challenging to interpret and can yield poorly calibrated predictions. Our empirical results on an array of computer vision tasks demonstrate that the NW head can yield better calibration with comparable accuracy compared to its parametric counterpart, particularly in data-limited settings. To further increase inference-time efficiency, we propose a simple approach that involves a clustering step run on…
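The NW head described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes squared Euclidean distance, a softmax over negative distances, and a hypothetical `temperature` parameter, none of which are specified in this summary. The function name `nw_head` and its arguments are likewise made up for the example.

```python
import numpy as np

def nw_head(query_feats, support_feats, support_labels, num_classes, temperature=1.0):
    """Nadaraya-Watson prediction: weighted average of support labels.

    query_feats:    (Q, D) features of the inputs to classify.
    support_feats:  (S, D) features of the support set.
    support_labels: (S,) integer labels of the support set.
    Assumption: weights are a softmax over negative squared Euclidean distances.
    """
    query_feats = np.asarray(query_feats, dtype=float)
    support_feats = np.asarray(support_feats, dtype=float)
    support_labels = np.asarray(support_labels)

    # Pairwise squared distances between each query and each support feature: (Q, S)
    diffs = query_feats[:, None, :] - support_feats[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)

    # Softmax over the support axis; closer support points get larger weights
    logits = -sq_dists / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted average of one-hot support labels gives class probabilities: (Q, C)
    one_hot = np.eye(num_classes)[support_labels]
    probs = weights @ one_hot
    return probs, weights

# Toy usage: two well-separated clusters in a 2-D feature space
support = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
labels = np.array([0, 0, 1, 1])
queries = np.array([[0.1, 0.2], [5.1, 5.4]])
probs, weights = nw_head(queries, support, labels, num_classes=2)
```

The returned `weights` matrix shows how much each support example contributed to each prediction, which is the kind of per-example attribution that makes such a head easier to inspect than a fully-connected layer.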
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications
