Counterfactual Gradients-based Quantification of Prediction Trust in Neural Networks
Mohit Prabhushankar, Ghassan AlRegib

TL;DR
GradTrust is a novel trust measure for neural networks that uses counterfactual gradient variance to detect mispredictions, outperforming existing methods across multiple architectures and datasets.
Contribution
The paper introduces GradTrust, a new method leveraging counterfactual gradient variance for quantifying neural network prediction trust at inference.
Findings
GradTrust outperforms existing techniques at detecting mispredictions on the ImageNet validation set.
Simple measures such as negative log likelihood outperform state-of-the-art uncertainty methods.
GradTrust ranks among the top methods in most experimental settings.
Abstract
The widespread adoption of deep neural networks in machine learning calls for an objective quantification of esoteric trust. In this paper we propose GradTrust, a classification trust measure for large-scale neural networks at inference. The proposed method utilizes variance of counterfactual gradients, i.e. the required changes in the network parameters if the label were different. We show that GradTrust is superior to existing techniques for detecting misprediction rates on images from the ImageNet validation dataset. Depending on the network, GradTrust detects images where either the ground truth is incorrect or ambiguous, or the classes are co-occurring. We extend GradTrust to Video Action Recognition on the Kinetics-400 dataset. We showcase results on architectures pretrained on ImageNet and architectures pretrained on Kinetics-400. We observe the following: (i) simple…
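The abstract describes counterfactual gradients as the changes to the network parameters that would be required if the label were different. The sketch below illustrates that idea on a toy linear softmax classifier. It is not the paper's GradTrust formula, which this summary does not give. The function names, the use of the cross-entropy loss, the restriction to a single weight matrix, and the choice of the Frobenius norm as the per-label summary are all assumptions made for illustration.

```python
import math
from statistics import pvariance


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def counterfactual_gradient_norms(weights, x):
    """Return (predicted class, {counterfactual class: gradient norm}).

    Assumption: the model is logits = W @ x and the loss is cross-entropy.
    For a counterfactual label c, dLoss/dlogits = p - onehot(c), so
    dLoss/dW = outer(p - onehot(c), x), whose Frobenius norm is
    ||p - onehot(c)|| * ||x||.
    """
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    x_norm = math.sqrt(sum(v * v for v in x))

    norms = {}
    for c in range(len(probs)):
        if c == predicted:
            continue  # only labels other than the prediction are counterfactual
        delta = [p - (1.0 if k == c else 0.0) for k, p in enumerate(probs)]
        norms[c] = math.sqrt(sum(d * d for d in delta)) * x_norm
    return predicted, norms


def counterfactual_gradient_variance(weights, x):
    """Population variance of the gradient norms across counterfactual labels."""
    predicted, norms = counterfactual_gradient_norms(weights, x)
    return predicted, pvariance(list(norms.values()))


# Hypothetical 3-class, 2-feature example.
W = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
sample = [1.0, 0.5]
pred, var = counterfactual_gradient_variance(W, sample)
print(pred, var)
```

In a deep network the same quantity would normally be obtained by backpropagating the loss once per counterfactual label through an automatic differentiation framework. The closed form is used here only because the toy model is linear.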
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
