Cautious Deep Learning
Yotam Hechtlinger, Barnabás Póczos, Larry Wasserman

TL;DR
This paper introduces a cautious classification method using conformal prediction sets based on $p(x|y)$, which reduces overconfidence and improves robustness against adversarial attacks by allowing the classifier to abstain from making a prediction.
Contribution
It proposes a novel conformal prediction approach that emphasizes cautious predictions and null sets, enhancing reliability and adversarial robustness in high-dimensional image classification.
Findings
Achieves high coverage with conformal prediction sets on ImageNet, CelebA, and IMDB-Wiki datasets.
Reduces false confident predictions and abstains when data does not resemble training examples.
Improves robustness to adversarial attacks by returning either the null set or a prediction set that still includes the true label.
Abstract
Most classifiers operate by selecting the maximum of an estimate of the conditional distribution $p(y|x)$, where $x$ stands for the features of the instance to be classified and $y$ denotes its label. This often results in a hubristic bias: overconfidence in the assignment of a definite label. Usually, the observations are concentrated on a small volume but the classifier provides definite predictions for the entire space. We propose constructing conformal prediction sets which contain a set of labels rather than a single label. These conformal prediction sets contain the true label with probability $1-\alpha$. Our construction is based on $p(x|y)$ rather than $p(y|x)$, which results in a classifier that is very cautious: it outputs the null set, meaning "I don't know", when the object does not resemble the training examples. An important property of our approach is that…
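The idea in the abstract can be illustrated with a minimal sketch on toy data. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`kde_density`, `fit_cautious_classifier`, `predict_set`) are hypothetical, the class-conditional density $p(x|y)$ is estimated with a plain Gaussian kernel density estimate on raw 2-D points rather than on learned image features, and the per-class threshold is taken as a split-conformal order statistic of held-out density scores. A label enters the prediction set only if the input looks sufficiently typical for that class, so an input far from every class yields the empty set.

```python
import math
import numpy as np


def kde_density(fit_pts, x, bandwidth):
    """Gaussian kernel density estimate of p(x | y) at a single point x."""
    d = fit_pts.shape[1]
    sq_dist = np.sum((fit_pts - x) ** 2, axis=1)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return float(np.mean(np.exp(-sq_dist / (2.0 * bandwidth ** 2))) / norm)


def fit_cautious_classifier(X, y, alpha=0.1, bandwidth=0.5, seed=0):
    """For each class, fit a KDE on one half and calibrate a threshold on the other."""
    rng = np.random.default_rng(seed)
    classes = {}
    for label in np.unique(y):
        pts = X[y == label]
        pts = pts[rng.permutation(len(pts))]
        half = len(pts) // 2
        fit_pts, calib_pts = pts[:half], pts[half:]
        scores = np.sort([kde_density(fit_pts, c, bandwidth) for c in calib_pts])
        # Split-conformal lower quantile: the k-th smallest calibration score,
        # with k = floor(alpha * (n + 1)). If k == 0, accept every point.
        k = math.floor(alpha * (len(scores) + 1))
        threshold = scores[k - 1] if k >= 1 else 0.0
        classes[label.item()] = (fit_pts, float(threshold))
    return {"bandwidth": bandwidth, "classes": classes}


def predict_set(model, x):
    """Return every label whose estimated density at x clears its threshold."""
    bw = model["bandwidth"]
    return {
        label
        for label, (fit_pts, threshold) in model["classes"].items()
        if kde_density(fit_pts, x, bw) >= threshold
    }


# Toy data: two well-separated 2-D Gaussian classes (purely illustrative).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0.0, 1.0, (200, 2)),
               data_rng.normal(5.0, 1.0, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
model = fit_cautious_classifier(X, y, alpha=0.1)

print(predict_set(model, np.array([0.0, 0.0])))    # near the class-0 cluster
print(predict_set(model, np.array([5.0, 5.0])))    # near the class-1 cluster
print(predict_set(model, np.array([50.0, 50.0])))  # far from all training data
```

Because each class is fitted and calibrated separately in this sketch, a class can be added or dropped by adding or deleting its entry in `model["classes"]` without touching the others.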
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
