TL;DR
This paper introduces a new t-distribution-based operator for neural network classifiers that improves robustness to out-of-distribution samples by modeling uncertainty better than softmax, with minimal architectural changes.
Contribution
A novel operator derived from t-distributions is proposed, enhancing out-of-distribution robustness of neural classifiers compared to standard softmax.
Findings
Classifiers with the t-distribution operator outperform softmax-based models on OOD detection.
The new operator provides more reliable uncertainty estimates.
Minimal architectural modifications are required for implementation.
Abstract
Neural network (NN) classifiers can assign extreme probabilities to samples that did not appear during training (out-of-distribution samples), resulting in erroneous and unreliable predictions. One cause of this unwanted behaviour is the standard softmax operator, which pushes the posterior probabilities toward either zero or unity and hence fails to model uncertainty. The statistical derivation of the softmax operator relies on the assumption that the distributions of the latent variables for a given class are Gaussian with known variance. However, it is possible to use different assumptions in the same derivation and obtain operators from other families of distributions as well. This allows derivation of novel operators with more favourable properties. Here, a novel operator is proposed that is derived using t-distributions, which are capable of providing a better…
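The contrast the abstract draws can be illustrated with a small sketch. It is not the paper's operator, whose exact form is not given here. It only applies Bayes' rule to a one-dimensional latent variable with two equally likely classes, once with Gaussian class-conditional densities (which gives a softmax-style posterior) and once with Student-t densities. The class means, the scale `sigma`, and the degrees of freedom `nu` are hypothetical values chosen for illustration.

```python
import math

def gaussian_logpdf(x, mu, sigma=1.0):
    # Log-density of a Gaussian with mean mu and known scale sigma.
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def student_t_logpdf(x, mu, nu=3.0, sigma=1.0):
    # Log-density of a location-scale Student-t; nu=3 is an assumed value.
    z = (x - mu) / sigma
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    return log_norm - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)

def posterior(x, means, logpdf):
    # Bayes' rule with equal class priors, computed stably in log space.
    logs = [logpdf(x, m) for m in means]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical class means for the latent variable.
means = [-1.0, 1.0]

# Move the input progressively further from both class means.
for x in (0.5, 5.0, 50.0):
    g = posterior(x, means, gaussian_logpdf)
    t = posterior(x, means, student_t_logpdf)
    print(f"x={x:5.1f}  gaussian P(class 1)={g[1]:.4f}  student-t P(class 1)={t[1]:.4f}")
```

Under these assumptions the Gaussian posterior saturates toward 1 as the input moves away from both class means, because its log-odds grow linearly in `x`. The Student-t posterior instead drifts back toward 0.5, because the heavy tails make the two likelihoods nearly equal far from the data.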
Taxonomy
Methods: Linear Layer · Softmax
