Mitigating Overconfidence in Out-of-Distribution Detection by Capturing Extreme Activations

Mohammad Azizmalayeri; Ameen Abu-Hanna; Giovanni Cinà

arXiv:2405.12658 · cs.LG · May 22, 2024

TL;DR

This paper improves out-of-distribution (OOD) detection by measuring extreme activation values in the penultimate layer of a neural network and using them as a proxy for overconfidence, which strengthens several OOD detection baselines across models and datasets.

Contribution

The authors propose using extreme activation values in the penultimate layer as a proxy for overconfidence, improving OOD detection performance without degrading in-distribution classification accuracy.

Findings

01. OOD detection AUC improves significantly across multiple datasets and architectures.

02. The method does not degrade in-distribution classification performance.

03. The method applies to various neural network architectures and training scenarios.
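
For readers unfamiliar with the metric in the first finding, here is a minimal sketch of how an OOD detection AUC can be computed, assuming AUC means the area under the ROC curve and that higher scores indicate "more likely OOD". The function name `auroc` and the scores are made up for illustration and are not results from the paper.

```python
def auroc(id_scores, ood_scores):
    """Area under the ROC curve for OOD detection.

    Equals the probability that a randomly chosen OOD sample receives a
    higher novelty score than a randomly chosen in-distribution (ID)
    sample, with ties counted as half a win.
    """
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical novelty scores (illustrative only, not from the paper).
id_scores = [0.1, 0.2, 0.3, 0.4]
ood_scores = [0.35, 0.8, 0.9]
value = auroc(id_scores, ood_scores)
print(f"AUROC = {value:.3f}")
```

An AUROC of 1.0 means every OOD sample scores above every ID sample, while 0.5 corresponds to chance-level separation.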

Abstract

Detecting out-of-distribution (OOD) instances is crucial for the reliable deployment of machine learning models in real-world scenarios. OOD inputs are commonly expected to cause a more uncertain prediction in the primary task; however, there are OOD cases for which the model returns a highly confident prediction. This phenomenon, denoted as "overconfidence", presents a challenge to OOD detection. Specifically, theoretical evidence indicates that overconfidence is an intrinsic property of certain neural network architectures, leading to poor OOD detection. In this work, we address this issue by measuring extreme activation values in the penultimate layer of neural networks and then leverage this proxy of overconfidence to improve on several OOD detection baselines. We test our method on a wide array of experiments spanning synthetic data and real-world data, tabular and image datasets,…
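
As a rough illustration of the idea in the abstract, the sketch below adds a penalty to a baseline novelty score whenever the penultimate-layer activations are more extreme than anything seen on in-distribution training data. It is not the authors' implementation (the mazizmalayeri/CEA repository holds that). The L2 norm as the measure of extremeness, the maximum training norm as the threshold, the additive penalty, the `weight` parameter, and all function names are assumptions made for illustration.

```python
import math


def l2_norm(features):
    """L2 norm of a penultimate-layer activation vector (assumed measure)."""
    return math.sqrt(sum(x * x for x in features))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def baseline_novelty(logits):
    """Baseline score: 1 - max softmax probability (higher = more OOD)."""
    return 1.0 - max(softmax(logits))


def fit_threshold(train_features):
    """Assumed threshold: largest activation norm seen on training data."""
    return max(l2_norm(f) for f in train_features)


def corrected_novelty(logits, features, threshold, weight=1.0):
    """Baseline score plus a penalty for activations beyond the threshold."""
    excess = max(0.0, l2_norm(features) - threshold)
    return baseline_novelty(logits) + weight * excess


# Hypothetical penultimate-layer activations from in-distribution training data.
train_features = [[1.0, 2.0], [0.5, 1.5], [2.0, 1.0]]
threshold = fit_threshold(train_features)

# An in-distribution-like input: moderate logits, moderate activations.
in_dist = corrected_novelty([4.0, 0.0], [1.0, 1.0], threshold)

# A far-OOD input: extreme activations yield a very confident prediction,
# so the baseline alone would score it as less novel than the ID input.
far_ood = corrected_novelty([40.0, 0.0], [30.0, 40.0], threshold)

print(in_dist < far_ood)
```

In this toy setup the baseline alone ranks the far-OOD input as less novel than the in-distribution one, which is the overconfidence failure the abstract describes. The penalty term restores the expected ordering.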

Code & Models

Repositories

mazizmalayeri/CEA (PyTorch, official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Distributed Sensor Networks and Detection Algorithms

Methods: Attention Is All You Need · Average Pooling · Global Average Pooling · Dense Connections · Linear Layer · Position-Wise Feed-Forward Layer · Convolution · Label Smoothing · Residual Connection · Absolute Position Encodings