Mitigating Neural Network Overconfidence with Logit Normalization
Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, Yixuan Li

TL;DR
This paper introduces Logit Normalization (LogitNorm), a simple modification of the cross-entropy loss that reduces neural network overconfidence by enforcing a constant norm on the logit vector during training, significantly improving out-of-distribution detection.
Contribution
Proposes LogitNorm, a novel normalization technique that decouples the influence of the logit norm from network optimization, improving confidence calibration and out-of-distribution detection in neural networks.
Findings
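Reduces FPR95 (the false positive rate on out-of-distribution inputs at a 95% true positive rate on in-distribution inputs) by up to 42.30% on benchmarks.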
Produces highly distinguishable confidence scores for in- and out-of-distribution data.
Outperforms existing methods in the paper's experiments.
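For readers unfamiliar with the FPR95 metric cited in the findings, here is a minimal sketch of how it is commonly computed. It assumes that a higher confidence score means "more likely in-distribution" and that the threshold is chosen so that roughly 95% of in-distribution inputs are retained. The function name `fpr_at_95_tpr` and the toy scores are hypothetical, and the paper's own evaluation code may differ in details such as tie handling.

```python
def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD inputs accepted at the threshold that keeps ~95% of ID inputs.

    Assumption: higher score = more in-distribution-like.
    """
    ranked = sorted(id_scores, reverse=True)
    # Number of ID samples to retain: ceil(0.95 * n), using integer arithmetic.
    k = (95 * len(ranked) + 99) // 100
    threshold = ranked[k - 1]
    # An OOD input is a false positive if it scores at or above the threshold.
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)


# Toy example with made-up scores.
id_scores = list(range(1, 21))   # 20 in-distribution scores: 1..20
ood_scores = [0, 1, 1, 3]        # 4 out-of-distribution scores
print(fpr_at_95_tpr(id_scores, ood_scores))
```

A lower FPR95 means fewer out-of-distribution inputs are mistaken for in-distribution ones at that operating point.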
Abstract
Detecting out-of-distribution inputs is critical for safe deployment of machine learning models in the real world. However, neural networks are known to suffer from the overconfidence issue, where they produce abnormally high confidence for both in- and out-of-distribution inputs. In this work, we show that this issue can be mitigated through Logit Normalization (LogitNorm) -- a simple fix to the cross-entropy loss -- by enforcing a constant vector norm on the logits in training. Our method is motivated by the analysis that the norm of the logit keeps increasing during training, leading to overconfident output. Our key idea behind LogitNorm is thus to decouple the influence of output's norm during network optimization. Trained with LogitNorm, neural networks produce highly distinguishable confidence scores between in- and out-of-distribution data. Extensive experiments demonstrate the…
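As a rough illustration of the idea described in the abstract, the sketch below computes cross-entropy on logits that have first been rescaled to a constant L2 norm. It is a plain-Python sketch under stated assumptions, not the authors' implementation: the function names, the temperature parameter `tau` with its illustrative default, and the small `eps` constant are not taken from the text above.

```python
import math


def cross_entropy(logits, label):
    """Standard softmax cross-entropy for one example (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def logitnorm_cross_entropy(logits, label, tau=0.04, eps=1e-7):
    """Cross-entropy on logits divided by their L2 norm.

    Assumptions: `tau` is a temperature hyperparameter (illustrative default),
    and `eps` only guards against division by zero.
    """
    norm = math.sqrt(sum(z * z for z in logits)) + eps
    normalized = [z / (norm * tau) for z in logits]
    return cross_entropy(normalized, label)


# Two logit vectors pointing in the same direction but with different norms.
small = [2.0, 1.0, 0.0]
large = [20.0, 10.0, 0.0]

# Plain cross-entropy rewards simply scaling the logits up ...
print(cross_entropy(small, 0), cross_entropy(large, 0))
# ... whereas the normalized loss is (almost exactly) unchanged by the scaling.
print(logitnorm_cross_entropy(small, 0), logitnorm_cross_entropy(large, 0))
```

The comparison shows why the norm matters: with the standard loss, the network can keep lowering its training loss by growing the logit norm alone, while the normalized loss depends only on the direction of the logit vector.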
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Advanced Neural Network Applications
