Entropic Issues in Likelihood-Based OOD Detection
Anthony L. Caterini, Gabriel Loaiza-Ganem

TL;DR
This paper investigates why likelihood-based deep generative models sometimes assign higher likelihoods to out-of-distribution data, revealing that entropy differences influence this behavior and explaining the effectiveness of likelihood ratio methods for OOD detection.
Contribution
It decomposes the average likelihood into a KL divergence term and an entropy term, which gives a new account of why likelihood-based OOD detection can fail and why likelihood ratio approaches succeed.
Findings
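The entropy term suppresses likelihood values on higher-entropy datasets, which can explain why OOD data sometimes receive higher likelihoods than in-distribution data.
Likelihood ratio methods cancel the problematic entropy term in expectation.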
Analysis relates to manifold-supported models in OOD detection.
Abstract
Deep generative models trained by maximum likelihood remain very popular methods for reasoning about data probabilistically. However, it has been observed that they can assign higher likelihoods to out-of-distribution (OOD) data than in-distribution data, thus calling into question the meaning of these likelihood values. In this work we provide a novel perspective on this phenomenon, decomposing the average likelihood into a KL divergence term and an entropy term. We argue that the latter can explain the curious OOD behaviour mentioned above, suppressing likelihood values on datasets with higher entropy. Although our idea is simple, we have not seen it explored yet in the literature. This analysis provides further explanation for the success of OOD detection methods based on likelihood ratios, as the problematic entropy term cancels out in expectation. Finally, we discuss how this…
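The decomposition described in the abstract can be checked numerically on a toy example. For a data distribution P and a model q, the average log-likelihood satisfies E_{x~P}[log q(x)] = -KL(P || q) - H(P), so for a likelihood ratio between two models the H(P) term drops out. The sketch below is only an illustration under stated assumptions: it uses small discrete distributions (the paper concerns deep generative models), and the names `model`, `reference`, `in_dist`, and `ood` are hypothetical stand-ins, not objects from the paper.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(P) in nats of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def kl(p, q):
    """KL(P || Q) in nats; assumes q > 0 wherever p > 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    nz = p > 0
    return np.sum(p[nz] * (np.log(p[nz]) - np.log(q[nz])))

def avg_log_lik(p, q):
    """Average log-likelihood E_{x~P}[log q(x)]."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    nz = p > 0
    return np.sum(p[nz] * np.log(q[nz]))

# Hypothetical toy setup (assumption, not from the paper).
model = np.array([0.4, 0.3, 0.2, 0.1])          # stands in for the trained model
reference = np.array([0.25, 0.25, 0.25, 0.25])  # stands in for a second model
in_dist = np.array([0.4, 0.3, 0.2, 0.1])        # matched exactly by `model`
ood = np.array([0.97, 0.01, 0.01, 0.01])        # low-entropy OOD data

for name, data in [("in-distribution", in_dist), ("OOD", ood)]:
    lhs = avg_log_lik(data, model)
    rhs = -kl(data, model) - entropy(data)
    ratio = avg_log_lik(data, model) - avg_log_lik(data, reference)
    kl_diff = kl(data, reference) - kl(data, model)  # no entropy term left
    print(f"{name}: avg log-lik={lhs:.3f}, -KL-H={rhs:.3f}, "
          f"ratio={ratio:.3f}, KL difference={kl_diff:.3f}")
```

In this toy setup the OOD distribution gets a higher average log-likelihood than the in-distribution data even though its KL divergence to the model is larger, because its entropy is much lower. The likelihood ratio equals a difference of two KL terms; whether that difference separates the datasets depends on the choice of the second model, which this sketch does not address.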
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning
