Classical Out-of-Distribution Detection Methods Benchmark in Text Classification Tasks
Mateusz Baran, Joanna Baran, Mateusz Wójcik, Maciej Zięba, Adam Gonczarek

TL;DR
This paper benchmarks eight out-of-distribution detection methods in NLP, revealing their limitations in identifying diverse distributional shifts and emphasizing the need for more sensitive approaches.
Contribution
It provides a comprehensive evaluation framework for existing OOD detection methods in NLP and highlights their current shortcomings across various challenging scenarios.
Findings
Existing methods are not sufficiently sensitive to all distributional shifts.
Background shift and word order shuffling pose significant detection challenges.
The study offers a reproducible research environment for future OOD detection research.
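One of the shifts named above, word order shuffling, can be illustrated with a minimal sketch. This is not the paper's procedure: the function name `shuffle_word_order`, the whitespace tokenization, the seeding scheme, and the example sentences are all assumptions made for illustration.

```python
import random


def shuffle_word_order(text: str, seed: int = 0) -> str:
    """Return `text` with its whitespace-separated tokens randomly permuted.

    Hypothetical helper: the vocabulary of the sample is preserved,
    only the order of the tokens changes.
    """
    tokens = text.split()
    rng = random.Random(seed)  # local RNG so results are repeatable
    rng.shuffle(tokens)
    return " ".join(tokens)


# Made-up in-distribution sentences, used only to demonstrate the helper.
in_distribution = [
    "the movie was surprisingly good",
    "service was slow but friendly",
]

# Shifted counterparts: same words, permuted order.
shuffled = [shuffle_word_order(t, seed=i) for i, t in enumerate(in_distribution)]

for original, shifted in zip(in_distribution, shuffled):
    print(f"{original!r} -> {shifted!r}")
```

Because the shuffled samples keep exactly the same words as the originals, a detector that relies mostly on vocabulary has little signal to separate them, which is one plausible reason this shift is hard to detect.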
Abstract
State-of-the-art models can perform well in controlled environments, but they often struggle when presented with out-of-distribution (OOD) examples, making OOD detection a critical component of NLP systems. In this paper, we focus on highlighting the limitations of existing approaches to OOD detection in NLP. Specifically, we evaluated eight OOD detection methods that are easily integrable into existing NLP systems and require no additional OOD data or model modifications. One of our contributions is providing a well-structured research environment that allows for full reproducibility of the results. Additionally, our analysis shows that existing OOD detection methods for NLP tasks are not yet sufficiently sensitive to capture all samples characterized by various types of distributional shifts. Particularly challenging testing scenarios arise in cases of background shift and randomly…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
