Holistic Sentence Embeddings for Better Out-of-Distribution Detection
Sishuo Chen, Xiaohan Bi, Rundong Gao, Xu Sun

TL;DR
This paper introduces Avg-Avg, a simple holistic sentence embedding method for out-of-distribution (OOD) detection. It averages the token representations from every intermediate layer of a pretrained language model and surpasses the previous state of the art by a 9.33% FAR95 margin.
Contribution
The paper proposes Avg-Avg, an embedding approach that improves OOD detection by using token representations from all intermediate layers rather than only the last-layer CLS embedding, preserving linguistic knowledge at minimal additional cost.
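As a rough illustration of the idea described here, the sketch below averages token vectors within each layer and then averages the resulting layer vectors. The function name `avg_avg_embedding`, the nested-list input layout, and the use of an attention mask to skip padding tokens are assumptions made for this example, not details taken from the paper; which layers to pass in is left to the caller. In practice the same computation would normally be done with tensor operations on a model's hidden states.

```python
from typing import List, Sequence


def avg_avg_embedding(
    hidden_states: Sequence[Sequence[Sequence[float]]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Average over tokens within each layer, then average over layers.

    hidden_states: [num_layers][seq_len][hidden_dim] (hypothetical layout).
    attention_mask: [seq_len], 1 for real tokens and 0 for padding (assumption).
    """
    kept = [i for i, m in enumerate(attention_mask) if m]
    if not hidden_states or not kept:
        raise ValueError("need at least one layer and one unmasked token")

    dim = len(hidden_states[0][0])

    # Step 1: one embedding per layer, the mean of its unmasked token vectors.
    layer_embeddings = [
        [sum(layer[i][d] for i in kept) / len(kept) for d in range(dim)]
        for layer in hidden_states
    ]

    # Step 2: the sentence embedding is the mean of the per-layer embeddings.
    num_layers = len(layer_embeddings)
    return [
        sum(emb[d] for emb in layer_embeddings) / num_layers for d in range(dim)
    ]
```

Because every layer sees the same tokens, averaging per layer and then across layers gives the same vector as a single mean over all unmasked token representations from all layers.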
Findings
Avg-Avg surpasses the state of the art on a comprehensive suite of OOD detection benchmarks by a 9.33% FAR95 margin.
Averaging token representations across layers helps preserve linguistic knowledge.
The method benefits background shift detection at negligible extra cost.
Abstract
Detecting out-of-distribution (OOD) instances is important for the safe deployment of NLP models. Among recent textual OOD detection works based on pretrained language models (PLMs), distance-based methods have shown superior performance. However, they estimate sample distance scores in the last-layer CLS embedding space and thus do not make full use of the linguistic information encoded in PLMs. To address this issue, we propose to boost OOD detection by deriving more holistic sentence embeddings. Based on the observations that token averaging and layer combination both improve OOD detection, we propose a simple embedding approach named Avg-Avg, which averages all token representations from each intermediate layer as the sentence embedding and significantly surpasses the state of the art on a comprehensive suite of benchmarks by a 9.33% FAR95 margin. Furthermore, our…
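The headline result is reported in FAR95. As commonly defined in the OOD detection literature, this is the false alarm rate at 95% recall, where lower is better. The sketch below assumes one particular convention: a higher score means "more in-distribution", and the threshold is set so that at least 95% of in-distribution (ID) samples are accepted. Some papers use the opposite polarity, so the function `far95` and its arguments are illustrative only and are not the authors' evaluation code.

```python
from typing import Sequence


def far95(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Fraction of OOD samples accepted when 95% of ID samples are accepted.

    Assumes higher score = more in-distribution (a convention chosen here).
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")

    ranked = sorted(id_scores, reverse=True)

    # Smallest count of top-ranked ID samples covering at least 95% of them
    # (integer ceiling division avoids floating-point rounding surprises).
    k = (95 * len(ranked) + 99) // 100
    threshold = ranked[k - 1]

    # An OOD sample scoring at or above the threshold is a false alarm.
    false_alarms = sum(1 for s in ood_scores if s >= threshold)
    return false_alarms / len(ood_scores)
```

With this convention, a detector that cleanly separates the two score distributions yields a value near 0, while one that cannot tell them apart yields a value near 0.95.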
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
