Holistic Sentence Embeddings for Better Out-of-Distribution Detection

Sishuo Chen; Xiaohan Bi; Rundong Gao; Xu Sun

arXiv:2210.07485 · cs.CL · October 17, 2022 · 1 citation

TL;DR

This paper introduces Avg-Avg, a simple holistic sentence embedding method that averages token representations within each layer of a pretrained language model and then across all layers, improving out-of-distribution detection beyond prior state-of-the-art methods.

Contribution

The paper proposes a novel embedding approach, Avg-Avg, which enhances OOD detection by utilizing token representations from all intermediate layers, preserving linguistic knowledge with minimal additional cost.
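The pooling described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `avg_avg_embedding`, the argument layout, and the padding-mask handling are assumptions, and whether the embedding-layer output counts as one of the layers is left to the caller.

```python
import numpy as np


def avg_avg_embedding(hidden_states, attention_mask):
    """Average tokens within each layer, then average across layers.

    hidden_states: sequence of arrays, one per layer, each of shape
        (batch, seq_len, dim). Which layers to pass in is the caller's
        choice (an assumption of this sketch).
    attention_mask: array of shape (batch, seq_len), 1 for real tokens
        and 0 for padding.
    Returns an array of shape (batch, dim).
    """
    mask = np.asarray(attention_mask, dtype=np.float64)[..., None]
    # Number of real tokens per sentence; clipped to avoid division by zero.
    token_counts = np.clip(mask.sum(axis=1), 1.0, None)
    # First "Avg": mean over non-padding tokens in each layer.
    per_layer = [(np.asarray(h) * mask).sum(axis=1) / token_counts
                 for h in hidden_states]
    # Second "Avg": mean over layers.
    return np.mean(per_layer, axis=0)
```

With Hugging Face Transformers, per-layer hidden states can be requested by passing `output_hidden_states=True` to the model call; the returned tuple also contains the embedding output, so the caller decides whether to drop its first entry before pooling.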

Findings

01

Avg-Avg surpasses the prior state of the art on a comprehensive suite of OOD detection benchmarks by a 9.33% FAR95 margin.

02

Token averaging across layers helps preserve linguistic knowledge.

03

The method benefits background shift detection at negligible extra cost.

Abstract

Detecting out-of-distribution (OOD) instances is significant for the safe deployment of NLP models. Among recent textual OOD detection works based on pretrained language models (PLMs), distance-based methods have shown superior performance. However, they estimate sample distance scores in the last-layer CLS embedding space and thus do not make full use of linguistic information underlying in PLMs. To address the issue, we propose to boost OOD detection by deriving more holistic sentence embeddings. On the basis of the observations that token averaging and layer combination contribute to improving OOD detection, we propose a simple embedding approach named Avg-Avg, which averages all token representations from each intermediate layer as the sentence embedding and significantly surpasses the state-of-the-art on a comprehensive suite of benchmarks by a 9.33% FAR95 margin. Furthermore, our…
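The FAR95 figure quoted in the abstract is a false alarm rate measured at 95% recall, where lower is better. A rough sketch of how such a number can be computed from detector scores follows. It assumes that higher scores mean "more in-distribution" and that in-distribution samples are the positive class; the function name `far95` and the percentile-based threshold are illustrative choices, not taken from the paper's code.

```python
import numpy as np


def far95(id_scores, ood_scores):
    """Fraction of OOD samples accepted when about 95% of ID samples are kept.

    Assumes higher score = more in-distribution (for a distance-based
    detector, the negated distance could serve as the score).
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold at the 5th percentile of ID scores, so roughly 95% of
    # ID samples score at or above it.
    threshold = np.percentile(id_scores, 5)
    # OOD samples at or above the threshold are false alarms.
    return float(np.mean(ood_scores >= threshold))
```

A detector that separates the two sets perfectly yields 0.0 under this definition, and one that accepts every OOD sample yields 1.0.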

Code & Models

Repositories

lancopku/avg-avg
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods