How Good Are LLMs at Out-of-Distribution Detection?

Bo Liu; Liming Zhan; Zexin Lu; Yujie Feng; Lei Xue; Xiao-Ming Wu

arXiv:2308.10261 · cs.CL · April 17, 2024 · 1 citation

TL;DR

This paper empirically evaluates out-of-distribution (OOD) detection methods on large language models (LLMs). It finds that a simple cosine distance detector performs best and attributes this to the isotropic nature of LLM embedding spaces.

Contribution

The paper offers a pioneering empirical study of OOD detection on LLMs (the LLaMA series), showing that a cosine distance detector is effective and analyzing the properties of the embedding space.
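To make the idea of a cosine distance detector concrete, here is a minimal sketch in NumPy. It assumes one common formulation, in which an input's OOD score is one minus its highest cosine similarity to any in-distribution training embedding; this page does not confirm that the paper uses exactly this variant. The function name `cosine_ood_scores` and the toy 2-D embeddings are hypothetical; in practice the embeddings would come from the language model.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length, guarding against zero vectors."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def cosine_ood_scores(train_emb, test_emb):
    """Hypothetical helper: OOD score = 1 - max cosine similarity
    between a test embedding and any in-distribution training embedding.
    Higher scores suggest the input is further from the training data."""
    train = l2_normalize(np.asarray(train_emb, dtype=np.float64))
    test = l2_normalize(np.asarray(test_emb, dtype=np.float64))
    sims = test @ train.T            # (n_test, n_train) cosine similarities
    return 1.0 - sims.max(axis=1)    # distance to the nearest training point

# Toy embeddings (assumed for illustration only).
train_emb = np.array([[1.0, 0.0], [0.9, 0.1]])
test_emb = np.array([[1.0, 0.05],   # close to the training directions
                     [0.0, 1.0]])   # nearly orthogonal to them
scores = cosine_ood_scores(train_emb, test_emb)
```

In this toy case the second test vector receives the larger score, so thresholding `scores` would flag it as OOD. The threshold itself would have to be chosen on held-out data.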

Findings

1. A cosine distance detector outperforms the other OOD detectors evaluated on LLMs.

2. LLM embeddings are more isotropic than those of smaller models.

3. Generative fine-tuning aligns better with OOD detection objectives.
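As an illustration of what "isotropic" means for an embedding space, the sketch below computes the mean pairwise cosine similarity of a set of embeddings, a common proxy for anisotropy. Values near 0 suggest directions are spread evenly, and values near 1 suggest the vectors crowd into a narrow cone. This metric is an assumption made for illustration; the page does not state which isotropy measure the paper uses. The function name `mean_pairwise_cosine` and the synthetic data are hypothetical.

```python
import numpy as np

def mean_pairwise_cosine(emb):
    """Hypothetical helper: average cosine similarity over all distinct
    pairs of rows. Near 0 suggests isotropy; near 1 suggests anisotropy."""
    x = np.asarray(emb, dtype=np.float64)
    x = x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
    sims = x @ x.T
    n = x.shape[0]
    off_diag_sum = sims.sum() - np.trace(sims)  # drop self-similarities
    return off_diag_sum / (n * (n - 1))

# Synthetic embeddings (assumed for illustration only).
rng = np.random.default_rng(0)
isotropic = rng.standard_normal((200, 64))  # directions spread uniformly
anisotropic = isotropic + 5.0               # shared offset creates a narrow cone

iso_score = mean_pairwise_cosine(isotropic)
aniso_score = mean_pairwise_cosine(anisotropic)
```

Here `iso_score` comes out close to zero while `aniso_score` is close to one. In an anisotropic space nearly all pairs look similar, which leaves a cosine-based detector with little contrast between in-distribution and OOD inputs.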

Abstract

Out-of-distribution (OOD) detection plays a vital role in enhancing the reliability of machine learning (ML) models. The emergence of large language models (LLMs) has catalyzed a paradigm shift within the ML community, showcasing their exceptional capabilities across diverse natural language processing tasks. While existing research has probed OOD detection with relatively small-scale Transformers like BERT, RoBERTa, and GPT-2, the stark differences in scale, pre-training objectives, and inference paradigms call into question the applicability of these findings to LLMs. This paper embarks on a pioneering empirical investigation of OOD detection in the domain of LLMs, focusing on the LLaMA series ranging from 7B to 65B in size. We thoroughly evaluate commonly used OOD detectors, scrutinizing their performance in both zero-grad and fine-tuning scenarios. Notably, we alter previous…

Code & Models

Repositories

awenbocc/llm-ood (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Byte Pair Encoding · Adam · Attention Dropout · Linear Layer · Layer Normalization