AP-OOD: Attention Pooling for Out-of-Distribution Detection

Claus Hofmann; Christian Huber; Bernhard Lehner; Daniel Klotz; Sepp Hochreiter; Werner Zellinger

arXiv:2602.06031 · cs.LG · February 6, 2026

TL;DR

AP-OOD is a token-level aggregation method for out-of-distribution (OOD) detection on natural-language text. It outperforms previous methods in the unsupervised setting and can also make use of limited auxiliary outlier data when it is available.

Contribution

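The paper presents an attention pooling technique for OOD detection that aggregates token embeddings using token-level information rather than a simple average. The method is semi-supervised: it interpolates between the unsupervised and supervised settings, so it can use limited auxiliary outlier data.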
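To make the contrast between average-based aggregation and attention-based aggregation concrete, here is a minimal sketch of generic single-query softmax attention pooling over token embeddings. It is not the AP-OOD formulation itself, which this page does not spell out; the names `query` and `beta` are hypothetical and chosen only for illustration.

```python
import math


def mean_pool(tokens):
    """Average-based aggregation: every token gets the same weight."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


def attention_pool(tokens, query, beta=1.0):
    """Generic softmax attention pooling (illustrative, not the paper's method).

    tokens: list of T token embeddings, each a list of d floats.
    query:  hypothetical query vector of d floats (learned in practice).
    beta:   hypothetical inverse temperature that sharpens the weights.
    """
    # One scalar logit per token: scaled dot product with the query.
    logits = [beta * sum(t * q for t, q in zip(token, query)) for token in tokens]
    # Subtract the maximum logit for numerical stability before exponentiating.
    top = max(logits)
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of token embeddings.
    dim = len(query)
    pooled = [sum(w * token[j] for w, token in zip(weights, tokens)) for j in range(dim)]
    return pooled, weights
```

With a zero query every token receives the same weight, so the result coincides with mean pooling. A non-zero query shifts weight toward the tokens that align with it, which is one way token-level information can influence the aggregated representation.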

Findings

01

Achieves state-of-the-art OOD detection performance on text tasks.

02

In the unsupervised setting, reduces FPR95 from 27.84% to 4.67% on XSUM summarization.

03

In the unsupervised setting, reduces FPR95 from 77.08% to 70.37% on WMT15 En-Fr.

Abstract

Out-of-distribution (OOD) detection, which maps high-dimensional data into a scalar OOD score, is critical for the reliable deployment of machine learning models. A key challenge in recent research is how to effectively leverage and aggregate token embeddings from language models to obtain the OOD score. In this work, we propose AP-OOD, a novel OOD detection method for natural language that goes beyond simple average-based aggregation by exploiting token-level information. AP-OOD is a semi-supervised approach that flexibly interpolates between unsupervised and supervised settings, enabling the use of limited auxiliary outlier data. Empirically, AP-OOD sets a new state of the art in OOD detection for text: in the unsupervised setting, it reduces the FPR95 (false positive rate at 95% true positives) from 27.84% to 4.67% on XSUM summarization, and from 77.08% to 70.37% on WMT15 En-Fr…
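The FPR95 metric quoted above can be sketched in a few lines. This is an illustrative computation under stated assumptions, not the paper's evaluation code: a higher score is assumed to mean "more OOD", in-distribution samples are treated as the positive class, and the function name `fpr_at_95_tpr` is hypothetical.

```python
import math


def fpr_at_95_tpr(id_scores, ood_scores, tpr=0.95):
    """False positive rate at a fixed true positive rate.

    Assumptions (not taken from the paper): higher score = more OOD,
    and in-distribution (ID) samples are the positive class.
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(id_scores)
    # Smallest threshold that accepts at least `tpr` of the ID samples.
    k = math.ceil(tpr * len(ranked))
    threshold = ranked[k - 1]
    # A false positive is an OOD sample that is still accepted as ID.
    accepted_ood = sum(1 for score in ood_scores if score <= threshold)
    return accepted_ood / len(ood_scores)
```

For example, `fpr_at_95_tpr(list(range(100)), [50, 94, 95, 200])` sets the threshold at 94, accepts two of the four OOD samples, and returns 0.5. Lower values are better: an FPR95 of 4.67% means that fewer than 5 in 100 OOD samples pass as in-distribution while 95% of in-distribution samples are retained.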

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning