Adapting Feature Attenuation to NLP
Tianshuo Yang, Ryan Rabinowitz, Terrance E. Boult, Jugal Kalita

TL;DR
This paper explores adapting feature attenuation methods from computer vision to NLP transformers for open-set recognition, benchmarking their effectiveness against existing methods on arXiv classification tasks.
Contribution
The paper adapts the COSTARR framework to NLP transformers and compares its performance with other state-of-the-art OSR scores, highlighting current limitations.
Findings
COSTARR can be applied to NLP without retraining.
COSTARR shows no statistically significant gain over MaxLogit or MSP.
The free-energy score underperforms in high-class-count scenarios.
Abstract
Transformer classifiers such as BERT deliver impressive closed-set accuracy, yet they remain brittle when confronted with inputs from unseen categories--a common scenario for deployed NLP systems. We investigate Open-Set Recognition (OSR) for text by porting the feature attenuation hypothesis from computer vision to transformers and by benchmarking it against state-of-the-art baselines. Concretely, we adapt the COSTARR framework--originally designed for classification in computer vision--to two modest language models (BERT (base) and GPT-2) trained to label 176 arXiv subject areas. Alongside COSTARR, we evaluate Maximum Softmax Probability (MSP), MaxLogit, and the temperature-scaled free-energy score under the OOSA and AUOSCR metrics. Our results show (i) COSTARR extends to NLP without retraining but yields no statistically significant gain over MaxLogit or MSP, and (ii) free-energy…
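The three baseline scores named in the abstract can all be computed from a classifier's output logits. The sketch below is a minimal illustration using their standard definitions, not the authors' code: the function names, the example logits, and the thresholding helper are assumptions made for this example, and the free-energy score is written in its common form as the negative energy, T * logsumexp(logits / T), so that higher values indicate a known class for all three scores. COSTARR itself is not sketched here because the excerpt does not describe its computation.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def msp_score(logits):
    """Maximum Softmax Probability: the largest softmax output."""
    return math.exp(max(logits) - logsumexp(logits))

def max_logit_score(logits):
    """MaxLogit: the largest raw (pre-softmax) logit."""
    return max(logits)

def free_energy_score(logits, temperature=1.0):
    """Temperature-scaled negative free energy: T * logsumexp(logits / T)."""
    scaled = [z / temperature for z in logits]
    return temperature * logsumexp(scaled)

def is_known(score, threshold):
    """Hypothetical decision rule: accept as a known class if score >= threshold."""
    return score >= threshold

# Made-up logits for illustration only (4 classes rather than 176).
confident = [8.0, 0.5, -1.0, 0.0]   # one class clearly dominates
uncertain = [1.0, 0.9, 1.1, 0.8]    # no class stands out

for name, fn in [("MSP", msp_score),
                 ("MaxLogit", max_logit_score),
                 ("FreeEnergy", free_energy_score)]:
    print(name, fn(confident), fn(uncertain))
```

In practice the threshold passed to `is_known` would be chosen on held-out data; the value is not specified in the excerpt above.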
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
