NLP-ADBench: NLP Anomaly Detection Benchmark
Yuangang Li, Jiaqi Li, Zhuo Xiao, Tiankai Yang, Yi Nian, Xiyang Hu, Yue Zhao

TL;DR
NLP-ADBench is a benchmark for NLP anomaly detection covering eight datasets and 19 algorithms. It finds that two-step methods built on transformer embeddings, particularly OpenAI embeddings, perform best overall.
Contribution
This paper introduces NLP-ADBench, described by the authors as the most comprehensive benchmark for NLP anomaly detection to date. It comprises eight curated datasets, 19 algorithms, and an empirical analysis of model performance.
Findings
No single model dominates across all datasets.
Two-step methods with transformer embeddings outperform end-to-end models.
OpenAI embeddings outperform BERT in anomaly detection tasks.
Abstract
Anomaly detection (AD) is an important machine learning task with applications in fraud detection, content moderation, and user behavior analysis. However, AD is relatively understudied in a natural language processing (NLP) context, limiting its effectiveness in detecting harmful content, phishing attempts, and spam reviews. We introduce NLP-ADBench, the most comprehensive NLP anomaly detection (NLP-AD) benchmark to date, which includes eight curated datasets and 19 state-of-the-art algorithms. These span 3 end-to-end methods and 16 two-step approaches that adapt classical, non-language AD methods to language embeddings from BERT and OpenAI. Our empirical results show that no single model dominates across all datasets, indicating a need for automated model selection. Moreover, two-step methods with transformer-based embeddings consistently outperform specialized end-to-end approaches, with…
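To make the two-step idea in the abstract concrete, here is a minimal sketch, not the benchmark's own code. Step 1 turns each text into a vector and step 2 scores that vector with a classical detector. Everything named here is an assumption for illustration: `embed` is a hypothetical hashed bag-of-words stand-in for a BERT or OpenAI embedding call, and `knn_score` (mean distance to the k nearest training vectors) is just one simple example of a classical detector. The summary above does not list which detectors the benchmark uses.

```python
import hashlib
import math

DIM = 256  # toy embedding size (assumption); real BERT/OpenAI vectors are larger


def embed(text, dim=DIM):
    """Step 1 (stand-in): hashed bag-of-words vector, L2-normalized.

    Hypothetical placeholder for a BERT or OpenAI embedding call.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def knn_score(query, train_vectors, k=2):
    """Step 2: anomaly score = mean Euclidean distance to the k nearest training vectors."""
    dists = sorted(math.dist(query, t) for t in train_vectors)
    return sum(dists[:k]) / k


# Fit on texts assumed to be "normal" (toy data, not from the benchmark).
normal_texts = [
    "the team won the match last night",
    "the team lost the match last week",
    "the coach praised the team after the match",
    "fans cheered as the team won the final",
]
train_vectors = [embed(t) for t in normal_texts]

# Higher score means farther from the normal data, i.e. more anomalous.
in_domain = knn_score(embed("the team won the final match"), train_vectors)
out_of_domain = knn_score(embed("claim your free prize now click here"), train_vectors)
print(f"in-domain score: {in_domain:.3f}, out-of-domain score: {out_of_domain:.3f}")
```

Because the detector only sees vectors, swapping in a different embedding model or a different classical detector changes one function each, which is the property that lets a benchmark cross many detectors with several embedding sources.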
Taxonomy
Topics: Anomaly Detection Techniques and Applications
Methods: Attention Dropout · Linear Layer · Softmax · Multi-Head Attention · Weight Decay · WordPiece · Linear Warmup With Linear Decay · Dropout · Dense Connections
