Text-ADBench: Text Anomaly Detection Benchmark based on LLMs Embedding

Feng Xiao; Jicong Fan

arXiv:2507.12295 · cs.CL · July 17, 2025

TL;DR

This paper introduces a comprehensive benchmark for text anomaly detection using embeddings from various pre-trained language models, revealing insights on embedding quality and model evaluation strategies.

Contribution

It provides the first standardized benchmark for text anomaly detection with diverse LLM embeddings, enabling rigorous comparison and fostering future research.

Findings

01 Embedding quality critically affects detection performance.

02 When LLM embeddings are used as input, deep learning models do not outperform shallow algorithms.

03 Performance matrices exhibit low-rank structure, which enables efficient model evaluation.
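
To make the low-rank finding concrete, the following is a minimal sketch of how low-rank structure can reduce evaluation cost: if a performance matrix is close to rank 1, a missing entry can be estimated from the observed ones instead of being measured. The function name `rank1_complete`, the matrix layout, and all numbers are hypothetical illustrations, not the paper's procedure or results.

```python
def rank1_complete(matrix, iterations=200):
    """Fit matrix[i][j] ~ u[i] * v[j] on observed entries (None marks a missing entry).

    Assumes every row and every column has at least one observed, nonzero entry.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(iterations):
        # Update each row factor by least squares over that row's observed entries.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if matrix[i][j] is not None]
            u[i] = sum(matrix[i][j] * v[j] for j in obs) / sum(v[j] ** 2 for j in obs)
        # Update each column factor by least squares over that column's observed entries.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if matrix[i][j] is not None]
            v[j] = sum(matrix[i][j] * u[i] for i in obs) / sum(u[i] ** 2 for i in obs)
    return [[u[i] * v[j] for j in range(n_cols)] for i in range(n_rows)]


# Synthetic, exactly rank-1 "performance matrix" (hypothetical values).
row_factor = [0.9, 0.8, 0.7]
col_factor = [1.0, 0.9, 0.8, 0.6]
full = [[r * c for c in col_factor] for r in row_factor]

# Hide one entry, as if that combination had never been evaluated.
observed = [row[:] for row in full]
observed[1][2] = None

completed = rank1_complete(observed)
print(f"predicted {completed[1][2]:.4f}, true {full[1][2]:.4f}")
```

Real performance matrices are only approximately low-rank, so in practice a higher rank with regularization would likely be needed, and predictions would carry error.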

Abstract

Text anomaly detection is a critical task in natural language processing (NLP), with applications such as fraud detection, misinformation identification, spam detection, and content moderation. Despite significant advances in large language models (LLMs) and anomaly detection algorithms, the absence of standardized, comprehensive benchmarks for evaluating existing anomaly detection methods on text data limits rigorous comparison and the development of innovative approaches. This work performs a comprehensive empirical study and introduces a benchmark for text anomaly detection, leveraging embeddings from diverse pre-trained language models across a wide array of text datasets. Our work systematically evaluates the effectiveness of embedding-based text anomaly detection by incorporating (1) early language models (GloVe, BERT); (2) multiple LLMs (LLaMA-2, LLaMA-3, Mistral, OpenAI…
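
The embedding-based setup described in the abstract can be illustrated with a minimal sketch: texts are mapped to vectors, and a shallow detector scores each test vector by how far it lies from normal training vectors. The example below uses a k-nearest-neighbor distance score on tiny hand-made vectors. The function name `knn_anomaly_scores` and the toy 3-dimensional "embeddings" are assumptions for illustration; real embeddings would come from a language model and have hundreds or thousands of dimensions.

```python
import math


def knn_anomaly_scores(train_embeddings, test_embeddings, k=2):
    """Score each test vector by its Euclidean distance to its k-th nearest training vector."""
    if not 1 <= k <= len(train_embeddings):
        raise ValueError("k must be between 1 and the number of training vectors")
    scores = []
    for query in test_embeddings:
        distances = sorted(math.dist(query, ref) for ref in train_embeddings)
        scores.append(distances[k - 1])  # larger distance = more anomalous
    return scores


# Hypothetical stand-ins for embeddings of normal training texts.
normal_train = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]

# One test vector near the normal cluster, one far from it.
test_points = [[0.05, 0.05, 0.05], [2.0, 2.0, 2.0]]

scores = knn_anomaly_scores(normal_train, test_points, k=2)
print(scores)
```

The second test vector receives the higher score because it lies far from every training vector. Choosing a decision threshold on these scores is a separate step that this sketch leaves out.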

Code & Models

Datasets

Feng-001/Text-ADBench (dataset · 50 downloads)

Taxonomy

Topics: Network Security and Intrusion Detection · Topic Modeling · Advanced Malware Detection Techniques