Self-Supervised Losses for One-Class Textual Anomaly Detection
Kimberly T. Mai, Toby Davies, Lewis D. Griffin

TL;DR
This paper fine-tunes Transformers on inlier data with self-supervised objectives and uses the resulting losses as anomaly scores for one-class textual anomaly detection, outperforming existing methods on both semantic and syntactic anomalies.
Contribution
It proposes using the losses of self-supervised fine-tuning objectives as anomaly scores, which avoids both bespoke architectures that are difficult to tune and supervisory signals in the inliers, and it improves average AUROC over existing methods.
Findings
Self-supervised losses outperform other methods, improving the AUROC score by 11.6% on semantic anomalies and by 22.8% on syntactic anomalies on average.
Performance varies with the type of anomaly and the learned representation.
Different objectives are optimal for semantic versus syntactic anomalies.
Abstract
Current deep learning methods for anomaly detection in text rely on supervisory signals in inliers that may be unobtainable or bespoke architectures that are difficult to tune. We study a simpler alternative: fine-tuning Transformers on the inlier data with self-supervised objectives and using the losses as an anomaly score. Overall, the self-supervision approach outperforms other methods under various anomaly detection scenarios, improving the AUROC score on semantic anomalies by 11.6% and on syntactic anomalies by 22.8% on average. Additionally, the optimal objective and resultant learnt representation depend on the type of downstream anomaly. The separability of anomalies and inliers signals that a representation is more effective for detecting semantic anomalies, whilst the presence of narrow feature directions signals a representation that is effective for detecting syntactic…
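The scoring idea in the abstract (fit a model on inliers only, then treat its loss on a new text as the anomaly score and evaluate with AUROC) can be illustrated with a minimal sketch. This is not the authors' implementation: a toy add-one-smoothed unigram model stands in for the fine-tuned Transformer so the example runs with the standard library alone, and the example sentences and all function names (`train_unigram`, `anomaly_score`, `auroc`) are hypothetical.

```python
import math
from collections import Counter


def train_unigram(inliers):
    """Fit add-one-smoothed token log-probabilities on inlier texts only.

    Assumption: this toy model replaces the fine-tuned Transformer.
    """
    counts = Counter(tok for text in inliers for tok in text.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves probability mass for unseen tokens

    def log_prob(tok):
        return math.log((counts.get(tok, 0) + 1) / (total + vocab))

    return log_prob


def anomaly_score(log_prob, text):
    """Mean per-token negative log-likelihood, i.e. the model's loss on `text`."""
    toks = text.lower().split()
    return -sum(log_prob(t) for t in toks) / max(len(toks), 1)


def auroc(inlier_scores, anomaly_scores):
    """Probability that a random anomaly outscores a random inlier (ties count half)."""
    wins = sum((a > i) + 0.5 * (a == i)
               for a in anomaly_scores for i in inlier_scores)
    return wins / (len(anomaly_scores) * len(inlier_scores))


# Hypothetical data: sports sentences are inliers, finance sentences are anomalies.
train = [
    "the team won the match",
    "the striker scored a late goal",
    "the coach praised the team",
]
test_inliers = ["the team scored a goal", "the coach won the match"]
test_anomalies = ["interest rates rose sharply today",
                  "the bank raised interest rates"]

model = train_unigram(train)
inlier_scores = [anomaly_score(model, t) for t in test_inliers]
anomaly_scores = [anomaly_score(model, t) for t in test_anomalies]
print("AUROC:", auroc(inlier_scores, anomaly_scores))
```

In the setting the abstract describes, `anomaly_score` would instead return the self-supervised loss of a Transformer fine-tuned on the inliers; the AUROC evaluation would stay the same.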
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
