Learning Representations for Outlier Detection on a Budget
Barbora Micenková, Brian McWilliams, Ira Assent

TL;DR
This paper introduces BORE (Bagged Outlier Representation Ensemble), an ensemble method that feeds unsupervised outlier scores into a supervised learner to detect rare outliers, including under prediction-time cost budgets.
Contribution
BORE uses unsupervised outlier scoring functions (OSFs) as features in a supervised model, and adapts to class imbalance in the labeled data and to prediction-time limits on computational cost.
Findings
BORE outperforms competing methods on 12 real-world datasets.
BORE adapts to different outlier scoring functions and budget constraints.
BORE performs well in both non-budgeted and budgeted scenarios.
Abstract
The problem of detecting a small number of outliers in a large dataset is an important task in many fields from fraud detection to high-energy physics. Two approaches have emerged to tackle this problem: unsupervised and supervised. Supervised approaches require a sufficient amount of labeled data and are challenged by novel types of outliers and inherent class imbalance, whereas unsupervised methods do not take advantage of available labeled training examples and often exhibit poorer predictive performance. We propose BORE (a Bagged Outlier Representation Ensemble) which uses unsupervised outlier scoring functions (OSFs) as features in a supervised learning framework. BORE is able to adapt to arbitrary OSF feature representations, to the imbalance in labeled data as well as to prediction-time constraints on computational cost. We demonstrate the good performance of BORE compared to a…
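The core idea in the abstract, unsupervised outlier scores used as features for a supervised learner inside a bagged ensemble, can be illustrated with a minimal sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: the k-nearest-neighbour distance as the OSF, the logistic-regression base learner, the balanced subsampling of inliers in each bag, the toy data, and all function names. Budgeted prediction is not covered.

```python
import math
import random


def osf_features(x, reference, ks=(3, 5)):
    """Hypothetical OSF representation: distance from x to its k-th
    nearest neighbour in `reference`, one feature per k."""
    dists = sorted(math.dist(x, r) for r in reference if r is not x)
    return [dists[k - 1] for k in ks]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=200, lr=0.1):
    """Plain SGD logistic regression; the bias is the last weight."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        for row, y in zip(rows, labels):
            err = sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, row))) - y
            for j, xj in enumerate(row):
                w[j] -= lr * err * xj
            w[-1] -= lr * err
    return w


def fit_bagged_ensemble(points, labels, n_models=10, seed=0):
    """Train each ensemble member on all labeled outliers plus an
    equally sized random subsample of inliers (an assumed way to
    handle class imbalance)."""
    rng = random.Random(seed)
    feats = [osf_features(p, points) for p in points]
    outliers = [f for f, y in zip(feats, labels) if y == 1]
    inliers = [f for f, y in zip(feats, labels) if y == 0]
    models = []
    for _ in range(n_models):
        sample = rng.sample(inliers, min(len(outliers), len(inliers)))
        rows = outliers + sample
        ys = [1] * len(outliers) + [0] * len(sample)
        models.append(train_logistic(rows, ys))
    return models


def predict_score(x, reference, models):
    """Average the members' predicted outlier probabilities for x."""
    f = osf_features(x, reference)
    probs = [sigmoid(w[-1] + sum(wi * fi for wi, fi in zip(w, f))) for w in models]
    return sum(probs) / len(probs)


def make_toy_data(seed=0):
    """200 Gaussian inliers and 10 outliers scattered on a distant ring."""
    rng = random.Random(seed)
    inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    outliers = []
    for _ in range(10):
        angle, radius = rng.uniform(0, 2 * math.pi), rng.uniform(5, 8)
        outliers.append((radius * math.cos(angle), radius * math.sin(angle)))
    return inliers + outliers, [0] * 200 + [1] * 10


points, labels = make_toy_data()
models = fit_bagged_ensemble(points, labels)
score_in = predict_score((0.0, 0.0), points, models)     # point at the inlier centre
score_out = predict_score((10.0, 10.0), points, models)  # point far from all data
```

In this sketch the supervised learner never sees the raw coordinates, only the outlier scores, so swapping in a different OSF changes the representation without touching the ensemble code.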
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques · Advanced Statistical Methods and Models
