Unsupervised Instance Selection with Low-Label, Supervised Learning for Outlier Detection
Trent J. Bradberry, Christopher H. Hase, LeAnna Kent, Joel A. Góngora

TL;DR
This paper introduces UNISEL, an unsupervised instance selection method that, combined with supervised learning, detects outliers with minimal labeled data and outperforms traditional active learning in certain scenarios.
Contribution
The paper proposes UNISEL, an unsupervised instance selection technique that enhances outlier detection efficiency, especially under low-label conditions, and demonstrates its effectiveness compared to active learning.
Findings
UNISEL performs comparably to active learning with fewer labels.
Combining UNISEL with active learning yields superior outlier detection results.
UNISEL offers practical time savings and better generalizability.
Abstract
The laborious process of labeling data often bottlenecks projects that aim to leverage the power of supervised machine learning. Active Learning (AL) has been established as a technique to ameliorate this condition through an iterative framework that queries a human annotator for labels of instances with the most uncertain class assignment. Via this mechanism, AL produces a binary classifier trained on less labeled data but with little, if any, loss in predictive performance. Despite its advantages, AL can have difficulty with class-imbalanced datasets and results in an inefficient labeling process. To address these drawbacks, we investigate our unsupervised instance selection (UNISEL) technique followed by a Random Forest (RF) classifier on 10 outlier detection datasets under low-label conditions. These results are compared to AL performed on the same datasets. Further, we investigate…
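The query step that the abstract attributes to AL (asking the annotator about the instances with the most uncertain class assignment) can be sketched for the binary case as below. This is only an illustration of generic uncertainty sampling, not the authors' code and not UNISEL itself; the function name `most_uncertain` and the example probabilities are hypothetical.

```python
def most_uncertain(probabilities, k):
    """Return indices of the k instances whose predicted positive-class
    probability is closest to 0.5 (the binary decision boundary)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Distance from 0.5: near 0.0 means most uncertain, 0.5 means fully confident.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]


# Hypothetical classifier outputs for six unlabeled instances (assumed values).
pool_probs = [0.02, 0.48, 0.91, 0.55, 0.10, 0.70]
query = most_uncertain(pool_probs, k=2)
print(query)  # [1, 3]
```

In a full AL loop, the returned indices would go to the annotator, the new labels would be added to the training set, and the classifier would be retrained before the next query round.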
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
