Toward Unsupervised Outlier Model Selection
Yue Zhao, Sean Zhang, Leman Akoglu

TL;DR
ELECT is a novel meta-learning approach for unsupervised outlier model selection that uses a performance-based dataset similarity measure to effectively choose outlier detection algorithms without labels, outperforming existing methods.
Contribution
The paper introduces ELECT, a new meta-learning method for unsupervised outlier model selection that employs a performance-based similarity measure and adaptive search, addressing a key gap in the field.
Findings
ELECT significantly outperforms baseline unsupervised outlier model selection (UOMS) methods.
It effectively adapts to varying time budgets.
It leverages historical dataset performance for model selection.
Abstract
Today there exists no shortage of outlier detection algorithms in the literature, yet the complementary and critical problem of unsupervised outlier model selection (UOMS) is vastly understudied. In this work we propose ELECT, a new approach to select an effective candidate model, i.e., an outlier detection algorithm and its hyperparameter(s), to employ on a new dataset without any labels. At its core, ELECT is based on meta-learning: it transfers prior knowledge (e.g., model performance) from historical datasets that are similar to the new one to facilitate UOMS. Uniquely, it employs a performance-based dataset similarity measure, which is more direct and goal-driven than other measures used in the past. ELECT adaptively searches for similar historical datasets; as such, it can serve an output on demand and accommodate varying time budgets. Extensive experiments show…
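The idea of performance-based similarity can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how similarity is computed on a dataset without labels, so the sketch assumes a hypothetical label-free proxy score (`proxy_scores`) is available for a subset of candidate models on the new dataset. The names `hist_perf`, `proxy_scores`, `evaluated`, and `select_model` are illustrative assumptions. Two datasets count as similar when they rank the same models in a similar order.

```python
import numpy as np

def rank(v):
    # Ordinal ranks (ties broken arbitrarily); adequate for a sketch.
    return np.argsort(np.argsort(v)).astype(float)

def spearman(a, b):
    # Rank correlation between two score vectors over the same models.
    ra, rb = rank(a), rank(b)
    ra -= ra.mean()
    rb -= rb.mean()
    denom = np.sqrt((ra ** 2).sum() * (rb ** 2).sum())
    return float((ra * rb).sum() / denom) if denom > 0 else 0.0

def select_model(hist_perf, proxy_scores, evaluated, k=1):
    """Pick a model for a new, unlabeled dataset.

    hist_perf:    (n_datasets, n_models) performance on historical datasets.
    proxy_scores: ASSUMED label-free proxy scores of the evaluated models
                  on the new dataset (hypothetical; not from the abstract).
    evaluated:    indices of the models evaluated so far on the new dataset.
    """
    # Performance-based similarity: compare how each historical dataset
    # ranks the evaluated models against the proxy ranking on the new one.
    sims = np.array([spearman(row[evaluated], proxy_scores) for row in hist_perf])
    neighbors = np.argsort(-sims)[:k]
    # Transfer: choose the model that performs best on the nearest datasets.
    best = int(np.argmax(hist_perf[neighbors].mean(axis=0)))
    return best, neighbors

# Toy data: 3 historical datasets, 4 candidate models (made-up numbers).
hist_perf = np.array([
    [0.90, 0.50, 0.30, 0.20],
    [0.85, 0.55, 0.35, 0.10],
    [0.20, 0.30, 0.60, 0.95],
])
evaluated = [0, 1, 2]                      # only three models run so far
proxy_scores = np.array([0.1, 0.4, 0.7])   # hypothetical proxy values
best, neighbors = select_model(hist_perf, proxy_scores, evaluated, k=1)
```

In this toy setup, the new dataset ranks the evaluated models the same way the third historical dataset does, so the sketch recommends the fourth model, even though that model was never run on the new dataset. A time budget could be reflected by how many models appear in `evaluated`: calling `select_model` after each additional evaluation gives a current best answer at any point.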
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Water Systems and Optimization · Machine Learning and Data Classification
