AnomalyMatch: Discovering Rare Objects of Interest with Semi-supervised and Active Learning
Pablo Gómez, Laslo E. Ruhberg, Maria Teresa Nardone, David O'Ryan

TL;DR
AnomalyMatch is a scalable semi-supervised and active learning framework for anomaly detection in large datasets, effectively identifying rare objects with minimal labelled data, demonstrated on astronomical and natural image datasets.
Contribution
This paper introduces AnomalyMatch, a novel anomaly detection framework combining semi-supervised FixMatch and active learning tailored for large-scale, label-scarce datasets.
Findings
Achieves high AUROC and AUPRC with minimal labelled anomalies
Effective active learning cycles improve anomaly ranking precision
Performs comparably to established software on galaxy datasets
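For readers unfamiliar with the two metrics named in the findings, the following is a minimal NumPy sketch of how AUROC and AUPRC can be computed for a binary anomaly ranking. It is illustrative only and not the paper's evaluation code: the function names `auroc` and `average_precision` are hypothetical, AUPRC is approximated here by average precision, and the pairwise AUROC computation is only suitable for small inputs.

```python
import numpy as np

def auroc(labels, scores):
    """Probability that a random anomaly outscores a random normal sample (ties count half)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]  # scores of true anomalies
    neg = scores[labels == 0]  # scores of normal samples
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of precision@k taken at the rank k of each true anomaly."""
    labels = np.asarray(labels)
    # Sort samples from highest to lowest anomaly score.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return (precision_at_k * hits).sum() / hits.sum()
```

Under severe class imbalance, average precision is usually the more demanding of the two, because it is dominated by how many normal samples rank above the few true anomalies.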
Abstract
Anomaly detection in large datasets is essential in astronomy and computer vision. However, due to a scarcity of labelled data, it is often infeasible to apply supervised methods to anomaly detection. We present AnomalyMatch, an anomaly detection framework combining the semi-supervised FixMatch algorithm using EfficientNet classifiers with active learning. AnomalyMatch is tailored for large-scale applications and integrated into the ESA Datalabs science platform. In this method, we treat anomaly detection as a binary classification problem and efficiently utilise limited labelled and abundant unlabelled images for training. We enable active learning via a user interface for verification of high-confidence anomalies and correction of false positives. Evaluations on the GalaxyMNIST astronomical dataset and the miniImageNet natural-image benchmark under severe class imbalance display…
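The FixMatch objective that the abstract refers to can be sketched for the binary (normal vs. anomaly) case as follows. This is a minimal NumPy illustration of the general FixMatch idea, not AnomalyMatch's implementation: the function names, the confidence threshold of 0.95, and the unlabelled-loss weight of 1.0 are assumptions chosen for illustration, and the logits are assumed to come from some classifier applied to weakly and strongly augmented views of the same unlabelled images.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, targets):
    # Per-sample cross-entropy against integer class targets.
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(targets)), targets] + 1e-12)

def fixmatch_loss(logits_labelled, labels, logits_weak, logits_strong,
                  threshold=0.95, lambda_u=1.0):
    """Supervised loss plus a pseudo-label consistency loss (illustrative values)."""
    # Supervised term on the few labelled images.
    supervised = cross_entropy(logits_labelled, labels).mean()

    # Pseudo-labels come from the weakly augmented view...
    probs_weak = softmax(logits_weak)
    pseudo_labels = probs_weak.argmax(axis=1)
    # ...and are kept only where the prediction is confident enough.
    mask = probs_weak.max(axis=1) >= threshold

    # The strongly augmented view is trained to match the retained pseudo-labels.
    unsupervised = (cross_entropy(logits_strong, pseudo_labels) * mask).mean()
    return supervised + lambda_u * unsupervised, mask
```

In this sketch, unlabelled images whose weak-view prediction falls below the threshold contribute nothing to the loss, which is how FixMatch limits the influence of unreliable pseudo-labels.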
Taxonomy
Topics: Machine Learning and Algorithms · Anomaly Detection Techniques and Applications · Algorithms and Data Compression
Methods: Pointwise Convolution · Depthwise Convolution · Depthwise Separable Convolution · Sigmoid Activation · Convolution · Batch Normalization · RMSProp · Dense Connections · Squeeze-and-Excitation Block
