Sparse, self-organizing ensembles of local kernels detect rare statistical anomalies
Gaia Grosso, Sai Sumedh R. Hindupur, Thomas Fel, Samuel Bright-Thonney, Philip Harris, Demba Ba

TL;DR
This paper introduces SparKer, a sparse ensemble of local kernels designed for anomaly detection in high-dimensional data, emphasizing self-organization, efficiency, and interpretability to identify rare anomalies effectively.
Contribution
The paper proposes a novel self-organizing kernel ensemble method, SparKer, that adaptively models likelihood ratios for anomaly detection with theoretical insights and practical validation.
Findings
Ensembles with few kernels detect anomalies in thousands of dimensions.
SparKer effectively identifies statistically significant anomalies.
The approach is scalable and interpretable across scientific domains.
Abstract
Modern artificial intelligence has revolutionized our ability to extract rich and versatile data representations across scientific disciplines. Yet, the statistical properties of these representations remain poorly controlled, causing misspecified anomaly detection (AD) methods to falter. Weak or rare signals can remain hidden within the apparent regularity of normal data, creating a gap in our ability to detect and interpret anomalies. We examine this gap and identify a set of structural desiderata for detection methods operating under minimal prior information: sparsity, to enforce parsimony; locality, to preserve geometric sensitivity; and competition, to promote efficient allocation of model capacity. These principles define a class of self-organizing local kernels that adaptively partition the representation space around regions of statistical imbalance. As an instantiation of…
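To make the sparsity and locality desiderata concrete, here is a minimal sketch of how a small ensemble of local kernels might score points in a high-dimensional space. It is not the SparKer implementation: the Gaussian kernel form, the function name `kernel_ensemble_score`, and all parameter values are assumptions chosen for illustration, and the competition mechanism described in the abstract is not modeled here.

```python
import numpy as np

def kernel_ensemble_score(x, centers, widths, coeffs):
    """Evaluate f(x) = sum_k coeffs[k] * exp(-||x - centers[k]||^2 / (2 * widths[k]^2)).

    Hypothetical helper: the Gaussian form is an assumption, not taken from the paper.
    """
    x = np.atleast_2d(x)
    # Squared Euclidean distance from every point to every kernel center.
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    # Locality: each kernel's activation decays with distance from its center.
    activations = np.exp(-sq_dist / (2.0 * widths[None, :] ** 2))
    return activations @ coeffs

rng = np.random.default_rng(0)
dim = 1000                                   # illustrative dimensionality
centers = rng.normal(size=(4, dim))          # a handful of kernels
widths = np.full(4, np.sqrt(dim))            # assumed width scale
coeffs = np.array([0.5, 0.0, 0.0, -0.2])     # sparsity: two kernels switched off

near = centers[0]                            # a point sitting on an active kernel
far = centers[0] + 100.0                     # a point far from every kernel
scores = kernel_ensemble_score(np.stack([near, far]), centers, widths, coeffs)
```

In this toy setup the point on an active kernel receives a clearly nonzero score, while the distant point scores essentially zero, which is the sense in which a local kernel only responds to the region it covers.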
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques · Statistical Mechanics and Entropy
