Copula-based anomaly scoring and localization for large-scale, high-dimensional continuous data
Gábor Horváth, Edith Kovács, Roland Molontay, Szabolcs Nováczki

TL;DR
This paper introduces a copula-based anomaly detection method that not only identifies anomalies in high-dimensional data but also localizes their causes, handling large datasets with missing values effectively.
Contribution
It presents a novel, scalable, model-based approach using copula functions for anomaly scoring and localization in high-dimensional, fat-tailed data with missing values.
Findings
Effective localization of anomalies in complex systems
Handles large-scale, high-dimensional data efficiently
Demonstrated on telecommunication network data
Abstract
The anomaly detection method presented in this paper has a special feature: it not only indicates whether an observation is anomalous but also identifies what exactly makes an anomalous observation unusual. Hence, it helps to localize the cause of the anomaly. The proposed approach is model-based; it relies on the multivariate probability distribution associated with the observations. Since rare events lie in the tails of probability distributions, we use copula functions, which are able to model fat-tailed distributions well. The presented procedure scales well; it can cope with a large number of high-dimensional samples. Furthermore, our procedure can also cope with missing values, which occur frequently in high-dimensional data sets. In the second part of the paper, we demonstrate the usability of the method through a case study, where we…
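To make the idea of copula-based scoring and localization concrete, here is a minimal sketch. It is not the authors' procedure: the abstract does not name the copula family or the estimation steps, so this example assumes a Gaussian copula with empirical (rank-based) marginals, which is a deliberate simplification. The function names `fit`, `score`, `to_normal_score`, and `invert` are hypothetical. The anomaly score is the Mahalanobis distance of the normal scores, missing values (given as `None`) are marginalized out by dropping the corresponding rows and columns of the correlation matrix, and localization is approximated by each dimension's conditional z-score given the other observed dimensions.

```python
import math
import random
from bisect import bisect_right
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def to_normal_score(x, sorted_col):
    """Map x to a standard-normal quantile via the empirical CDF of one column.

    Values outside the training range are capped at the extreme ranks."""
    u = (bisect_right(sorted_col, x) + 0.5) / (len(sorted_col) + 1)
    return _STD_NORMAL.inv_cdf(u)

def fit(train):
    """Fit empirical marginals and a correlation matrix (assumed Gaussian copula).

    `train` is a list of complete rows of equal length."""
    n, d = len(train), len(train[0])
    sorted_cols = [sorted(row[j] for row in train) for j in range(d)]
    z = [[to_normal_score(row[j], sorted_cols[j]) for j in range(d)] for row in train]
    mean = [sum(r[j] for r in z) / n for j in range(d)]
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in z) / (n - 1)
            for b in range(d)] for a in range(d)]
    corr = [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(d)]
            for a in range(d)]
    return sorted_cols, corr

def invert(m):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    k = len(m)
    a = [list(row) + [float(i == j) for j in range(k)] for i, row in enumerate(m)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(k):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[k:] for row in a]

def score(row, sorted_cols, corr):
    """Return (anomaly score, {dimension: conditional z-score}) for one row.

    Entries equal to None are treated as missing and marginalized out."""
    obs = [j for j, x in enumerate(row) if x is not None]
    z = [to_normal_score(row[j], sorted_cols[j]) for j in obs]
    prec = invert([[corr[a][b] for b in obs] for a in obs])
    pz = [sum(prec[i][k] * z[k] for k in range(len(obs))) for i in range(len(obs))]
    total = sum(zi * pzi for zi, pzi in zip(z, pz))  # Mahalanobis distance
    # (P z)_j / sqrt(P_jj) is the z-score of dimension j given the others.
    blame = {j: pz[i] / math.sqrt(prec[i][i]) for i, j in enumerate(obs)}
    return total, blame

# Synthetic training data: dimensions 0 and 1 move together, dimension 2 is independent.
rng = random.Random(0)
train = []
for _ in range(2000):
    g = rng.gauss(0, 1)
    train.append([g, g + 0.3 * rng.gauss(0, 1), rng.gauss(0, 1)])

sorted_cols, corr = fit(train)

# This row breaks the dependence between dimensions 0 and 1.
anomaly_score, blame = score([1.5, -1.5, 0.0], sorted_cols, corr)
print(anomaly_score, blame)

# The same row with dimension 2 missing is scored on the observed dimensions only.
print(score([1.5, -1.5, None], sorted_cols, corr))
```

In this toy setup, each value of the suspicious row is unremarkable on its own; the high score comes from the violated dependence, and the conditional z-scores point at dimensions 0 and 1 rather than dimension 2. A Gaussian copula has no tail dependence, so a faithful treatment of fat-tailed data would need a different copula family than the one assumed here.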
