
TL;DR
This paper introduces a novel family of data depths called 'loss depths' that interpret data centrality as classifier risk, enabling efficient high-dimensional anomaly detection and connecting data geometry with classifier complexity.
Contribution
The paper extends the traditional notion of data depth by framing it as classifier risk, which allows standard machine learning algorithms to be used for computing depths and facilitates high-dimensional data analysis.
Findings
Loss depths can be computed efficiently using existing classifiers.
They perform well in anomaly detection tasks.
The framework connects data centrality with classifier complexity.
Abstract
Data depths are score functions that quantify, in an unsupervised fashion, how central a point is inside a distribution. They have numerous applications across various fields, such as anomaly detection and multivariate or functional data analysis. The halfspace depth was the first depth to aim at generalising the notion of quantile beyond the univariate case. Among the existing variety of depth definitions, it remains one of the most used notions of data depth. Taking a different angle from the quantile point of view, we show that the halfspace depth can also be regarded as the minimum loss of a set of classifiers for a specific labelling of the points. By changing the loss or the set of classifiers considered, this new angle naturally leads to a family of "loss depths", extending to well-studied classifiers such as SVM or logistic regression, among others. This framework directly…
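To make the halfspace depth mentioned in the abstract concrete, here is a minimal sketch of the classical (Tukey) definition, approximated by random search over directions. It is not the paper's algorithm: the function names, the number of directions, and the random-search strategy are all assumptions chosen for illustration. Each direction can be read as a linear classifier whose boundary passes through the query point, and the counted fraction as its 0-1 error under a labelling that places every sample point on the other side; that labelling is also an assumption, since the abstract does not spell out the one it uses.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere via normalised Gaussians."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def halfspace_depth(x, sample, n_directions=500, seed=0):
    """Approximate the Tukey halfspace depth of x with respect to sample.

    For each random direction u, the closed halfspace {z : <u, z - x> >= 0}
    has x on its boundary. The exact depth is the smallest fraction of sample
    points contained in any such halfspace; searching over a finite set of
    random directions (an illustrative choice) yields an upper bound on it.
    """
    rng = random.Random(seed)
    dim = len(x)
    best = 1.0
    for _ in range(n_directions):
        u = random_unit_vector(dim, rng)
        # Count sample points on the non-negative side of the hyperplane.
        inside = sum(
            1
            for z in sample
            if sum(ui * (zi - xi) for ui, zi, xi in zip(u, z, x)) >= 0.0
        )
        best = min(best, inside / len(sample))
    return best
```

Under this definition a point at the centre of a symmetric cloud gets a depth of at least one half, while a point that can be separated from the whole sample by a hyperplane gets depth 0, which is what makes low depth usable as an anomaly score. The cost of the random search grows with the number of directions, and its accuracy degrades in high dimension.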
