Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics

Myrto Limnios (CB); Nathan Noiry; Stéphan Clémençon (IDS)

arXiv:2109.09590 · math.ST · September 21, 2021 · KDD

Open Access

TL;DR

This paper introduces a data-driven scoring method for outlier detection that leverages two-sample rank statistics, supported by theoretical insights and preliminary numerical experiments.

Contribution

The paper proposes an outlier detection approach that learns a scoring function by maximizing a two-sample rank statistic, a criterion for which theoretical results are available.

Findings

01. The method shows promising results in preliminary numerical experiments.

02. The learned scoring function reflects the degree of abnormality of the observations.

03. Theoretical results on two-sample linear rank statistics support the approach.

Abstract

The ability to collect and store ever more massive databases has been accompanied by the need to process them efficiently. In many cases, most observations behave in the same way, while a presumably small proportion of them are abnormal. Detecting the latter, defined as outliers, is one of the major challenges for machine learning applications (e.g. in fraud detection or in predictive maintenance). In this paper, we propose a methodology addressing the problem of outlier detection by learning a data-driven scoring function, defined on the feature space, which reflects the degree of abnormality of the observations. This scoring function is learnt through a well-designed binary classification problem whose empirical criterion takes the form of a two-sample linear rank statistic, on which theoretical results are available. We illustrate our methodology with preliminary…
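To make the criterion mentioned in the abstract more concrete, here is a minimal sketch of how a two-sample linear rank statistic could be computed from the scores of two samples. It assumes the classical form: the sum, over the first sample, of a score-generating function `phi` applied to the normalized rank in the pooled sample. The function name `linear_rank_statistic`, the default `phi`, and the normalization by `N + 1` are illustrative assumptions, not taken from the paper; how the two samples are built for outlier detection is not specified in the abstract and is left open here.

```python
import numpy as np


def linear_rank_statistic(scores_first, scores_second, phi=lambda u: u):
    """Two-sample linear rank statistic (illustrative sketch, not the paper's code).

    scores_first, scores_second: 1-D arrays of scores s(x) for the two samples.
    phi: score-generating function applied to normalized ranks in (0, 1).
         The identity default gives a rescaled Wilcoxon rank-sum statistic.
    """
    scores_first = np.asarray(scores_first, dtype=float)
    scores_second = np.asarray(scores_second, dtype=float)

    # Pool both samples and rank every score from 1 to N.
    # Assumption: scores are continuous, so ties are not handled specially.
    pooled = np.concatenate([scores_first, scores_second])
    n_total = pooled.size
    order = np.argsort(pooled, kind="stable")
    ranks = np.empty(n_total, dtype=float)
    ranks[order] = np.arange(1, n_total + 1)

    # Keep only the ranks of the first sample and sum phi(rank / (N + 1)).
    first_ranks = ranks[: scores_first.size]
    return float(np.sum(phi(first_ranks / (n_total + 1))))


# Hypothetical scores: the first sample is ranked above the second one.
high = linear_rank_statistic([3.0, 4.0], [1.0, 2.0])
# Same values with the roles of the two samples swapped.
low = linear_rank_statistic([1.0, 2.0], [3.0, 4.0])
```

With the identity `phi`, the statistic is largest when the scoring function ranks the first sample above the second, which is the sense in which maximizing it over a class of scoring functions resembles a bipartite ranking problem. Other choices of `phi` put more weight on particular parts of the ranking, for example the highest ranks.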

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Imbalanced Data Classification Techniques