Outlying Property Detection with Numerical Attributes

Fabrizio Angiulli; Fabio Fassetti; Luigi Palopoli; Giuseppe Manco

arXiv:1306.3558·cs.LG·June 18, 2013·1 cites

Open Access

TL;DR

This paper addresses outlier detection in databases with numerical attributes by introducing a measure of outlierness and an efficient algorithm that explains outliers through rule-based data subsets.

Contribution

It presents a novel measure of outlierness for numerical data and an efficient algorithm for computing and explaining outliers using rule-based subsets.

Findings

01

The measure effectively quantifies outlierness based on likelihood comparisons.

02

The algorithm efficiently identifies significant data subsets related to outliers.

03

The approach provides interpretable explanations for outlier detection.

Abstract

The outlying property detection problem is the problem of discovering the properties that distinguish a given object, known in advance to be an outlier in a database, from the other database objects. In this paper, we analyze the problem within a context where numerical attributes are taken into account, which represents a relevant case left open in the literature. We introduce a measure to quantify the degree of outlierness of an object, which is associated with the relative likelihood of its value compared to the relative likelihood of the values of other objects in the database. As a major contribution, we present an efficient algorithm to compute the outlierness relative to significant subsets of the data. The latter subsets are characterized in a "rule-based" fashion, and hence form the basis for the underlying explanation of the outlierness.
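To make the idea in the abstract concrete, the following is a minimal sketch and not the paper's actual measure or algorithm. It assumes a Gaussian kernel density estimate as the "relative likelihood" and scores the known outlier by the fraction of other objects whose value is more likely than its own. The function names (`gaussian_kde`, `outlierness`, `select`), the `bandwidth` parameter, and the dictionary-based rule are all hypothetical choices made for illustration.

```python
import math


def gaussian_kde(values, bandwidth):
    """Return a density estimator for `values` (assumed Gaussian kernel)."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


def outlierness(values, target_index, bandwidth=1.0):
    """Fraction of the other objects whose value is more likely than the target's.

    This is an illustrative stand-in for a likelihood-based score,
    not the definition given in the paper.
    """
    density = gaussian_kde(values, bandwidth)
    target_density = density(values[target_index])
    others = [v for i, v in enumerate(values) if i != target_index]
    more_likely = sum(1 for v in others if density(v) > target_density)
    return more_likely / len(others)


def select(rows, rule):
    """Keep the rows matching a rule given as {attribute: required_value}."""
    return [r for r in rows if all(r[k] == v for k, v in rule.items())]


# Hypothetical data: the last object is the one known to be an outlier.
rows = [
    {"dept": "A", "salary": 10.0},
    {"dept": "A", "salary": 10.5},
    {"dept": "A", "salary": 9.8},
    {"dept": "A", "salary": 10.2},
    {"dept": "B", "salary": 24.0},
    {"dept": "A", "salary": 25.0},
]

# Score the outlier within the subset described by the rule dept == "A".
subset = select(rows, {"dept": "A"})
salaries = [r["salary"] for r in subset]
score = outlierness(salaries, target_index=len(salaries) - 1)
```

In this sketch the rule `dept == "A"` plays the role of a "rule-based" subset: the score is computed only against the objects the rule selects, so the rule together with the attribute `salary` reads as a candidate explanation of why the object is unusual.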

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Imbalanced Data Classification Techniques