Explainable outlier detection through decision tree conditioning

David Cortes

arXiv:2001.00636 · stat.ML · January 6, 2020 · 1 cites

TL;DR

This paper introduces OutlierTree, an explainable outlier detection method based on decision tree splits that provides human-readable justifications for outlier status by analyzing branch conditions and distribution statistics.

Contribution

The paper presents an outlier detection approach that conditions on decision tree branches, so that each flagged value comes with an understandable explanation of why it is considered an outlier.

Findings

01 Produces human-readable explanations for outliers.

02 Uses supervised decision tree splits so that the generated outlier conditions are related to the target variable rather than spurious.

03 Uses distribution statistics within tree branches to assess outliers.

Abstract

This work describes an outlier detection procedure (named "OutlierTree") loosely based on the GritBot software developed by RuleQuest Research. It works by evaluating and following supervised decision tree splits on variables; in the resulting branches, 1-d confidence intervals are constructed for the target variable, and potential outliers are flagged according to these confidence intervals. Under this logic, it is possible to produce human-readable explanations for why a given value of a variable in an observation can be considered an outlier, by considering the decision tree branch conditions along with general distribution statistics among the non-outlier observations that fell into the same branch, which can then be contrasted against the value that lies outside the confidence interval (CI). The supervised splits help to ensure that the generated conditions are not spurious, but rather related to the target…
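The general idea in the abstract can be illustrated with a small sketch. This is not the OutlierTree implementation: the function names (`best_split`, `flag_branch`), the toy data, the single-split tree, and the interval rule (median plus or minus a multiple of the scaled median absolute deviation) are all assumptions made for illustration, since the excerpt does not say how the paper builds its confidence intervals.

```python
import statistics

def sse(values):
    """Sum of squared deviations from the mean."""
    m = statistics.fmean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys, min_size=5):
    """Supervised split: threshold on x that minimizes the SSE of y in both branches."""
    pairs = sorted(zip(xs, ys))
    best_thr, best_cost = None, float("inf")
    for i in range(min_size, len(pairs) - min_size + 1):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between equal x values
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if cost < best_cost:
            best_thr, best_cost = (lo + hi) / 2, cost
    return best_thr

def flag_branch(rows, condition, z=4.0):
    """Flag y values outside a robust interval built within one branch.

    Assumption: the interval is median +/- z * 1.4826 * MAD, a stand-in
    for the paper's confidence interval, which is not specified here.
    """
    ys = [y for _, y in rows]
    center = statistics.median(ys)
    mad = statistics.median([abs(y - center) for y in ys])
    if mad == 0:
        return []  # no spread to build an interval from
    half_width = z * 1.4826 * mad
    lower, upper = center - half_width, center + half_width
    inliers = [y for y in ys if lower <= y <= upper]
    if len(inliers) < 2:
        return []
    mean_in = statistics.fmean(inliers)
    sd_in = statistics.stdev(inliers)
    reports = []
    for x, y in rows:
        if not (lower <= y <= upper):
            reports.append({
                "x": x,
                "y": y,
                "explanation": (
                    f"y = {y} is unusual given {condition}: the other "
                    f"{len(inliers)} rows in this branch have mean "
                    f"{mean_in:.1f} (sd {sd_in:.1f})"
                ),
            })
    return reports

# Toy data: y is about 101 when x <= 10 and about 171 when x > 10,
# except for one row (x = 15) whose y looks like the low group.
xs = list(range(1, 21))
ys = [(100 if x <= 10 else 170) + x % 3 for x in xs]
ys[14] = 105  # row with x = 15

threshold = best_split(xs, ys)
left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
flagged = (flag_branch(left, f"x <= {threshold}")
           + flag_branch(right, f"x > {threshold}"))

for report in flagged:
    print(report["explanation"])
```

In this toy data, the value 105 is unremarkable in the overall distribution of y and only stands out once it is compared against the other rows in the same branch, which is the motivation for conditioning on supervised splits. The explanation string pairs the branch condition with statistics of the non-flagged rows, mirroring the kind of justification the abstract describes.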

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques · Advanced Statistical Methods and Models