Interpretable Anomaly Detection with Mondrian Pólya Forests on Data Streams

Charlie Dickens; Eric Meissner; Pablo G. Moreno; Tom Diethe

arXiv:2008.01505·cs.LG·August 5, 2020

TL;DR

This paper introduces the Mondrian Pólya Forest, a probabilistic method for anomaly detection in large, high-dimensional data streams. It estimates the probability density underlying the data, which makes anomaly scores easier to interpret than the ad-hoc scores of prior tree-based methods, while achieving state-of-the-art results.

Contribution

It presents a novel probabilistic framework for anomaly detection, the Mondrian Pólya Forest, which improves interpretability, together with a memory-efficient variant for streaming data.

Findings

01. Achieves state-of-the-art anomaly detection performance.

02. Provides statistically interpretable anomaly scores.

03. Operates efficiently in streaming environments.

Abstract

Anomaly detection at scale is an extremely challenging problem of great practicality. When data is large and high-dimensional, it can be difficult to detect which observations do not fit the expected behaviour. Recent work has coalesced on variations of (random) $k$d-trees to summarise data for anomaly detection. However, these methods rely on ad-hoc score functions that are not easy to interpret, making it difficult to assess the severity of the detected anomalies or select a reasonable threshold in the absence of labelled anomalies. To solve these issues, we contextualise these methods in a probabilistic framework which we call the Mondrian Pólya Forest for estimating the underlying probability density function generating the data and enabling greater interpretability than prior work. In addition, we develop a memory-efficient variant able to operate in the modern streaming…
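The idea of scoring anomalies by an estimated density over random tree partitions can be illustrated with a small sketch. This is not the authors' algorithm. It is a simplified stand-in that assumes Mondrian-style random axis-aligned cuts (dimension chosen in proportion to side length, cut location uniform), a fixed depth instead of a Mondrian lifetime, and a Pólya-tree-style Beta prior at each cut whose pseudo-counts are proportional to child volume. The names `build`, `log_density`, `forest_log_density` and the parameter `alpha` are hypothetical. The score is a log density, so a low value has a direct probabilistic reading.

```python
import math
import random

def build(points, lo, hi, depth, rng, alpha=1.0):
    """Recursively cut the box [lo, hi]; store smoothed branch probabilities."""
    if depth == 0 or len(points) <= 1:
        return {"leaf": True}
    sides = [h - l for l, h in zip(lo, hi)]
    # Mondrian-style cut: dimension proportional to side length, location uniform.
    dim = rng.choices(range(len(lo)), weights=sides)[0]
    cut = rng.uniform(lo[dim], hi[dim])
    frac_left = (cut - lo[dim]) / sides[dim]
    if not 0.0 < frac_left < 1.0:  # degenerate cut on the boundary
        return {"leaf": True}
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    # Posterior mean of a Beta prior with pseudo-counts alpha * volume fraction
    # (assumed prior; with no data it reproduces the uniform density).
    p_left = (alpha * frac_left + len(left)) / (alpha + len(points))
    hi_left = list(hi)
    hi_left[dim] = cut
    lo_right = list(lo)
    lo_right[dim] = cut
    return {
        "leaf": False, "dim": dim, "cut": cut,
        "frac_left": frac_left, "p_left": p_left,
        "left": build(left, lo, hi_left, depth - 1, rng, alpha),
        "right": build(right, lo_right, hi, depth - 1, rng, alpha),
    }

def log_density(node, x, lo, hi):
    """Log of the piecewise-constant density at x: leaf mass / leaf volume."""
    logp = -sum(math.log(h - l) for l, h in zip(lo, hi))  # uniform on root box
    while not node["leaf"]:
        go_left = x[node["dim"]] < node["cut"]
        p = node["p_left"] if go_left else 1.0 - node["p_left"]
        f = node["frac_left"] if go_left else 1.0 - node["frac_left"]
        logp += math.log(p) - math.log(f)
        node = node["left"] if go_left else node["right"]
    return logp

def forest_log_density(trees, x, lo, hi):
    """Log of the average density over all trees (log-sum-exp for stability)."""
    logs = [log_density(t, x, lo, hi) for t in trees]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs) / len(logs))

def clip(v):
    return min(max(v, 0.001), 0.999)

# Toy data: one tight cluster in the unit square.
rng = random.Random(0)
lo, hi = [0.0, 0.0], [1.0, 1.0]
data = [(clip(rng.gauss(0.5, 0.05)), clip(rng.gauss(0.5, 0.05))) for _ in range(200)]
trees = [build(data, lo, hi, depth=8, rng=rng) for _ in range(25)]

inlier_score = forest_log_density(trees, (0.5, 0.5), lo, hi)
outlier_score = forest_log_density(trees, (0.05, 0.95), lo, hi)
print("inlier log density:", inlier_score)
print("outlier log density:", outlier_score)
```

Because the output is a density estimate rather than an ad-hoc score, a threshold can be stated in probabilistic terms, for example by flagging points whose log density falls below a chosen quantile of the scores seen so far.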

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Data Stream Mining Techniques · Network Security and Intrusion Detection

Methods: Interpretability