Effective Discovery of Meaningful Outlier Relationships

Aline Bessa; Juliana Freire; Divesh Srivastava; Tamraparni Dasu

arXiv:1910.08678·cs.DB·April 10, 2020

TL;DR

This paper introduces PODS, a scalable method for discovering meaningful, predictable relationships among outliers in temporal data sets, enhancing explainability and robustness in outlier analysis.

Contribution

The paper formalizes the concept of meaningful outlier relationships, develops a criterion for their predictability, and proposes an indexing strategy for scalable discovery.

Findings

01. Demonstrates effectiveness on real datasets

02. Shows robustness against different data variations

03. Proves scalability with large data collections

Abstract

We propose PODS (Predictable Outliers in Data-trendS), a method that, given a collection of temporal data sets, derives data-driven explanations for outliers by identifying meaningful relationships between them. First, we formalize the notion of meaningfulness, which so far has been informally framed in terms of explainability. Next, since outliers are rare and it is difficult to determine whether their relationships are meaningful, we develop a new criterion that does so by checking if these relationships could have been predicted from non-outliers, i.e., if we could see the outlier relationships coming. Finally, searching for meaningful outlier relationships between every pair of data sets in a large data collection is computationally infeasible. To address that, we propose an indexing strategy that prunes irrelevant comparisons across data sets, making the approach scalable. We…
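The predictability idea in the abstract (checking whether an outlier relationship could have been anticipated from non-outliers) can be illustrated with a small sketch. The abstract does not spell out the actual criterion, so everything below is an assumption made for illustration: the linear model, the residual-based tolerance `tol`, and the function names `fit_line` and `outliers_predictable` are hypothetical and are not taken from PODS itself.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are constant; cannot fit a line")
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def outliers_predictable(xs, ys, is_outlier, tol=3.0):
    """Illustrative check (an assumption, not the paper's criterion).

    Fits a line on the non-outlier pairs only, then returns True if every
    outlier pair lies within `tol` residual standard deviations of that line.
    """
    inliers = [(x, y) for x, y, o in zip(xs, ys, is_outlier) if not o]
    outliers = [(x, y) for x, y, o in zip(xs, ys, is_outlier) if o]
    if len(inliers) < 2 or not outliers:
        return False  # nothing to fit on, or nothing to check

    a, b = fit_line([x for x, _ in inliers], [y for _, y in inliers])
    residuals = [y - (a * x + b) for x, y in inliers]
    scale = pstdev(residuals) or 1e-9  # guard against a perfect fit

    return all(abs(y - (a * x + b)) <= tol * scale for x, y in outliers)


# Two aligned temporal series; the last time step is an outlier in both.
xs = [1, 2, 3, 4, 5, 20]
flags = [False, False, False, False, False, True]

follows_trend = outliers_predictable(xs, [2.1, 3.9, 6.2, 7.8, 10.1, 40.0], flags)
breaks_trend = outliers_predictable(xs, [2.1, 3.9, 6.2, 7.8, 10.1, 5.0], flags)
```

In the first call the outlier pair continues the trend learned from the non-outliers, so the relationship counts as predictable under this toy criterion; in the second call it does not. A real system would also need the pruning index the abstract mentions to avoid running such a check on every pair of data sets.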
