Effective Discovery of Meaningful Outlier Relationships
Aline Bessa, Juliana Freire, Divesh Srivastava, Tamraparni Dasu

TL;DR
This paper introduces PODS, a scalable method for discovering meaningful, predictable relationships among outliers in temporal data sets, enhancing explainability and robustness in outlier analysis.
Contribution
The paper formalizes the concept of meaningful outlier relationships, develops a criterion for their predictability, and proposes an indexing strategy for scalable discovery.
Findings
PODS is shown to be effective on real data sets.
Its results are robust to variations in the data.
The indexing strategy lets the approach scale to large data collections.
Abstract
We propose PODS (Predictable Outliers in Data-trendS), a method that, given a collection of temporal data sets, derives data-driven explanations for outliers by identifying meaningful relationships between them. First, we formalize the notion of meaningfulness, which so far has been informally framed in terms of explainability. Next, since outliers are rare and it is difficult to determine whether their relationships are meaningful, we develop a new criterion that does so by checking if these relationships could have been predicted from non-outliers, i.e., if we could see the outlier relationships coming. Finally, searching for meaningful outlier relationships between every pair of data sets in a large data collection is computationally infeasible. To address that, we propose an indexing strategy that prunes irrelevant comparisons across data sets, making the approach scalable. We…
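The predictability criterion in the abstract can be illustrated with a small sketch. This is not the PODS algorithm: the outlier detector (a median/MAD robust z-score), the linear model, the tolerance rule, and every function name below are assumptions chosen only to make the idea concrete. The sketch fits a relationship between two aligned series using non-outlier points only, then checks whether the points that are outliers in both series fall where that relationship would have placed them.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Flag values whose robust z-score (median/MAD) exceeds the threshold.

    Assumption: the source does not say how outliers are detected;
    this detector is a stand-in for illustration.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predictable_fraction(xs, ys, tolerance=3.0):
    """Fraction of joint outliers that a model fit on non-outliers predicts.

    Hypothetical criterion: a joint outlier counts as predictable when its
    residual is within `tolerance` residual standard deviations of the line
    fitted on non-outlier points. Returns None when there is too little data.
    """
    out_x, out_y = mad_outliers(xs), mad_outliers(ys)
    rows = list(zip(xs, ys, out_x, out_y))
    inliers = [(x, y) for x, y, ox, oy in rows if not (ox or oy)]
    joint = [(x, y) for x, y, ox, oy in rows if ox and oy]
    if len(inliers) < 3 or not joint or len({x for x, _ in inliers}) < 2:
        return None
    slope, intercept = fit_line(inliers)
    residuals = [y - (slope * x + intercept) for x, y in inliers]
    scale = (sum(r * r for r in residuals) / (len(inliers) - 2)) ** 0.5
    scale = max(scale, 1e-9)  # avoid a zero tolerance band on noiseless data
    hits = sum(
        abs(y - (slope * x + intercept)) <= tolerance * scale for x, y in joint
    )
    return hits / len(joint)


if __name__ == "__main__":
    # Toy data: ten regular points plus one timestamp where both series spike.
    xs = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 60]
    ys = [20.5, 21.5, 24.5, 25.5, 28.5, 29.5, 32.5, 33.5, 36.5, 37.5, 119]
    print(predictable_fraction(xs, ys))
```

In this toy reading, a high fraction means the outlier relationship "could have been seen coming" from regular data, while a low fraction suggests the co-occurring outliers do not follow the trend learned from non-outliers. The indexing strategy that prunes comparisons across data sets is not sketched here, since the abstract gives no detail on how it works.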
