Mining Flipping Correlations from Large Datasets with Taxonomies

Marina Barsky; Sangkyum Kim; Tim Weninger; Jiawei Han

arXiv:1201.0233 · cs.DB · March 19, 2015

Open Access

TL;DR

This paper introduces flipping correlation patterns that reveal surprising positive and negative correlations at different abstraction levels, along with an efficient algorithm to find such patterns in large datasets, uncovering non-redundant and actionable insights.

Contribution

The paper presents the Flipper algorithm for efficiently mining flipping correlation patterns, a novel pattern type contrasting correlations across abstraction levels.

Findings

1. Flipper outperforms naive pattern mining methods by several orders of magnitude.

2. The discovered patterns are non-redundant, surprising, and actionable.

3. Flipper is effective on itemsets with low-to-medium support, where existing techniques fail.

Abstract

In this paper we introduce a new type of pattern: the flipping correlation pattern. Flipping patterns are obtained by contrasting the correlations between items at different levels of abstraction. They represent surprising correlations, both positive and negative, which are specific to a given abstraction level and which "flip" from positive to negative, or vice versa, when the items are generalized to a higher level of abstraction. We design an efficient algorithm for finding flipping correlations, the Flipper algorithm, which outperforms naive pattern mining methods by several orders of magnitude. We apply Flipper to real-life datasets and show that the discovered patterns are non-redundant, surprising, and actionable. Flipper finds strong contrasting correlations in itemsets with low-to-medium support, while existing techniques cannot handle the pattern discovery in this frequency…
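
To make the notion of a "flip" concrete, here is a minimal brute-force sketch in Python. It is not the Flipper algorithm, whose pruning strategy the abstract does not describe. It only labels the correlation of each item pair at the item level and at the parent-category level, and reports the pairs whose labels disagree. The abstract does not name a correlation measure, so the Kulczynski measure used here is an assumption. The thresholds `pos` and `neg`, the two-level taxonomy, and the toy transactions are likewise invented for illustration.

```python
from itertools import combinations

# Hypothetical two-level taxonomy: item -> parent category (illustrative only).
TAXONOMY = {
    "beer": "drinks", "cola": "drinks",
    "chips": "snacks", "nuts": "snacks",
}

# Toy transactions, invented for illustration.
TRANSACTIONS = (
    [frozenset({"beer", "nuts"})] * 3
    + [frozenset({"cola", "chips"})] * 3
)

def support(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def kulczynski(a, b, transactions):
    """Kulczynski measure: the average of P(a|b) and P(b|a).
    Assumed here; the abstract does not specify the measure."""
    sa = support({a}, transactions)
    sb = support({b}, transactions)
    sab = support({a, b}, transactions)
    if sa == 0 or sb == 0:
        return 0.0
    return 0.5 * (sab / sa + sab / sb)

def generalize(transactions, taxonomy):
    """Replace every item by its parent category."""
    return [frozenset(taxonomy[i] for i in t) for t in transactions]

def label(value, pos, neg):
    """'+' for a positive correlation, '-' for a negative one, else None."""
    if value >= pos:
        return "+"
    if value <= neg:
        return "-"
    return None

def flipping_pairs(transactions, taxonomy, pos=0.6, neg=0.3):
    """Brute-force check of all item pairs from different categories.
    pos and neg are hypothetical thresholds."""
    general = generalize(transactions, taxonomy)
    items = sorted(set().union(*transactions))
    flips = []
    for a, b in combinations(items, 2):
        pa, pb = taxonomy[a], taxonomy[b]
        if pa == pb:
            continue  # siblings generalize to a single category
        low = label(kulczynski(a, b, transactions), pos, neg)
        high = label(kulczynski(pa, pb, general), pos, neg)
        if low and high and low != high:
            flips.append((a, b, low, high))
    return flips

print(flipping_pairs(TRANSACTIONS, TAXONOMY))
# → [('beer', 'chips', '-', '+'), ('cola', 'nuts', '-', '+')]
```

In this toy data, drinks and snacks always appear together, so the category-level correlation is positive, while beer and chips never share a transaction, so their item-level correlation is negative. Enumerating every pair this way scales poorly, which is the kind of naive baseline the abstract contrasts Flipper against.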

Taxonomy

Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic · Imbalanced Data Classification Techniques