A Guided FP-growth algorithm for multitude-targeted mining of big data
Lior Shabtay, Rami Yaari, Itai Dattner

TL;DR
This paper introduces GFP-growth, an efficient algorithm for targeted mining that computes the counts of a large, given list of itemsets in big data, with applications to imbalanced data in domains such as medicine and security.
Contribution
The paper presents GFP-growth, a novel exact frequency-counting algorithm for multitude-targeted mining, and the Minority-Report Algorithm that leverages GFP-growth for improved performance in imbalanced data.
Findings
GFP-growth efficiently computes exact counts for large sets of itemsets.
The Minority-Report Algorithm, which builds on GFP-growth, shows significant performance improvements when generating minority-class rules from imbalanced data.
Validated on simulations and real-world data.
Abstract
In this paper we present the GFP-growth (Guided FP-growth) algorithm, a novel method for multitude-targeted mining: finding the count of a given large list of itemsets in large data. The GFP-growth algorithm is designed to focus on the specific multitude itemsets of interest and optimizes the time and memory costs. We prove that the GFP-growth algorithm yields the exact frequency-counts for the required itemsets. We show that for a number of different problems, a solution can be devised which takes advantage of the efficient implementation of multitude-targeted mining for boosting the performance. In particular, we study in detail the problem of generating the minority-class rules from imbalanced data, a scenario that appears in many real-life domains such as medical applications, failure prediction, network and cyber security, and maintenance. We develop the Minority-Report Algorithm…
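To make the problem statement concrete, the following is a minimal sketch of what targeted mining computes: the exact count of each itemset in a given list. It is a naive baseline that scans every transaction against every target, not the GFP-growth algorithm itself, which the abstract describes as optimizing time and memory for this task. The function name `count_targets`, the item labels, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set


def count_targets(
    transactions: Iterable[Set[str]],
    targets: Iterable[Iterable[str]],
) -> Dict[FrozenSet[str], int]:
    """Return, for each target itemset, the number of transactions containing it.

    Naive baseline (hypothetical helper): one subset test per
    (transaction, target) pair. It yields exact counts but makes no attempt
    to share work between targets.
    """
    counts: Dict[FrozenSet[str], int] = {frozenset(t): 0 for t in targets}
    for transaction in transactions:
        for target in counts:
            if target <= transaction:  # every item of the target is present
                counts[target] += 1
    return counts


# Toy imbalanced data: "minor" marks the (assumed) minority-class label.
transactions = [
    {"a", "b", "minor"},
    {"a", "b"},
    {"a", "c", "minor"},
    {"b", "c"},
]
targets = [{"a"}, {"a", "b"}, {"a", "minor"}]

counts = count_targets(transactions, targets)

# Confidence of the rule {a} -> minor, using the standard definition
# count(antecedent plus class) / count(antecedent).
confidence = counts[frozenset({"a", "minor"})] / counts[frozenset({"a"})]
```

With exact counts for both an antecedent and the antecedent together with the class label, the confidence of a minority-class rule follows from a single division, which is one reason a fast targeted counter is useful in the imbalanced-data setting the abstract discusses.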
Taxonomy
Topics: Data Mining Algorithms and Applications
