A FP-Tree Based Approach for Mining All Strongly Correlated Pairs without Candidate Generation
Zengyou He, Xiaofei Xu, Shengchun Deng

TL;DR
This paper introduces Tcp, an efficient FP-tree based algorithm for mining all strongly correlated item pairs, significantly outperforming previous methods like Taper, especially on large datasets.
Contribution
The paper presents Tcp, a novel FP-tree based algorithm that efficiently mines all strongly correlated pairs without candidate generation, improving performance over existing methods.
Findings
Tcp outperforms Taper on synthetic datasets.
Tcp is also more efficient than Taper on real-world datasets.
The performance gains are significant over practical ranges of the correlation threshold.
Abstract
Given a user-specified minimum correlation threshold and a transaction database, the problem of mining all strongly correlated pairs is to find all item pairs whose Pearson's correlation coefficients are above the threshold. Despite the use of an upper-bound-based pruning technique in the Taper algorithm [1], when the numbers of items and transactions are very large, candidate pair generation and testing is still costly. To avoid the costly testing of a large number of candidate pairs, we propose an efficient algorithm, called Tcp, based on the well-known FP-tree data structure, for mining the complete set of all strongly correlated item pairs. Our experimental results on both synthetic and real-world datasets show that Tcp's performance is significantly better than that of the previously developed Taper algorithm over practical ranges of correlation threshold specifications.
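To make the problem statement concrete, the following is a minimal sketch of the task itself, not of Tcp or Taper. It assumes the standard support-based form of Pearson's correlation for binary items (the phi coefficient) and counts pair supports in one naive pass over the transactions instead of using an FP-tree. The function names `phi` and `strongly_correlated_pairs`, the parameter `theta`, and the choice to include pairs exactly at the threshold are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def phi(sup_a, sup_b, sup_ab):
    """Pearson's correlation (phi coefficient) of two binary items,
    computed from their individual supports and their joint support."""
    denom = sqrt(sup_a * sup_b * (1 - sup_a) * (1 - sup_b))
    # An item present in every transaction has zero variance; treat as uncorrelated.
    return (sup_ab - sup_a * sup_b) / denom if denom > 0 else 0.0


def strongly_correlated_pairs(transactions, theta):
    """Return {(a, b): phi} for item pairs with phi >= theta.

    Assumes theta > 0: pairs that never co-occur have phi <= 0,
    so only co-occurring pairs need to be counted.
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix the order inside each pair
        item_count.update(items)
        pair_count.update(combinations(items, 2))

    result = {}
    for (a, b), c in pair_count.items():
        r = phi(item_count[a] / n, item_count[b] / n, c / n)
        if r >= theta:  # boundary inclusion is an assumption of this sketch
            result[(a, b)] = r
    return result


if __name__ == "__main__":
    db = [["a", "b"], ["a", "b"], ["c"], ["c", "d"]]
    print(strongly_correlated_pairs(db, theta=0.8))
```

This baseline materializes a counter entry for every co-occurring pair, which is the kind of per-pair cost that grows quickly with the number of items and transactions; the abstract's point is that Tcp obtains the needed supports from an FP-tree without generating and testing candidate pairs.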
Taxonomy
Topics: Data Mining Algorithms and Applications
