Parallel Algorithm for Frequent Itemset Mining on Intel Many-core Systems
Mikhail Zymbler

TL;DR
This paper presents a parallel implementation of the Dynamic Itemset Counting (DIC) algorithm for frequent itemset mining on Intel Xeon Phi many-core systems, achieving improved performance and scalability by exploiting thread-level parallelism and vectorization.
Contribution
The paper introduces a parallel implementation of DIC for many-core systems that uses OpenMP for thread-level parallelism and bit-based data layouts that lend themselves to vectorization.
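To make the idea of a bit-based layout concrete, here is a minimal Python sketch. It is not the paper's implementation, which targets OpenMP on Xeon Phi. It only assumes that each transaction and each candidate itemset can be packed into a bitmask, so that a containment test reduces to one AND and one comparison. The names `encode`, `support`, and `db` are hypothetical, chosen for illustration.

```python
from typing import Iterable, List


def encode(items: Iterable[int]) -> int:
    """Pack a set of small non-negative item ids into one integer bitmask."""
    mask = 0
    for item in items:
        mask |= 1 << item
    return mask


def support(candidate: int, transactions: List[int]) -> int:
    """Count the transactions whose bitmask contains every bit of the candidate."""
    return sum(1 for t in transactions if (t & candidate) == candidate)


# Hypothetical toy database of four transactions over items 0..3.
db = [encode(t) for t in ([0, 1, 2], [0, 2], [1, 2], [0, 1, 2, 3])]
pair = encode([0, 2])
print(support(pair, db))
```

Because each containment test is independent of the others, the counting loop is the kind of work that can be split across threads and, with fixed-width words, across vector lanes.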
Findings
Good performance on Intel Xeon and Xeon Phi platforms
Scalability with large synthetic and real databases
Effective use of vectorization and thread parallelism
Abstract
Frequent itemset mining leads to the discovery of associations and correlations among items in large transactional databases. Apriori is a classical frequent itemset mining algorithm that makes iterative passes over the database, generating candidate itemsets from the frequent itemsets found in the previous iteration and pruning those that are clearly infrequent. The Dynamic Itemset Counting (DIC) algorithm is a variation of Apriori that tries to reduce the number of passes made over a transactional database while keeping the number of itemsets counted in a pass relatively low. In this paper, we address the problem of accelerating DIC on the Intel Xeon Phi many-core system for the case when the transactional database fits in main memory. Intel Xeon Phi provides a large number of small compute cores with vector processing units. The paper presents a parallel…
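The level-wise scheme that the abstract attributes to Apriori can be sketched as follows. This is a plain sequential illustration of candidate generation, pruning, and one database pass per level. It is not DIC and not the paper's parallel algorithm. The function name `apriori`, the absolute `min_support` threshold, and the toy data are assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(transactions: List[Set[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least min_support transactions."""
    # Pass 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: a candidate survives only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # One full pass over the database for this level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy database.
toy = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]
found = apriori(toy, min_support=2)
```

Each iteration of the `while` loop costs one full scan of the database. That per-level scan is the cost DIC aims to reduce, according to the abstract.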
Taxonomy
Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic · Algorithms and Data Compression
