Mining Top-K Co-Occurrence Items
Zhi-Hong Deng

TL;DR
This paper introduces a new task called top-k co-occurrence item mining and proposes efficient algorithms and data structures for finding the k items that co-occur most frequently with a given query itemset, outperforming baseline methods.
Contribution
The paper presents a novel mining task, the Pi-Tree data structure, and two algorithms, PT and PT-TA, that significantly improve efficiency and scalability in top-k co-occurrence item mining.
Findings
The PT algorithm outperforms the baseline algorithms in execution time.
PT-TA, which adds pruning, further improves efficiency.
The algorithms scale well on both synthetic and real data.
Abstract
Frequent itemset mining has emerged as a fundamental problem in data mining and plays an important role in many data mining tasks, such as association analysis, classification, etc. In the framework of frequent itemset mining, the results are itemsets that are frequent in the whole database. However, in some applications, such as recommendation systems and social networks, people are more interested in finding the items that occur most frequently with some user-specified itemsets (query itemsets) in a database. In this paper, we address the problem by proposing a new mining task named top-k co-occurrence item mining, where k is the desired number of items to be found. Four baseline algorithms are presented first. Then, we introduce a special data structure named Pi-Tree (Prefix itemset Tree) to maintain the information of itemsets. Based on Pi-Tree, we propose two algorithms, namely PT…
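To make the task definition concrete, here is a minimal sketch of a naive full-scan approach in Python. It is an illustration only: it is not one of the paper's four baselines, nor PT or PT-TA, and it does not use a Pi-Tree. The function name `top_k_cooccurrence`, the list-of-lists transaction format, and the alphabetical tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def top_k_cooccurrence(
    transactions: Iterable[Iterable[Hashable]],
    query: Iterable[Hashable],
    k: int,
) -> List[Tuple[Hashable, int]]:
    """Return up to k (item, count) pairs for the items that appear most
    often in transactions containing every item of the query itemset.

    Hypothetical helper for illustration; a single linear scan of the
    database, with no index or pruning.
    """
    query_set = set(query)
    counts: Counter = Counter()
    for transaction in transactions:
        items = set(transaction)
        # Only transactions that contain the whole query itemset count.
        if query_set <= items:
            # Count every other item once per matching transaction.
            counts.update(items - query_set)
    # Sort by descending count; break ties by item for determinism
    # (an assumption, the tie rule is not taken from the paper).
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


# Toy database: three transactions contain the query itemset {a, b}.
database = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "c"],
    ["b", "d"],
]
result = top_k_cooccurrence(database, ["a", "b"], k=1)
print(result)  # [('c', 2)]
```

A scan like this touches every transaction for every query, which is the kind of cost an index over itemsets is meant to avoid.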
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Rough Sets and Fuzzy Logic
