Itemset Utility Maximization with Correlation Measure

Jiahui Chen, Yixin Xu, Shicheng Wan, Wensheng Gan, and Jerry Chun-Wei Lin

arXiv:2208.12551 · cs.AI · August 29, 2022 · 1 citation

Open Access

TL;DR

This paper introduces CoIUM, an algorithm for high utility itemset mining that incorporates item correlation, improving efficiency by pruning and database projection, and outperforms existing methods in speed and memory use.

Contribution

The paper presents a novel algorithm, CoIUM, which considers both correlation and utility, employing new pruning strategies and data structures for efficient high utility itemset mining.

Findings

1. CoIUM significantly reduces runtime compared to state-of-the-art algorithms.

2. The algorithm effectively decreases memory consumption during mining.

3. Experimental results validate CoIUM's superior performance on various datasets.

Abstract

As an important data mining technology, high utility itemset mining (HUIM) is used to discover interesting but hidden information (e.g., profit and risk). HUIM has been widely applied in many scenarios, such as market analysis, medical detection, and web clickstream analysis. However, most previous HUIM approaches ignore the relationships between the items in an itemset. As a result, they discover many irrelevant combinations (e.g., {gold, apple} and {notebook, book}). To address this limitation, many algorithms have been proposed to mine correlated high utility itemsets (CoHUIs). In this paper, we propose a novel algorithm called Itemset Utility Maximization with Correlation Measure (CoIUM), which considers both strong correlation and the profitable values of the items. In addition, the algorithm adopts a database projection mechanism to reduce the cost…
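To make the two criteria in the abstract concrete, the following is a minimal Python sketch of what "correlated high utility" can mean on a toy database. It is not the CoIUM algorithm: it enumerates itemsets by brute force, with no pruning and no database projection. The abstract does not name the correlation measure, so the Kulczynski (Kulc) measure used here is an assumption, and the transactions, unit profits, thresholds, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
    {"a": 1, "b": 1, "d": 2},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for tx in db if all(item in tx for item in itemset))


def utility(itemset, db, profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    total = 0
    for tx in db:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def kulc(itemset, db):
    """Kulczynski measure (assumed here): mean of sup(X) / sup({i}) over items i in X."""
    sup_x = support(itemset, db)
    if sup_x == 0:
        return 0.0
    return sum(sup_x / support((item,), db) for item in itemset) / len(itemset)


def correlated_high_utility_itemsets(db, profit, min_util, min_cor):
    """Brute-force search: keep itemsets meeting both thresholds (no pruning)."""
    items = sorted({item for tx in db for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, db, profit)
            if u >= min_util and kulc(itemset, db) >= min_cor:
                result[itemset] = u
    return result


found = correlated_high_utility_itemsets(transactions, unit_profit, min_util=15, min_cor=0.6)
```

In this toy database, the itemset ("a", "c") has utility 19 and Kulc 2/3, so it passes both hypothetical thresholds. Brute-force enumeration grows exponentially with the number of items, which is the cost that pruning strategies and database projection are meant to reduce.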

Taxonomy

Topics: Data Mining Algorithms and Applications · Customer churn and segmentation · Imbalanced Data Classification Techniques

Methods: Pruning