F-tree: an algorithm for clustering transactional data using frequency tree
Mahmoud Mahdi, Samir Abdelrahman, Reem Bahgat, and Ismail Ismail

TL;DR
The paper introduces F-Tree, a fast and dynamic clustering algorithm for high-dimensional transactional data, utilizing a frequent pattern tree structure to generate and merge small pure clusters, effectively handling overlaps and large datasets.
Contribution
It proposes a novel F-Tree algorithm that improves clustering speed and quality for large, high-dimensional transactional data by merging small pure clusters and addressing overlaps.
Findings
F-Tree effectively finds interesting clusters.
Tree structure reduces clustering time on large datasets.
New overlap criterion improves cluster quality.
Abstract
Clustering is an important data mining technique that groups similar data records, and categorical transaction clustering has recently received more attention. In this research, we study the problem of categorical data clustering for transactional data characterized by high dimensionality and large volume. We propose a novel algorithm for clustering transactional data called F-Tree, which is based on the idea of the frequent pattern algorithm FP-tree, one of the fastest approaches to frequent itemset mining. The simple idea behind F-Tree is to generate small, highly pure clusters and then merge them, which makes it fast and dynamic when clustering large, high-dimensional transactional datasets. We also present a new solution to the overlapping problem between clusters by defining a new criterion function based on the probability of overlapping between weighted items.…
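The "generate small pure clusters from a frequency tree" idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `build_frequency_tree` and `small_clusters`, the dictionary-based node layout, and the `depth` cutoff used to decide where a small cluster ends are all assumptions made for illustration. The sketch only borrows the FP-tree convention of inserting each transaction with its items ordered by descending global frequency, so that transactions sharing frequent items share a path prefix.

```python
from collections import Counter


def build_frequency_tree(transactions):
    """Insert each transaction into a prefix tree, FP-tree style.

    Items are ordered by descending global frequency (ties broken
    alphabetically), so similar transactions share a path prefix.
    The node layout is a hypothetical choice for this sketch.
    """
    freq = Counter(item for t in transactions for item in set(t))
    root = {"children": {}, "tids": []}
    for tid, t in enumerate(transactions):
        ordered = sorted(set(t), key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            node = node["children"].setdefault(
                item, {"children": {}, "tids": []}
            )
        node["tids"].append(tid)  # transaction ends at this node
    return root


def small_clusters(root, depth=2):
    """Group transactions whose paths share the same first `depth` items.

    `depth` is an assumed cutoff, not a parameter from the paper.
    """
    clusters = []

    def collect(node):
        # Gather every transaction id stored in this subtree.
        tids = list(node["tids"])
        for child in node["children"].values():
            tids.extend(collect(child))
        return tids

    def walk(node, level):
        if level == depth or not node["children"]:
            tids = collect(node)
            if tids:
                clusters.append(sorted(tids))
            return
        if node["tids"]:
            # Transactions shorter than `depth` form their own group.
            clusters.append(sorted(node["tids"]))
        for child in node["children"].values():
            walk(child, level + 1)

    walk(root, 0)
    return clusters


transactions = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "b"],
    ["x", "y"],
    ["x", "y", "z"],
]
tree = build_frequency_tree(transactions)
clusters = small_clusters(tree, depth=2)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

The merge step and the overlap criterion are left out of the sketch, because the abstract names them without giving their definitions.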
Taxonomy
Topics: Data Mining Algorithms and Applications · Advanced Clustering Algorithms Research · Data Management and Algorithms
