An Efficient Genetic Algorithm for Discovering Diverse-Frequent Patterns

Shanjida Khatun; Hasib Ul Alam; Swakkhar Shatabda

arXiv:1507.05275·cs.AI·July 21, 2015

TL;DR

This paper introduces a fast genetic algorithm for mining diverse frequent patterns in large datasets, outperforming existing methods in efficiency and diversity.

Contribution

It presents a novel heuristic search algorithm that combines a relative encoding scheme for patterns with a twin removal technique, keeping the pattern set diverse throughout the search.

Findings

01 Outperforms state-of-the-art methods on benchmark datasets

02 Produces diverse pattern sets within short execution times

03 Effective in large-scale pattern set mining

Abstract

Exhaustive search on large datasets is infeasible for several reasons. Recently developed techniques have made pattern set mining feasible using a general solver that supports heuristic search, but they suffer from long execution times and are limited to small datasets. In this paper, we investigate an approach that uses a genetic algorithm to mine a diverse set of frequent patterns. We propose a fast heuristic search algorithm that outperforms state-of-the-art methods on a standard set of benchmarks and is capable of producing satisfactory results within a short period of time. Our proposed algorithm uses a relative encoding scheme for the patterns and an effective twin removal technique to ensure diversity throughout the search.
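To make the twin removal idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes a "twin" is an exact duplicate pattern in the population, represents each pattern as a plain frozenset of items (the paper's relative encoding is not reproduced here), and uses hypothetical helper names `support`, `remove_twins`, and `refill`.

```python
import random

def support(pattern, transactions):
    """Count the transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)

def remove_twins(population):
    """Keep the first occurrence of each pattern and drop exact duplicates.

    Assumption: a "twin" is an identical pattern; the paper may define it
    differently.
    """
    seen = set()
    unique = []
    for pattern in population:
        if pattern not in seen:
            seen.add(pattern)
            unique.append(pattern)
    return unique

def refill(population, size, items, rng):
    """Top up the population with random patterns that are not yet present.

    Assumes size does not exceed the number of distinct non-empty patterns
    over `items`; otherwise the loop would never terminate.
    """
    present = set(population)
    result = list(population)
    while len(result) < size:
        k = rng.randint(1, len(items))
        candidate = frozenset(rng.sample(items, k))
        if candidate not in present:
            present.add(candidate)
            result.append(candidate)
    return result

# Usage: deduplicate a small population, then restore its size.
transactions = [frozenset({1, 2, 3}), frozenset({1, 2}), frozenset({2, 3})]
population = [frozenset({1, 2}), frozenset({2}), frozenset({1, 2})]
population = refill(remove_twins(population), 3, [1, 2, 3], random.Random(0))
scores = [support(p, transactions) for p in population]
```

In a genetic algorithm loop, a step like this would typically run after crossover and mutation in each generation, so that duplicate individuals do not crowd out distinct candidates.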

Taxonomy

Topics: Data Mining Algorithms and Applications · Constraint Satisfaction and Optimization · Data Management and Algorithms