A Tight Upper Bound on the Number of Candidate Patterns

Floris Geerts; Bart Goethals; Jan Van den Bussche

arXiv:cs/0112007·cs.DB·May 23, 2007·34 cites

Open Access

TL;DR

This paper derives a tight upper bound on the number of candidate patterns that the standard levelwise frequent pattern mining algorithm can generate on the next level, given the current level and its set of frequent patterns. The bound can be used to reduce the number of database scans.

Contribution

It introduces a tight upper bound on the number of next-level candidate patterns, derived from a classical combinatorial result of Kruskal and Katona from the 1960s.

Findings

01 Provides a mathematically proven upper bound for candidate patterns

02 Helps optimize the pattern mining process by reducing unnecessary database scans

03 Connects combinatorial theory with practical data mining algorithms

Abstract

In the context of mining for frequent patterns using the standard levelwise algorithm, the following question arises: given the current level and the current set of frequent patterns, what is the maximal number of candidate patterns that can be generated on the next level? We answer this question by providing a tight upper bound, derived from a combinatorial result from the sixties by Kruskal and Katona. Our result is useful to reduce the number of database scans.
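
The abstract names the Kruskal-Katona result but does not spell out the formula. As a rough sketch, assuming patterns are itemsets and using the standard form of that theorem: write the number m of frequent k-itemsets in its k-cascade form m = C(a_k, k) + C(a_{k-1}, k-1) + ... + C(a_r, r) with a_k > a_{k-1} > ... > a_r >= r >= 1; the number of candidate (k+1)-itemsets is then at most C(a_k, k+1) + C(a_{k-1}, k) + ... + C(a_r, r+1). The function names `cascade` and `max_candidates` below are hypothetical and are not taken from the paper.

```python
from math import comb  # requires Python 3.8+


def cascade(m, k):
    """Return the k-cascade representation of m as a list of (a_i, i) pairs,
    so that m == sum(comb(a_i, i)) with a_k > a_{k-1} > ... > a_r >= r >= 1.
    An empty list is returned for m == 0."""
    terms = []
    i = k
    while m > 0 and i >= 1:
        # Greedily pick the largest a with comb(a, i) <= m.
        a = i
        while comb(a + 1, i) <= m:
            a += 1
        terms.append((a, i))
        m -= comb(a, i)
        i -= 1
    return terms


def max_candidates(num_frequent, k):
    """Upper bound on the number of candidate (k+1)-itemsets, given only the
    number of frequent k-itemsets (illustrative helper, not the paper's code)."""
    return sum(comb(a, i + 1) for a, i in cascade(num_frequent, k))


# Six frequent pairs can be all pairs over four items, which yield four triples.
print(max_candidates(6, 2))  # 4
# Five frequent pairs: 5 = C(3,2) + C(2,1), so the bound is C(3,3) + C(2,2).
print(max_candidates(5, 2))  # 2
```

If the bound for the next level comes out as zero, no candidates can exist there, so a levelwise miner could stop without preparing another scan; this is one way such a bound could help limit database scans.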

Taxonomy

Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic · Advanced Database Systems and Queries