The Complexity of Maximal/Closed Frequent Tree Mining for Bounded Height Trees
Kenta Komoto, Kazuhiro Kurita, Hirotaka Ono

TL;DR
This paper investigates the computational complexity of enumerating maximal and closed frequent trees with bounded height, revealing that small height bounds do not necessarily simplify the problem.
Contribution
The paper gives a height-based classification of the complexity of closed and maximal frequent tree mining, including polynomial-delay algorithms for specific cases.
Findings
Polynomial-delay algorithm for unordered trees of height at most 2
No output-polynomial algorithm for ordered trees of height at most 2 unless P=NP
Enumeration remains hard even for very small height bounds
Abstract
Frequent tree mining asks us to enumerate tree patterns that occur frequently in a database of rooted trees. This problem is motivated by tree-structured data in bioinformatics, such as glycans and pseudoknot-free RNA secondary structures. A direct enumeration of all frequent trees is often highly redundant, because every subtree of a frequent tree is again frequent. Closed and maximal frequent trees are standard ways to reduce this redundancy, but their enumeration can still be computationally hard. In this paper, we study the effect of bounding the height of the input trees. This is a natural restriction for rooted trees, since the height is the depth of the hierarchy. We ask whether closed/maximal frequent tree mining remains hard when every input tree has a small height. Our results show that the answer depends sharply on the model. For rooted unordered trees of height at most 2,…
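To make the notions of frequent, maximal, and closed patterns concrete, the following is a minimal brute-force sketch on a toy database. It is not the paper's algorithm. Several choices here are assumptions made only for illustration: trees are encoded as nested tuples, inclusion is root-anchored and unordered (the paper's exact subtree relation may differ), height is counted in edges, and maximality and closedness are checked only within a hand-picked candidate list rather than over all possible patterns. The helper names `node`, `height`, `embeds`, `support`, and `classify` are hypothetical.

```python
from itertools import permutations


def node(label, *children):
    """Build a rooted labeled tree as (label, tuple_of_children)."""
    return (label, tuple(children))


def height(tree):
    """Height in edges (assumed convention): a single node has height 0."""
    _, children = tree
    return 0 if not children else 1 + max(height(c) for c in children)


def embeds(pattern, tree):
    """Root-anchored unordered inclusion (an assumed notion of subtree).

    The roots must share a label, and the pattern's children must map
    injectively to children of the tree that recursively include them.
    Brute force over injective assignments; exponential in general.
    """
    p_label, p_children = pattern
    t_label, t_children = tree
    if p_label != t_label or len(p_children) > len(t_children):
        return False
    return any(
        all(embeds(pc, tc) for pc, tc in zip(p_children, chosen))
        for chosen in permutations(t_children, len(p_children))
    )


def support(pattern, database):
    """Number of database trees that include the pattern."""
    return sum(embeds(pattern, t) for t in database)


def classify(candidates, database, min_support):
    """Return (frequent, maximal, closed), relative to `candidates` only."""
    sup = {p: support(p, database) for p in candidates}
    frequent = [p for p in candidates if sup[p] >= min_support]

    def strictly_below(p, q):
        # p is a proper subpattern of q (also handles reordered children).
        return embeds(p, q) and not embeds(q, p)

    maximal = [p for p in frequent
               if not any(strictly_below(p, q) for q in frequent)]
    closed = [p for p in frequent
              if not any(strictly_below(p, q) and sup[q] == sup[p]
                         for q in frequent)]
    return frequent, maximal, closed


# Toy database of three trees of height 1: a(b, c), a(b, c), a(b).
database = [
    node("a", node("b"), node("c")),
    node("a", node("b"), node("c")),
    node("a", node("b")),
]
candidates = [
    node("a"),
    node("a", node("b")),
    node("a", node("c")),
    node("a", node("b"), node("c")),
]
frequent, maximal, closed = classify(candidates, database, min_support=2)
```

In this toy run all four candidates are frequent, which shows the redundancy the abstract describes. Only a(b, c) is maximal. The closed patterns are a(b) and a(b, c): a has the same support (3) as its extension a(b), and a(c) has the same support (2) as a(b, c), so neither is closed.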
Taxonomy
Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic · Graph Theory and Algorithms
