Reductions for Frequency-Based Data Mining Problems
Stefan Neumann, Pauli Miettinen

TL;DR
This paper introduces a new type of reduction for comparing the computational complexity of maximal frequent pattern mining problems across domains such as graphs and sequences, and shows that the complexities of many of these problems collapse once constraints in the pattern space are allowed.
Contribution
It extends the results of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems and shows that allowing constraints in the pattern space makes the complexities of many maximal frequent pattern mining problems collapse.
Findings
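The complexities of many maximal frequent pattern mining problems collapse when constraints in the pattern space are allowed; these include maximal frequent subgraphs in labelled graphs, maximal frequent itemsets, and maximal frequent subsequences with no repetitions
Extends the results of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems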
Potential for more efficient algorithms due to complexity reductions
Abstract
Studying the computational complexity of problems is one of the (if not the) fundamental questions in computer science. Yet, surprisingly little is known about the computational complexity of many central problems in data mining. In this paper we study frequency-based problems and propose a new type of reduction that allows us to compare the complexities of the maximal frequent pattern mining problems in different domains (e.g., graphs or sequences). Our results extend those of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems. Our results show that, by allowing constraints in the pattern space, the complexities of many maximal frequent pattern mining problems collapse. These problems include maximal frequent subgraphs in labelled graphs, maximal frequent itemsets, and maximal frequent subsequences with no repetitions. In addition to theoretical…
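For readers unfamiliar with the terminology, the following is a minimal sketch of one problem named in the abstract, maximal frequent itemsets. It is not the paper's reduction or any algorithm from the paper; it is a brute-force illustration under the usual assumptions that an itemset is frequent when at least `min_support` transactions contain it and maximal when no proper superset is also frequent. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return every non-empty itemset contained in at least min_support transactions.

    Brute force over all subsets of the item universe (exponential time);
    for illustration only.
    """
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                frequent.append(cand)
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy database of four transactions.
transactions = [frozenset("abc"), frozenset("abd"), frozenset("ab"), frozenset("cd")]
maximal = maximal_frequent_itemsets(transactions, min_support=2)
print(sorted(sorted(s) for s in maximal))  # [['a', 'b'], ['c'], ['d']]
```

In this toy database the frequent itemsets are {a}, {b}, {c}, {d}, and {a, b}; the singletons {a} and {b} are not maximal because {a, b} is also frequent.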
