Reductions for Frequency-Based Data Mining Problems

Stefan Neumann; Pauli Miettinen

arXiv:1709.00900·cs.CC·September 5, 2017

TL;DR

This paper introduces a new type of reduction for comparing the computational complexity of maximal frequent pattern mining problems across domains such as graphs and sequences, and shows that the complexities of many of these problems collapse once constraints on the pattern space are allowed.

Contribution

It extends the results of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems and shows that allowing constraints in the pattern space makes the complexities of many maximal frequent pattern mining problems collapse.

Findings

01 The complexities of many maximal frequent pattern mining problems collapse once constraints on the pattern space are allowed

02 The results of Kimelfeld and Kolaitis [ACM TODS, 2014] are extended to a broader range of data mining problems

03 The reductions may lead to more efficient algorithms for the studied problems

Abstract

Studying the computational complexity of problems is one of the fundamental questions, if not the fundamental question, in computer science. Yet, surprisingly little is known about the computational complexity of many central problems in data mining. In this paper we study frequency-based problems and propose a new type of reduction that allows us to compare the complexities of the maximal frequent pattern mining problems in different domains (e.g., graphs or sequences). Our results extend those of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems. Our results show that, by allowing constraints in the pattern space, the complexities of many maximal frequent pattern mining problems collapse. These problems include maximal frequent subgraphs in labelled graphs, maximal frequent itemsets, and maximal frequent subsequences with no repetitions. In addition to theoretical…
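For readers unfamiliar with one of the problems named in the abstract, the following is a minimal sketch of what "maximal frequent itemsets" means. It is not the paper's reduction or any algorithm from the paper; it is a brute-force enumeration, exponential in the number of items, written only to make the definition concrete. The function name `maximal_frequent_itemsets`, the parameter `min_support`, and the toy transactions are all assumptions introduced for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Return all itemsets that are frequent and have no frequent proper superset.

    Illustrative brute force only: it checks every non-empty subset of items.
    """
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support = number of transactions that contain the whole itemset.
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                frequent.append(itemset)
    # An itemset is maximal if no other frequent itemset strictly contains it.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy database of four transactions.
transactions = [
    frozenset("abc"),
    frozenset("abd"),
    frozenset("ab"),
    frozenset("cd"),
]
result = maximal_frequent_itemsets(transactions, min_support=2)
print(sorted(sorted(s) for s in result))
```

With a support threshold of 2, the frequent itemsets in this toy database are {a}, {b}, {c}, {d}, and {a, b}; of these, {a} and {b} are not maximal because {a, b} is also frequent.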
