TL;DR
This paper introduces DESQ, a unified framework for frequent sequence mining with various subsequence constraints, enabling more flexible and efficient pattern discovery in real-world datasets.
Contribution
The paper proposes a set of pattern expressions and algorithms that unify and efficiently handle multiple subsequence constraints in a single framework.
Findings
The proposed mining algorithms are competitive with state-of-the-art methods.
The unified approach improves the usability of pattern mining systems for practitioners.
The approach is effective on real-world datasets.
Abstract
Frequent sequence mining methods often make use of constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and hierarchy constraints. In this paper, we show that many subsequence constraints (including and beyond those considered in the literature) can be unified in a single framework. A unified treatment allows researchers to study jointly many types of subsequence constraints (instead of each one individually) and helps to improve usability of pattern mining systems for practitioners. In more detail, we propose a set of simple and intuitive "pattern expressions" to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under such general constraints. Our algorithms translate pattern expressions to compressed…
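To make the length and gap constraints named in the abstract concrete, here is a minimal brute-force sketch in Python. It is not DESQ's algorithm and does not use the paper's pattern expressions. The function names (`constrained_subsequences`, `mine_frequent`) and the parameters (`max_gap`, `max_len`, `min_support`) are hypothetical, chosen only for illustration. Support is assumed to mean the number of input sequences that contain a subsequence at least once.

```python
from collections import Counter


def constrained_subsequences(seq, max_gap, max_len):
    """Return the distinct subsequences of `seq` (as tuples) that have at most
    `max_len` items and skip at most `max_gap` items between consecutive picks.

    Illustrative brute force only; hypothetical helper, not the paper's method.
    """
    found = set()

    def extend(prefix, last_idx):
        found.add(prefix)
        if len(prefix) == max_len:  # length constraint reached
            return
        # Gap constraint: the next item may be at most max_gap + 1 positions ahead.
        stop = min(len(seq), last_idx + max_gap + 2)
        for j in range(last_idx + 1, stop):
            extend(prefix + (seq[j],), j)

    for i in range(len(seq)):
        extend((seq[i],), i)
    return found


def mine_frequent(database, min_support, max_gap, max_len):
    """Count each constrained subsequence once per input sequence and keep
    those that occur in at least `min_support` sequences."""
    support = Counter()
    for seq in database:
        support.update(constrained_subsequences(seq, max_gap, max_len))
    return {s: c for s, c in support.items() if c >= min_support}


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["a", "x", "b", "c"]]
    for pattern, count in sorted(mine_frequent(db, 2, 1, 2).items()):
        print(pattern, count)
```

Tightening `max_gap` from 1 to 0 in this toy database removes the patterns `("a", "b")` and `("a", "c")`, since neither is then contiguous in two sequences. This shows how a single constraint changes which subsequences count as frequent. Enumerating every candidate like this grows exponentially with `max_len`, so it is only suitable for small examples.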
