Efficiently Summarising Event Sequences with Rich Interleaving Patterns
Apratim Bhattacharyya, Jilles Vreeken

TL;DR
This paper introduces \ourmethod, a fast and effective MDL-based approach for summarising sequential data with complex interleaving patterns, improving model quality and interpretability.
Contribution
We propose \ourmethod, a novel greedy MDL-based algorithm that efficiently captures rich, interleaving patterns in sequential data for better summarisation.
Findings
\ourmethod is orders of magnitude faster than the state of the art.
It produces more accurate and meaningful models.
It discovers patterns that reveal multiple choices of values in sequences.
Abstract
Discovering the key structure of a database is one of the main goals of data mining. In pattern set mining we do so by discovering a small set of patterns that together describe the data well. The richer the class of patterns we consider, and the more powerful our description language, the better we will be able to summarise the data. In this paper we propose \ourmethod, a novel greedy MDL-based method for summarising sequential data using rich patterns that are allowed to interleave. Experiments show that \ourmethod is orders of magnitude faster than the state of the art, yields better models, and discovers meaningful semantics in the form of patterns that identify multiple choices of values.
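To make the idea of greedy MDL-based pattern set mining concrete, here is a minimal sketch. It is not the algorithm from the paper: it handles only contiguous patterns (no gaps, interleaving, or choices of values), and the encoding is a simplified two-part code chosen for illustration. The names `cover`, `total_bits`, and `greedy_mdl` are hypothetical.

```python
import math
from collections import Counter


def cover(seq, patterns):
    """Cover seq left to right, preferring longer patterns.

    Events not matched by any pattern are encoded as singletons.
    Assumption: patterns match contiguously only.
    """
    ordered = sorted(patterns, key=len, reverse=True)
    codes, i = [], 0
    while i < len(seq):
        for p in ordered:
            if tuple(seq[i:i + len(p)]) == p:
                codes.append(p)
                i += len(p)
                break
        else:
            codes.append((seq[i],))
            i += 1
    return codes


def total_bits(seq, patterns):
    """Two-part description length L(M) + L(D | M), in bits (toy encoding)."""
    codes = cover(seq, patterns)
    usage = Counter(codes)
    n = len(codes)
    # Data cost: Shannon-optimal code lengths derived from pattern usage.
    data_bits = -sum(c * math.log2(c / n) for c in usage.values())
    # Model cost: spell out each pattern event by event with a uniform code
    # over the alphabet plus an end marker. Singletons are treated as free.
    alphabet = len(set(seq))
    model_bits = sum((len(p) + 1) * math.log2(alphabet + 1) for p in patterns)
    return data_bits + model_bits


def greedy_mdl(seq, candidates):
    """Repeatedly add the candidate that reduces the total cost the most."""
    model = []
    best = total_bits(seq, model)
    improved = True
    while improved:
        improved = False
        best_cand = None
        for cand in candidates:
            if cand in model:
                continue
            bits = total_bits(seq, model + [cand])
            if bits < best:
                best, best_cand = bits, cand
                improved = True
        if improved:
            model.append(best_cand)
    return model, best
```

For example, calling `greedy_mdl(list("abc" * 8), [("a", "b", "c"), ("a", "b")])` selects only `("a", "b", "c")`: once that pattern is in the model, `("a", "b")` is never used in the cover, so adding it would only increase the model cost. A pattern is kept only if it pays for itself in compression, which is the core of the MDL principle.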
Taxonomy
Topics: Data Mining Algorithms and Applications · Algorithms and Data Compression · Advanced Database Systems and Queries
