Ranking Episodes using a Partition Model

Nikolaj Tatti

arXiv:1902.01002·cs.DS·February 5, 2019

TL;DR

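This paper introduces a partition model for ranking episodes, which are patterns mined from sequential data. An episode is split into two subpatterns, and its observed support is compared with the support expected if the two subpatterns occurred independently. The resulting ranking is meant to filter out redundant patterns such as freeriders, where a true pattern is padded with additional noise events.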

Contribution

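It develops a partition model for episodes, which are sets of events with possible restrictions on the order of events. Unlike in itemset mining, computing the expected support of an episode under this model is not straightforward, and the paper develops dedicated methods for it.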

Findings

01 Effective reduction of redundant episodes in experiments

02 Improved pattern ranking accuracy

03 Efficient filtering of freerider patterns

Abstract

One of the biggest setbacks in traditional frequent pattern mining is that overwhelmingly many of the discovered patterns are redundant. A prototypical example of such redundancy is a freerider pattern where the pattern contains a true pattern and some additional noise events. A technique for filtering freerider patterns that has proved to be efficient in ranking itemsets is to use a partition model where a pattern is divided into two subpatterns and the observed support is compared to the expected support under the assumption that these two subpatterns occur independently. In this paper we develop a partition model for episodes, patterns discovered from sequential data. An episode is essentially a set of events, with possible restrictions on the order of events. Unlike with itemset mining, computing the expected support of an episode requires surprisingly sophisticated methods. In…
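
To make the partition idea concrete, the following is a minimal sketch of the itemset version that the abstract refers to, not the episode method developed in the paper. It assumes plain transaction data, and the names `support`, `partition_score`, and `TRANSACTIONS` are hypothetical. The score is the smallest ratio of observed support to the support expected under independence, taken over all splits of the itemset into two non-empty parts. A score close to 1 suggests that some split explains the pattern, as would be the case for a freerider.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def partition_score(itemset, transactions):
    """Smallest observed/expected support ratio over all two-way splits.

    Illustrative assumption: the expected support of a split (A, B) is
    support(A) * support(B) / n, i.e. A and B occur independently.
    """
    items = sorted(set(itemset))
    if len(items) < 2:
        raise ValueError("need at least two items to form a partition")
    n = len(transactions)
    observed = support(items, transactions)
    first, rest = items[0], items[1:]
    ratios = []
    # Keep `first` on the left side so each unordered split is visited once.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = {first, *extra}
            right = set(items) - left
            expected = support(left, transactions) * support(right, transactions) / n
            if expected > 0:
                ratios.append(observed / expected)
    # If every split has zero expected support, the itemset never occurs.
    return min(ratios) if ratios else 0.0


# Toy data (made up for illustration): a and b always co-occur, c is unrelated.
TRANSACTIONS = [
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "b"},
    {"c"}, {"c"}, set(), set(),
]

print(partition_score({"a", "b"}, TRANSACTIONS))       # 2.0
print(partition_score({"a", "b", "c"}, TRANSACTIONS))  # 1.0
```

In this toy data, {a, b} occurs twice as often as independence predicts, while {a, b, c} is fully explained by the split {a, b} | {c} and scores 1.0, which is the freerider behaviour the abstract describes. For episodes, the expected support cannot be obtained by such a simple product, which is the difficulty the paper addresses.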
