# Mining Closed Episodes with Simultaneous Events

**Authors:** Nikolaj Tatti, Boris Cule

arXiv: 1904.08741 · 2019-04-19

## TL;DR

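This paper extends general episodes so that they can represent events occurring simultaneously, and presents a miner for frequent and closed general episodes in sequential data. It addresses two obstacles: closure cannot be defined from frequency, and the traditional directed-acyclic-graph representation of episodes leads to redundant output.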

## Contribution

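The paper extends the definition of an episode to cover simultaneous events and proposes an efficient miner for frequent and closed general episodes. A conservative closure reduces the search space, the closed episodes are discovered in a postprocessing step, and a subset relationship between episodes is defined so that redundant episodes can be removed.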

## Key findings

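- Experiments on synthetic datasets show that the algorithm is efficient.
- Restricting the output to closed episodes reduces redundancy, and the experiments demonstrate the need for this restriction.
- The same conclusions are supported by experiments on real-world datasets.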
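To make the redundancy point concrete, here is a minimal sketch of a closedness filter applied as a postprocessing step. It assumes the usual notion that a pattern is closed when no proper superpattern has the same frequency. Plain item sets stand in for episodes, because the paper's subset relationship between episodes is not reproduced in this summary; the function name `closed_patterns` and the toy frequencies are hypothetical.

```python
def closed_patterns(frequencies):
    """Keep patterns that have no proper superset with the same frequency.

    frequencies: dict mapping frozenset (stand-in pattern) -> frequency.
    NOTE: item sets and set inclusion are a simplification; the paper
    defines its own subset relationship between general episodes.
    """
    closed = {}
    for pattern, freq in frequencies.items():
        # `pattern < other` is proper-subset comparison on frozensets.
        has_equal_superset = any(
            pattern < other and freq == other_freq
            for other, other_freq in frequencies.items()
        )
        if not has_equal_superset:
            closed[pattern] = freq
    return closed


# Hypothetical toy input: {a} is redundant because {a, b} is equally frequent.
toy = {
    frozenset("a"): 5,
    frozenset("ab"): 5,
    frozenset("b"): 7,
    frozenset("abc"): 2,
}
result = closed_patterns(toy)
```

The quadratic scan is chosen for readability only; it says nothing about how the authors' miner prunes its search space.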

## Abstract

Sequential pattern discovery is a well-studied field in data mining. Episodes are sequential patterns describing events that often occur in the vicinity of each other. Episodes can impose restrictions on the order of the events, which makes them a versatile technique for describing complex patterns in the sequence. Most of the research on episodes deals with special cases such as serial, parallel, and injective episodes, while discovering general episodes is understudied.

In this paper we extend the definition of an episode in order to be able to represent cases where events often occur simultaneously. We present an efficient and novel miner for discovering frequent and closed general episodes. Such a task presents unique challenges. Firstly, we cannot define closure based on frequency. We solve this by computing a more conservative closure that we use to reduce the search space and discover the closed episodes as a postprocessing step. Secondly, episodes are traditionally presented as directed acyclic graphs. We argue that this representation has drawbacks leading to redundancy in the output. We solve these drawbacks by defining a subset relationship in such a way that allows us to remove the redundant episodes. We demonstrate the efficiency of our algorithm and the need for using closed episodes empirically on synthetic and real-world datasets.
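As an illustration of what a pattern with simultaneous events might look like, the sketch below checks whether a simplified episode occurs within a time window of a sequence. It is not the paper's definition: general episodes are directed acyclic graphs, whereas here an episode is assumed to be an ordered list of event groups, where the events of one group must share a timestamp and consecutive groups must occur at strictly increasing timestamps. The name `occurs_in_window` and the window convention (timestamps `t` with `start <= t < start + window`) are assumptions made for this example.

```python
from collections import defaultdict


def occurs_in_window(sequence, groups, window):
    """Return True if some window of the given length covers the episode.

    sequence: iterable of (timestamp, event) pairs.
    groups:   list of sets; each set must appear at a single timestamp,
              and the sets must appear at strictly increasing timestamps.
    window:   window length; a window starting at s covers s <= t < s + window.
    """
    # Collect the set of events observed at each timestamp.
    events_at = defaultdict(set)
    for timestamp, event in sequence:
        events_at[timestamp].add(event)
    times = sorted(events_at)

    for start in times:
        matched = 0  # number of groups matched so far in this window
        for t in times:
            if t < start:
                continue
            if t >= start + window:
                break
            # Greedy matching: take the earliest timestamp holding the next group.
            if matched < len(groups) and groups[matched] <= events_at[t]:
                matched += 1
        if matched == len(groups):
            return True
    return False


# Hypothetical sequence: 'a' and 'b' are simultaneous at time 1.
sequence = [(1, "a"), (1, "b"), (2, "c"), (5, "a"), (6, "b"), (7, "c")]
simultaneous_then_c = [{"a", "b"}, {"c"}]
print(occurs_in_window(sequence, simultaneous_then_c, window=2))
```

Matching each group at the earliest possible timestamp is sufficient for this ordered-groups simplification; it would not be sufficient for arbitrary graph-shaped episodes.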

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.08741/full.md

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/1904.08741/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1904.08741/full.md

---
Source: https://tomesphere.com/paper/1904.08741