Determining Window Sizes using Species Estimation for Accurate Process Mining over Streams
Christian Imenkamp, Martin Kabierski, Hendrik Reiter, Matthias Weidlich, Wilhelm Hasselbring, Agnes Koschmider

TL;DR
This paper introduces a dynamic window sizing method for streaming process mining, leveraging species estimation techniques to improve analysis accuracy and robustness amid concept drifts in real-time event streams.
Contribution
It proposes an approach that dynamically adjusts window sizes using estimators from species estimation, addressing the limitations of static windows in streaming process mining.
Findings
Improved accuracy over static window approaches
Enhanced robustness to concept drifts
Effective on real-world datasets
Abstract
Streaming process mining deals with the real-time analysis of event streams. A common approach is to adopt windowing mechanisms that select event data from a stream for subsequent analysis. However, the size of these windows is a crucial parameter, as it influences the representativeness of the window content and, by extension, of the analysis results. Because process dynamics are subject to change and potential concept drift, a static, fixed window size leads to inaccurate representations that introduce bias in the analysis. In this work, we present a novel approach for streaming process mining that addresses these limitations by adjusting window sizes. Specifically, we dynamically determine suitable window sizes based on estimators for the representativeness of samples as developed for species estimation in biodiversity research. Evaluation results on real-world data…
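To make the idea of sizing a window by sample representativeness concrete, the following is a minimal sketch and not the authors' implementation. It assumes the bias-corrected Chao1 richness estimator as the species estimator (the abstract does not name a specific one), treats each distinct event label as a "species", and uses a hypothetical completeness threshold. The function names, the threshold, and the suffix-growing strategy are all illustrative assumptions.

```python
from collections import Counter


def chao1(species_counts):
    """Bias-corrected Chao1 estimate of total species richness.

    Assumption: Chao1 stands in for whichever estimator the paper uses.
    """
    s_obs = len(species_counts)
    # f1: species seen exactly once; f2: species seen exactly twice
    f1 = sum(1 for c in species_counts.values() if c == 1)
    f2 = sum(1 for c in species_counts.values() if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def completeness(items):
    """Ratio of observed to estimated richness, between 0 and 1."""
    counts = Counter(items)
    if not counts:
        return 0.0
    return len(counts) / chao1(counts)


def smallest_representative_window(stream, threshold=0.9, min_size=5):
    """Hypothetical helper: grow a window over the most recent events
    until its estimated completeness reaches the threshold."""
    for size in range(min_size, len(stream) + 1):
        window = stream[-size:]
        if completeness(window) >= threshold:
            return size
    # No suffix was representative enough; fall back to all buffered events.
    return len(stream)
```

In this sketch a window full of singletons (many labels seen only once) yields a low completeness score, so the window keeps growing; once most labels have been seen repeatedly, the estimate converges to the observed richness and the window stops growing. After a concept drift, new labels appear as singletons again, which would push the window size back up.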
