Using Sampling Strategy to Assist Consensus Sequence Analysis

Zhichao Xu; Shuhong Chen

arXiv:2008.08300·cs.AI·July 5, 2021

TL;DR

This paper introduces a new sampling strategy to determine the optimal number of traces needed for accurate consensus sequence analysis in process mining, aiding experts in model adjustment.

Contribution

A novel sampling method for estimating the trace count required for representative consensus sequences in process mining.

Findings

01. Effective estimation of process similarity using the proposed sampling strategy

02. Application to real-world datasets demonstrates practical utility

03. Sample curve fitting aids understanding of the methodology

Abstract

Consensus sequences of event logs are often used in process mining to quickly grasp the core sequence of events to be performed in a process, or to represent the backbone of the process for other analyses. However, it is still unclear how many traces are enough to properly represent the underlying process. In this paper, we propose a novel sampling strategy to determine the number of traces necessary to produce a representative consensus sequence. We show how to estimate the difference between the predefined Expert Model and the real processes carried out. This difference level can serve as a reference for domain experts when adjusting the Expert Model. In addition, we apply this strategy to several real-world workflow activity datasets as a case study. We also present a sample curve-fitting task to help readers better understand the proposed methodology.
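The sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify how the consensus sequence or the difference level is computed, so the sketch assumes a position-wise majority vote as a stand-in consensus and Levenshtein edit distance as the difference measure. All function names and parameters (`naive_consensus`, `mean_distance_by_sample_size`, `repeats`, `seed`) are hypothetical.

```python
import random
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two sequences of activity labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def naive_consensus(traces):
    """Assumed stand-in consensus: position-wise majority vote.

    The consensus length is the most common trace length in the sample.
    A real consensus sequence would typically come from an alignment.
    """
    length = Counter(len(t) for t in traces).most_common(1)[0][0]
    consensus = []
    for pos in range(length):
        votes = Counter(t[pos] for t in traces if len(t) > pos)
        consensus.append(votes.most_common(1)[0][0])
    return consensus


def mean_distance_by_sample_size(traces, expert_model, sizes, repeats=30, seed=0):
    """For each sample size n, draw `repeats` random samples of n traces and
    average the edit distance between each sample's consensus and the
    expert model. Every n must not exceed len(traces)."""
    rng = random.Random(seed)
    result = {}
    for n in sizes:
        dists = []
        for _ in range(repeats):
            sample = rng.sample(traces, n)  # sampling without replacement
            dists.append(edit_distance(naive_consensus(sample), expert_model))
        result[n] = sum(dists) / repeats
    return result
```

Calling `mean_distance_by_sample_size(traces, expert_model, sizes=[5, 10, 20, 50])` on a list of traces (each a list of activity labels) returns a mapping from sample size to mean distance. Under these assumptions, the sample size at which that value stops changing noticeably is one rough indication of how many traces are enough, and the resulting curve is the kind of data a curve-fitting step could be applied to.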

Taxonomy

Topics: Time Series Analysis and Forecasting · Advanced Database Systems and Queries · Anomaly Detection Techniques and Applications