An empirical investigation of different classifiers, encoding and ensemble schemes for next event prediction using business process event logs
Bayu Adhi Tama, Marco Comuzzi, Jonghyeon Ko

TL;DR
This paper provides an empirical benchmark for next event prediction on business process event logs. It analyzes how encoding windows and ensemble schemes affect classifier performance, to help researchers and practitioners select a suitable method.
Contribution
It extends the authors' previously published benchmark by evaluating how different encoding windows and ensemble schemes affect classifier performance for next event prediction.
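To make the idea of an encoding window concrete, here is a minimal sketch, assuming a window means "the last k activities of a running case are used as features, and the activity that follows is the label". The function name `encode_windows`, the padding symbol `PAD`, and the example trace are hypothetical and are not taken from the paper, which may encode events differently.

```python
from typing import List, Tuple

PAD = "<pad>"  # hypothetical padding symbol for prefixes shorter than the window


def encode_windows(trace: List[str], window: int) -> List[Tuple[Tuple[str, ...], str]]:
    """Turn one trace into (features, label) pairs.

    features: the last `window` activities seen so far (left-padded with PAD)
    label:    the activity that actually came next
    """
    samples = []
    for i in range(1, len(trace)):
        prefix = trace[max(0, i - window):i]             # at most `window` past events
        padded = [PAD] * (window - len(prefix)) + prefix  # fixed-length feature vector
        samples.append((tuple(padded), trace[i]))
    return samples


# Illustrative trace (made up for this sketch)
trace = ["register", "check", "approve", "pay"]
for features, label in encode_windows(trace, window=2):
    print(features, "->", label)
```

A larger window gives the classifier more history per sample but also a larger and sparser feature space, which is one reason the window size is a tuning decision rather than a fixed choice.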
Findings
Choosing the number of events to include in the encoding window is challenging.
Ensemble schemes improve low-performing classifiers like SVM.
High-performing classifiers like tree-based models are less affected by ensembles.
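The findings above concern ensembles built on top of individual classifiers. This summary does not say which ensemble schemes the paper evaluates, so the sketch below uses bagging (bootstrap resampling plus majority voting) purely as an illustration. The base learner `train_lookup` is a toy frequency table standing in for a real classifier such as an SVM. All names here are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

Features = Tuple[str, ...]
Sample = Tuple[Features, str]
Model = Callable[[Features], str]


def train_lookup(samples: List[Sample]) -> Model:
    """Toy base learner: predict the most frequent label seen for a feature tuple.

    Falls back to the overall most frequent label for unseen feature tuples.
    Assumes `samples` is non-empty.
    """
    by_feature: Dict[Features, Counter] = {}
    for features, label in samples:
        by_feature.setdefault(features, Counter())[label] += 1
    fallback = Counter(label for _, label in samples).most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in by_feature.items()}
    return lambda features: table.get(features, fallback)


def bagging(samples: List[Sample], n_models: int, seed: int = 0) -> Model:
    """Train `n_models` base learners on bootstrap resamples and majority-vote them."""
    rng = random.Random(seed)
    models = [
        train_lookup([rng.choice(samples) for _ in samples])  # sample with replacement
        for _ in range(n_models)
    ]

    def predict(features: Features) -> str:
        votes = Counter(model(features) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Illustrative data (made up): window-of-one features and next-activity labels
data = [(("register",), "check")] * 3 + [(("check",), "approve")] * 3
ensemble = bagging(data, n_models=11)
print(ensemble(("register",)))
```

Averaging over resampled models mainly reduces variance, which is consistent with the reported pattern that weaker individual classifiers gain more from ensembling than models that already perform well.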
Abstract
There is a growing need for empirical benchmarks that support researchers and practitioners in selecting the best machine learning technique for given prediction tasks. In this paper, we consider the next event prediction task in business process predictive monitoring and we extend our previously published benchmark by studying the impact on the performance of different encoding windows and of using ensemble schemes. The choice of whether to use ensembles and which scheme to use often depends on the type of data and classification task. While there is a general understanding that ensembles perform well in predictive monitoring of business processes, next event prediction is a task for which no other benchmarks involving ensembles are available. The proposed benchmark helps researchers to select a high performing individual classifier or ensemble scheme given the variability at the case…
Taxonomy
Topics: Business Process Modeling and Analysis · Data Stream Mining Techniques · Data Quality and Management
Methods: Support Vector Machine
