What Averages Do Not Tell -- Predicting Real Life Processes with Sequential Deep Learning
István Ketykó, Felix Mannhardt, Marwan Hassani, Boudewijn van Dongen

TL;DR
This paper evaluates seven deep learning models for predicting outcomes in process mining, highlighting the challenges posed by real-life event log complexities and the need for improved sequence modeling techniques.
Contribution
It introduces a comprehensive framework for comparing sequential deep learning architectures on real-life process data, emphasizing the importance of handling data skewness and diverse structures.
Findings
Sequence modeling performance varies across datasets.
Current models struggle with complex, skewed event logs.
Room for improvement in consistent prefix prediction.
Abstract
Deep Learning is proven to be an effective tool for modeling sequential data as shown by the success in Natural Language, Computer Vision and Signal Processing. Process Mining concerns discovering insights on business processes from their execution data that are logged by supporting information systems. The logged data (event log) is formed of event sequences (traces) that correspond to executions of a process. Many Deep Learning techniques have been successfully adapted for predictive Process Mining that aims to predict process outcomes, remaining time, the next event, or even the suffix of running traces. Traces in Process Mining are multimodal sequences and very differently structured than natural language sentences or images. This may require a different approach to processing. So far, there has been little focus on these differences and the challenges introduced. Looking at suffix…
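The abstract's terms (event log, trace, suffix of a running trace) can be illustrated with a minimal sketch. It is not taken from the paper: the toy log, the tuple layout, and the helper names `build_traces` and `prefix_suffix_pairs` are hypothetical and serve only to show how an event log could be grouped into traces and split into prefix/suffix pairs for suffix prediction.

```python
from collections import defaultdict

# Hypothetical toy event log: (case_id, activity, timestamp) tuples.
# Real-life logs carry more attributes per event (resource, cost, ...).
EVENT_LOG = [
    ("c1", "register", 1),
    ("c2", "register", 2),
    ("c1", "check", 3),
    ("c2", "reject", 4),
    ("c1", "approve", 5),
]


def build_traces(events):
    """Group events by case id and order each trace by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        by_case[case_id].append((ts, activity))
    return {case: [a for _, a in sorted(evs)] for case, evs in by_case.items()}


def prefix_suffix_pairs(trace):
    """Yield (prefix, suffix) pairs for every non-empty proper prefix."""
    for k in range(1, len(trace)):
        yield trace[:k], trace[k:]


traces = build_traces(EVENT_LOG)
pairs = list(prefix_suffix_pairs(traces["c1"]))
```

Under these assumptions, each pair is one training or evaluation example: a model sees the prefix of a running trace and is asked to produce the remaining activities.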
Taxonomy
Topics: Business Process Modeling and Analysis · Data Quality and Management · Topic Modeling
