TL;DR
This paper evaluates the ability of LSTM-based recurrent neural networks to learn the underlying process model structure from event logs, revealing limitations and conditions that affect their learning and generalization capabilities.
Contribution
It introduces an evaluation framework for assessing whether LSTMs learn process model structure, and investigates factors that influence this, such as overfitting countermeasures, training-set incompleteness, and the degree of parallelism in the process.
Findings
LSTMs struggle to learn process model structure even with simple data.
Overfitting countermeasures can improve learning but are not optimal when tuned only for prediction accuracy.
Reducing the information available during training, i.e. using a more incomplete training set, sharply decreases generalization and precision.
Abstract
Various methods using machine and deep learning have been proposed to tackle different tasks in predictive process monitoring, forecasting for an ongoing case, e.g., the most likely next event or suffix, its remaining time, or an outcome-related variable. Recurrent neural networks (RNNs), and more specifically long short-term memory nets (LSTMs), stand out in terms of popularity. In this work, we investigate the capabilities of such an LSTM to actually learn the underlying process model structure of an event log. We introduce an evaluation framework that combines variant-based resampling and custom metrics for fitness, precision, and generalization. We evaluate four hypotheses concerning the learning capabilities of LSTMs, the effect of overfitting countermeasures, the level of incompleteness in the training set, and the level of parallelism in the underlying process model. We confirm that…
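The abstract does not spell out how variant-based resampling is done, so the following is only a minimal sketch of one plausible reading: traces are grouped by variant (their exact activity sequence) and whole variants are held out, so the test set contains only behaviour the model never saw during training. The function name `variant_split`, the `test_fraction` parameter, and the toy log are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def variant_split(log, test_fraction=0.2, seed=0):
    """Split an event log so that no trace variant occurs in both parts.

    Hypothetical helper: `log` is a list of traces, each trace a list of
    activity labels. This is an illustrative assumption, not the paper's
    actual procedure.
    """
    # Group traces by variant, i.e. by their exact activity sequence.
    by_variant = defaultdict(list)
    for trace in log:
        by_variant[tuple(trace)].append(trace)

    # Shuffle the distinct variants reproducibly and hold some of them out.
    variants = sorted(by_variant)
    rng = random.Random(seed)
    rng.shuffle(variants)
    n_test = max(1, round(len(variants) * test_fraction))
    test_variants = set(variants[:n_test])

    train = [t for v in variants[n_test:] for t in by_variant[v]]
    test = [t for v in variants[:n_test] for t in by_variant[v]]
    return train, test, test_variants


# Toy log with three variants; b and c may occur in either order.
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
    ["a", "e", "d"],
    ["a", "c", "b", "d"],
]
train, test, held_out = variant_split(log, test_fraction=0.34, seed=42)
```

Splitting at the variant level rather than the trace level means that frequent variants cannot leak into the test set, which is what lets a generalization measure reflect unseen behaviour instead of memorized sequences.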
Taxonomy
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
