Can recurrent neural networks learn process model structure?

Jari Peeperkorn; Seppe vanden Broucke; Jochen De Weerdt

arXiv:2212.06430 · cs.LG · December 14, 2022

TL;DR

This paper evaluates the ability of LSTM-based recurrent neural networks to learn the underlying process model structure from event logs, revealing limitations and conditions that affect their learning and generalization capabilities.

Contribution

It introduces an evaluation framework for assessing LSTM learning of process models and investigates factors influencing their effectiveness, such as overfitting measures and process complexity.

Findings

1. LSTMs struggle to learn process model structure even with simple data.
2. Overfitting countermeasures can improve learning but are not optimal when tuned only for prediction accuracy.
3. Reducing information during training sharply decreases generalization and precision.

Abstract

Various methods using machine and deep learning have been proposed to tackle different tasks in predictive process monitoring, i.e. forecasting, for an ongoing case, the most likely next event or suffix, its remaining time, or an outcome-related variable. Recurrent neural networks (RNNs), and more specifically long short-term memory networks (LSTMs), stand out in terms of popularity. In this work, we investigate the capabilities of such an LSTM to actually learn the underlying process model structure of an event log. We introduce an evaluation framework that combines variant-based resampling and custom metrics for fitness, precision and generalization. We evaluate four hypotheses concerning the learning capabilities of LSTMs, the effect of overfitting countermeasures, the level of incompleteness in the training set and the level of parallelism in the underlying process model. We confirm that…
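To make "variant-based resampling" concrete, here is a minimal sketch of one way to do it, assuming a variant is a distinct sequence of activities and that whole variants are held out of the training set. The function name `variant_split`, the `test_fraction` parameter, and the toy log are all hypothetical; the paper's actual resampling procedure may differ.

```python
import random


def variant_split(log, test_fraction=0.2, seed=0):
    """Split an event log so that train and test share no variants.

    Assumption: `log` is a list of traces, each trace a list of activity
    labels. A variant is the tuple of activities in a trace.
    """
    # Collect the distinct variants in a deterministic order before shuffling.
    variants = sorted({tuple(trace) for trace in log})
    rng = random.Random(seed)
    rng.shuffle(variants)

    # Hold out at least one whole variant for testing.
    n_test = max(1, round(len(variants) * test_fraction))
    test_variants = set(variants[:n_test])

    # Every trace follows its variant into either the train or the test set.
    train = [t for t in log if tuple(t) not in test_variants]
    test = [t for t in log if tuple(t) in test_variants]
    return train, test


# Toy event log with three variants (hypothetical data).
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
    ["a", "e", "d"],
    ["a", "c", "b", "d"],
]
train, test = variant_split(log, test_fraction=0.34, seed=42)
```

Because the split is made per variant and not per trace, every behaviour in the test set is unseen during training, which is what lets a generalization score be separated from plain fitness on the training behaviour.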

Code & Models

Repositories

jaripeeperkorn/lstm_process_model_structure (Official)

Taxonomy

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory