Leveraging Data Augmentation and Siamese Learning for Predictive Process Monitoring
Sjoerd van Straten, Alessandro Padella, Marwan Hassani

TL;DR
This paper introduces SiamSA-PPM, a self-supervised framework that uses data augmentation and Siamese learning to improve predictive process monitoring by generating realistic trace variants and learning robust representations.
Contribution
It proposes three novel, statistically grounded augmentation methods and combines them with Siamese learning to learn representations of process prefixes without labeled supervision.
Findings
Outperforms state-of-the-art methods in next-activity prediction
Significantly improves data variability and model robustness
Statistical augmentation outperforms random transformations
Abstract
Predictive Process Monitoring (PPM) enables forecasting future events or outcomes of ongoing business process instances based on event logs. However, deep learning PPM approaches are often limited by the low variability and small size of real-world event logs. To address this, we introduce SiamSA-PPM, a novel self-supervised learning framework that combines Siamese learning with Statistical Augmentation for Predictive Process Monitoring. It employs three novel statistically grounded transformation methods that leverage control-flow semantics and frequent behavioral patterns to generate realistic, semantically valid new trace variants. These augmented views are used within a Siamese learning setup to learn generalizable representations of process prefixes without the need for labeled supervision. Extensive experiments on real-life event logs demonstrate that SiamSA-PPM achieves…
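To make the idea of augmented views in a Siamese setup concrete, here is a minimal sketch in plain Python. It is not the paper's method: the adjacent-swap augmentation, the bigram-count encoder, and the names `directly_follows`, `swap_augment`, `embed`, and `neg_cosine` are all hypothetical stand-ins chosen for illustration, and the toy log is made up. The only point it illustrates is that a transformation can be gated by statistics of the log (here, a swap is allowed only when the reversed order was actually observed) and that two views of the same prefix can then be compared with a similarity-based objective.

```python
import math
import random
from collections import Counter

def directly_follows(log):
    """Count how often activity a is directly followed by activity b in the log."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def swap_augment(prefix, df_counts, rng):
    """Hypothetical augmentation: swap one adjacent pair of activities,
    but only where the reversed order was also observed in the log."""
    candidates = [
        i for i in range(len(prefix) - 1)
        if prefix[i] != prefix[i + 1]
        and df_counts[(prefix[i + 1], prefix[i])] > 0
    ]
    if not candidates:
        return list(prefix)  # nothing supported by the log: return an unchanged copy
    i = rng.choice(candidates)
    view = list(prefix)
    view[i], view[i + 1] = view[i + 1], view[i]
    return view

def embed(prefix):
    """Toy encoder (stand-in for a learned network): bigram counts of the prefix."""
    return Counter(zip(prefix, prefix[1:]))

def neg_cosine(u, v):
    """Negative cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return -dot / norm if norm else 0.0

# Made-up event log: each trace is a sequence of activity labels.
log = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["A", "B", "C", "E"],
]
df_counts = directly_follows(log)
rng = random.Random(0)

prefix = ["A", "B", "C"]
view = swap_augment(prefix, df_counts, rng)
loss = neg_cosine(embed(prefix), embed(view))
print(view, loss)
```

In a real Siamese setup the encoder would be a trainable network and this loss would be minimized so that both views of a prefix map to nearby representations; the fixed bigram encoder here only shows where that comparison happens.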
