Leveraging Data Augmentation and Siamese Learning for Predictive Process Monitoring

Sjoerd van Straten; Alessandro Padella; Marwan Hassani

arXiv:2507.18293·cs.LG·February 20, 2026

TL;DR

This paper introduces SiamSA-PPM, a self-supervised framework that uses data augmentation and Siamese learning to improve predictive process monitoring by generating realistic trace variants and learning robust representations.

Contribution

The paper proposes three novel, statistically grounded augmentation methods and combines them with Siamese learning to learn representations of process prefixes without labeled supervision.

Findings

01 Outperforms state-of-the-art in next activity prediction

02 Significantly improves variability and robustness of models

03 Statistical augmentation outperforms random transformations

Abstract

Predictive Process Monitoring (PPM) enables forecasting future events or outcomes of ongoing business process instances based on event logs. However, deep learning PPM approaches are often limited by the low variability and small size of real-world event logs. To address this, we introduce SiamSA-PPM, a novel self-supervised learning framework that combines Siamese learning with Statistical Augmentation for Predictive Process Monitoring. It employs three novel statistically grounded transformation methods that leverage control-flow semantics and frequent behavioral patterns to generate realistic, semantically valid new trace variants. These augmented views are used within a Siamese learning setup to learn generalizable representations of process prefixes without the need for labeled supervision. Extensive experiments on real-life event logs demonstrate that SiamSA-PPM achieves…
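To make the ideas in the abstract concrete, here is a minimal sketch of how prefixes, augmented views, and a Siamese-style similarity objective could fit together. It is not the SiamSA-PPM implementation: the abstract does not describe the three transformation methods or the training objective. The function names (`prefix_samples`, `directly_follows`, `swap_view`, `encode`, `neg_cosine`), the toy log, the swap rule, and the position-weighted encoder are all hypothetical stand-ins. Negative cosine similarity is used only because it is a common choice in Siamese self-supervised methods.

```python
from collections import Counter
from math import sqrt


def prefix_samples(trace):
    """Return (prefix, next_activity) pairs for next activity prediction."""
    return [(trace[:k], trace[k]) for k in range(1, len(trace))]


def directly_follows(log):
    """Count how often activity a is directly followed by b across the log."""
    return Counter((a, b) for trace in log for a, b in zip(trace, trace[1:]))


def swap_view(prefix, df):
    """Hypothetical augmentation (an assumption, not the paper's method):
    swap the first adjacent pair whose reversed order also occurs in the log."""
    for i in range(len(prefix) - 1):
        a, b = prefix[i], prefix[i + 1]
        if a != b and df[(b, a)] > 0:
            return prefix[:i] + [b, a] + prefix[i + 2:]
    return list(prefix)  # no admissible swap found: return an unchanged copy


def encode(prefix, vocab):
    """Toy encoder (assumption): position-weighted activity counts.
    A real setup would use a trained neural network here."""
    index = {a: j for j, a in enumerate(vocab)}
    vec = [0.0] * len(vocab)
    for pos, activity in enumerate(prefix, start=1):
        vec[index[activity]] += pos
    return vec


def neg_cosine(u, v):
    """Negative cosine similarity; lower means the two views agree more."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return -dot / norm


# Toy event log: each trace is a list of activity labels.
log = [["A", "B", "C", "D"], ["A", "C", "B", "D"], ["A", "B", "C", "D"]]
vocab = ["A", "B", "C", "D"]
df = directly_follows(log)

prefix = ["A", "B", "C"]
view = swap_view(prefix, df)  # second view of the same prefix
loss = neg_cosine(encode(prefix, vocab), encode(view, vocab))
```

In this sketch the swap is only allowed when the log itself shows both orderings, which is one simple way to keep an augmented prefix consistent with observed control flow. A training loop would minimize `loss` over many prefix pairs and then fine-tune or probe the encoder on the `(prefix, next_activity)` pairs from `prefix_samples`.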
