Mind the Gap: A Formal Investigation of the Relationship Between Log and Model Complexity -- Extended Version
Patrizia Schalk, Artem Polyvyanyy

TL;DR
This paper investigates the relationship between event log complexity and process model complexity, revealing limited predictive power of existing measures and highlighting the need for better guidelines in process discovery.
Contribution
The study systematically analyzes multiple log and model complexity measures across various algorithms, providing insights into their relationships and limitations.
Findings
Only flower model complexity can be reliably predicted by log complexity.
Most log complexity measures do not effectively predict model complexity.
Current measures are insufficient for guiding the choice of process discovery algorithms.
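One intuition for the first finding can be sketched in code. The sketch below is not the authors' method: it assumes a flower model is a Petri net with one central place and one transition per distinct activity, and it uses a handful of simple, illustrative log measures (trace, event, activity, and variant counts) that may not match the 18 measures studied in the paper. The toy log and the function names are hypothetical.

```python
from collections import Counter

# Toy event log: each trace is the ordered list of activities in one case.
# The activity names are invented for illustration.
log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "check", "approve"],
    ["register", "check", "approve"],
]


def log_measures(log):
    """Return a few simple size-style log measures (illustrative choices)."""
    activities = {a for trace in log for a in trace}
    variants = Counter(tuple(trace) for trace in log)
    return {
        "traces": len(log),
        "events": sum(len(trace) for trace in log),
        "activities": len(activities),
        "variants": len(variants),
    }


def flower_model_size(log):
    """Size of an assumed flower net: one place, one transition per activity.

    Each transition consumes from and produces to the central place,
    which gives two arcs per transition.
    """
    n = len({a for trace in log for a in trace})
    return {"places": 1, "transitions": n, "arcs": 2 * n}


print(log_measures(log))
print(flower_model_size(log))
```

Under these assumptions the flower model's size depends only on the number of distinct activities, so it is fixed once that single log measure is known. Adding more traces or variants over the same activities changes the log measures but leaves the flower model untouched, which is consistent with the finding that its complexity is the one that can be predicted from the log.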
Abstract
Simple process models are key for effectively communicating the outcomes of process mining. An important question in this context is whether the complexity of event logs used as inputs to process discovery algorithms can serve as a reliable indicator of the complexity of the resulting process models. Although various complexity measures for both event logs and process models have been proposed in the literature, the relationship between input and output complexity remains largely unexplored. In particular, there are no established guidelines or theoretical foundations that explain how the complexity of an event log influences the complexity of the discovered model. This paper examines whether formal guarantees exist such that increasing the complexity of event logs leads to increased complexity in the discovered models. We study 18 log complexity measures and 17 process model complexity…
Taxonomy
Topics: Business Process Modeling and Analysis · Data Mining Algorithms and Applications · Software System Performance and Reliability
