Identifying the Key Attributes in an Unlabeled Event Log for Automated Process Discovery
Kentaroh Toyoda, Rachel Gan Kai Ying, Allan NengSheng Zhang, Tan Puay Siew

TL;DR
This paper presents a two-stage machine learning approach to automatically identify key attributes in unlabeled event logs, significantly reducing manual effort and computational complexity for process discovery.
Contribution
The authors propose a novel two-stage method combining supervised learning and process model evaluation to automate key attribute identification in event logs, enhancing process mining automation.
Findings
Method reduces computational complexity from O(N^3) to O(k^3).
Successfully identified key attributes in 14 datasets within 20 seconds.
Effective even with only 2 candidate attributes in the first stage.
Abstract
Process mining discovers and analyzes a process model from historical event logs. Prior methods use the key attributes of case-id, activity, and timestamp hidden in an event log as clues to discover a process model. However, a user needs to specify them manually, which can be an exhausting task. In this paper, we propose a two-stage key attribute identification method that avoids such manual investigation and is thus a step toward fully automated process discovery. One challenge is avoiding exhaustive computation due to combinatorial explosion. To address it, we narrow down candidates for each key attribute by using supervised machine learning in the first stage, and identify the best combination of the key attributes by discovering process models and evaluating them in the second stage. Our computational complexity can be reduced from O(N^3) to O(k^3)…
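The candidate-narrowing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the exhaustive baseline tries every ordered triple of distinct columns, that the first stage yields a per-column score for each key attribute (the `scores` dictionary is a hypothetical stand-in for a supervised classifier's output), and that `evaluate` is a hypothetical stand-in for discovering a process model and scoring its quality.

```python
from itertools import permutations, product

ROLES = ("case_id", "activity", "timestamp")


def exhaustive_candidates(columns):
    """All ordered (case_id, activity, timestamp) triples of distinct columns.

    With N columns this yields N * (N - 1) * (N - 2) triples, i.e. O(N^3).
    """
    return list(permutations(columns, 3))


def two_stage_candidates(scores, k):
    """Keep the top-k columns per key attribute, then combine them.

    `scores` maps each role to {column: score}; it is an assumed stand-in
    for the first-stage classifier. At most k^3 triples are returned.
    """
    top = {
        role: sorted(scores[role], key=scores[role].get, reverse=True)[:k]
        for role in ROLES
    }
    return [
        combo
        for combo in product(*(top[role] for role in ROLES))
        if len(set(combo)) == 3  # one column cannot fill two roles
    ]


def pick_best(candidates, evaluate):
    """Second stage: return the triple with the highest evaluation score.

    `evaluate` is a hypothetical callable standing in for process
    discovery followed by a model-quality metric.
    """
    return max(candidates, key=evaluate)
```

For a log with five columns, the exhaustive baseline produces 5 × 4 × 3 = 60 triples, whereas keeping k = 2 candidates per key attribute produces at most 2^3 = 8, so the expensive second-stage evaluation runs on far fewer combinations.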
Taxonomy
Topics: Business Process Modeling and Analysis · Data Quality and Management · Semantic Web and Ontologies
