
TL;DR
This paper proposes a novel graph database approach for process mining that enables scalable analysis of large event logs by computing Directly Follows Graphs within the database, improving performance and data management.
Contribution
It introduces a new method to store and process event logs in graph databases, shifting heavy computations into the database to handle large-scale data efficiently.
Findings
Enables DFG computation on data larger than memory capacity.
Achieves better performance with data chunking.
Demonstrates effectiveness using real log data.
Abstract
Process mining is an area of research that supports discovering information about business processes from their execution event logs. The growing volume of event logs in organizations challenges current process mining techniques, which tend to load all data into a single computer's memory. This limits organizations' ability to apply process mining at scale and introduces risks due to the lack of data management capabilities. This paper therefore introduces and formalizes a new approach for storing event logs in, and retrieving them from, graph databases. It defines an algorithm to compute the Directly Follows Graph (DFG) inside the graph database, which shifts the computationally heavy parts of process mining into the database. Calculating the DFG in graph databases enables leveraging the graph databases' horizontal and vertical scaling capabilities in favor of applying process mining on a large…
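To make the central object concrete, the following is a minimal sketch of what a Directly Follows Graph is and how it can be accumulated chunk by chunk. It is not the paper's in-database algorithm: it is plain in-memory Python, and the event tuple layout, the function name `dfg_from_chunks`, and the sample log are all hypothetical. It assumes events arrive in global timestamp order, so only the last activity per case has to be carried across chunk boundaries.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical event layout: (case_id, activity, timestamp).
Event = Tuple[str, str, int]


def dfg_from_chunks(chunks: Iterable[List[Event]]) -> Counter:
    """Count directly-follows pairs per case across a stream of chunks.

    Assumes events are globally ordered by timestamp, so the only state
    kept between chunks is the most recent activity of each case.
    """
    last_activity: Dict[str, str] = {}
    dfg: Counter = Counter()
    for chunk in chunks:
        for case_id, activity, _timestamp in chunk:
            previous = last_activity.get(case_id)
            if previous is not None:
                # An edge (a, b) means b directly followed a in some case.
                dfg[(previous, activity)] += 1
            last_activity[case_id] = activity
    return dfg


# Illustrative log with two cases, already sorted by timestamp.
log: List[Event] = [
    ("c1", "register", 1),
    ("c2", "register", 2),
    ("c1", "check", 3),
    ("c2", "check", 4),
    ("c1", "approve", 5),
    ("c2", "reject", 6),
]

# Split into three chunks of two events each.
chunks = [log[0:2], log[2:4], log[4:6]]
dfg = dfg_from_chunks(chunks)

for (source, target), count in sorted(dfg.items()):
    print(f"{source} -> {target}: {count}")
```

Because the per-case state is just one activity name, the edge counts do not depend on where the chunk boundaries fall, which is the property that lets a log larger than memory be processed piecewise.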
