EdgeMiner: Distributed Process Mining at the Data Sources
Julia Andersen, Patrick Rathje, Christian Imenkamp, Agnes Koschmider, Olaf Landsiedel

TL;DR
EdgeMiner is a distributed process mining algorithm that operates directly on sensor nodes in real time, improving scalability and privacy by reducing data centralization and communication overhead.
Contribution
It introduces a novel distributed process mining approach that works on sensor nodes, significantly reducing communication and storage needs compared to centralized methods.
Findings
Reduces communication overhead by up to 96%.
Queries stabilize after a few events, querying fewer than 2.5% of nodes.
Proves correctness of the distributed algorithm.
Abstract
Process mining is moving beyond mining traditional event logs and nowadays includes, for example, data sourced from sensors in the Internet of Things (IoT). The volume and velocity of data generated by such sensors make it increasingly challenging to process the data efficiently with traditional process discovery algorithms, which operate on a centralized event log. This paper presents EdgeMiner, an algorithm for distributed process mining that operates directly on sensor nodes on a stream of real-time event data. In contrast to centralized algorithms, EdgeMiner tracks each event and its predecessor and successor events directly on the sensor node where the event is sensed and recorded. As EdgeMiner aggregates direct successions on the individual nodes, the raw data does not need to be stored centrally, thus improving both scalability and privacy. We analytically and experimentally show the…
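The idea of keeping direct successions on the nodes themselves can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: the names `SensorNode`, `replay`, and `directly_follows` are hypothetical, each node is assumed to sense exactly one activity, and the `last_activity` lookup is a simulation shortcut standing in for whatever mechanism lets a node learn the previous event of a case.

```python
from collections import Counter


class SensorNode:
    """Hypothetical node that senses one activity and stores only its own successions."""

    def __init__(self, activity):
        self.activity = activity
        self.predecessors = Counter()  # activities seen directly before this one
        self.successors = Counter()    # activities seen directly after this one


def replay(events):
    """Feed (case_id, activity) pairs, in arrival order, to the simulated nodes."""
    nodes = {}
    # Assumption: a per-case lookup of the last activity. In a real deployment
    # this information would have to reach the node some other way.
    last_activity = {}
    for case_id, activity in events:
        node = nodes.setdefault(activity, SensorNode(activity))
        prev = last_activity.get(case_id)
        if prev is not None:
            node.predecessors[prev] += 1           # recorded where the event is sensed
            nodes[prev].successors[activity] += 1  # models one message to the predecessor
        last_activity[case_id] = activity
    return nodes


def directly_follows(nodes):
    """Combine the per-node successor counts into a directly-follows relation."""
    return {
        (node.activity, succ): count
        for node in nodes.values()
        for succ, count in node.successors.items()
    }


events = [("c1", "A"), ("c1", "B"), ("c2", "A"), ("c2", "C"), ("c1", "C")]
nodes = replay(events)
dfg = directly_follows(nodes)
```

In this sketch no node ever holds the full event stream: each keeps two small counters, and a directly-follows relation is assembled only when the nodes are asked for their successor counts.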
Taxonomy
Topics: Business Process Modeling and Analysis · Semantic Web and Ontologies · Big Data and Business Intelligence
