Case ID detection based on time series data -- the mining use case
Edyta Brzychczy, Tomasz Pełech-Pilichowski, Ziemowit Dworakowski

TL;DR
This paper introduces a rule-based algorithm that detects case IDs in time series sensor data for process mining, achieving high accuracy in industrial datasets with minimal manual labeling.
Contribution
The paper presents a novel method for identifying case IDs from time series data without explicit labels, enabling process mining in industrial contexts.
Findings
F1 score of 96.8% on datasets with outliers
F1 score of 97% on datasets without outliers
F1 score of 92.6% on manufacturing data
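For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The snippet below is a minimal sketch of that formula only. The function name `f1_score` and the example counts are illustrative assumptions and are not taken from the paper, which does not state here how detections were matched against expert labels.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    # Precision: share of detected case boundaries that are correct.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of expert-labelled boundaries that were detected.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
example = f1_score(true_pos=8, false_pos=2, false_neg=2)
```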
Abstract
Process mining is gaining popularity in business process analysis, including in heavy industry. It requires a specific data format called an event log, whose basic structure includes a case identifier (case ID), an activity (event) name, and a timestamp. For industrial processes, data is very often provided by a monitoring system as time series of low-level sensor readings. This data cannot be used directly for process mining because there is no explicit marking of activities and, sometimes, no case ID is provided. We propose a novel rule-based algorithm for identifying patterns, which detects case IDs from significant changes in the short-term mean values of a selected variable. We present our solution on the mining use case. We compare computed results (identified patterns) with expert labels of the same dataset. Experiments show that the…
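The general idea of detecting case boundaries from shifts in a short-term mean can be sketched as follows. This is not the authors' algorithm, whose rules are not given in the abstract. It is a simplified illustration under stated assumptions: a single sensor variable, a fixed window length, a fixed threshold, and a new case starting whenever the mean of the next `window` samples exceeds the mean of the previous `window` samples by more than `threshold`. The names `assign_case_ids`, `window`, and `threshold` are hypothetical.

```python
from statistics import fmean


def assign_case_ids(values, window=3, threshold=5.0):
    """Label every sample with a case ID (assumed rule, not the paper's).

    A new case starts where the short-term mean after a sample exceeds
    the short-term mean before it by more than `threshold`.
    """
    case_ids = []
    current_case = 0
    in_jump = False  # avoids counting one long rise as several cases
    for i in range(len(values)):
        jump = False
        # Both windows must fit entirely inside the series.
        if i >= window and i + window <= len(values):
            mean_before = fmean(values[i - window:i])
            mean_after = fmean(values[i:i + window])
            jump = (mean_after - mean_before) > threshold
        if jump and not in_jump:
            current_case += 1
        in_jump = jump
        case_ids.append(current_case)
    return case_ids


# Synthetic readings with two rises (illustrative data, not from the paper).
signal = [0, 0, 0, 0, 10, 10, 10, 10, 0, 0, 0, 0, 10, 10, 10, 10]
labels = assign_case_ids(signal, window=2, threshold=5.0)
```

Pairing each label with the sample's timestamp gives the case ID column of an event log. Activity names would still have to be derived separately.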
Taxonomy
Topics: Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications
