A Multi-View Framework to Detect Redundant Activity Labels for More Representative Event Logs in Process Mining
Qifan Chen, Yang Lu, Charmaine S. Tam, Simon K. Poon

TL;DR
This paper introduces a multi-view framework that automatically detects redundant activity labels in event logs, improving the quality of process mining by reducing label inconsistencies through semantic and contextual analysis.
Contribution
It presents a novel multi-view approach combining context-aware and semantic features to efficiently identify redundant activity labels in event logs.
Findings
Effective detection of redundant labels in publicly available datasets
Redundant label detection improves event log quality
Approach works well even with low-occurrence labels
Abstract
Process mining aims to gain knowledge of business processes via the discovery of process models from event logs generated by information systems. The insights revealed by process mining rely heavily on the quality of the event logs. Activities extracted from different data sources, or recorded as free text within the same system, may carry inconsistent labels. Such inconsistency leads to redundancy in activity labels, that is, labels that have different syntax but share the same behaviours. Redundant activity labels can introduce unnecessary complexity to the event logs. Identifying these labels through data-driven process discovery is difficult and relies heavily on human intervention. Neither existing process discovery algorithms nor event data preprocessing techniques can resolve such redundancy efficiently. In this paper, we propose a multi-view approach to…
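To make the idea of "different syntax but same behaviour" concrete, here is a minimal Python sketch that scores label pairs from two views. It is not the authors' method: the function names, the equal weighting, the 0.6 threshold, and the use of token overlap as a stand-in for semantic similarity are all assumptions chosen for illustration. The context view compares which activities directly precede and follow each label; the textual view compares the words in the label strings.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def context_vectors(traces):
    """Count, for each label, its direct predecessors and successors."""
    vectors = defaultdict(Counter)
    for trace in traces:
        # Hypothetical boundary markers so first/last activities have context.
        padded = ["<start>"] + list(trace) + ["<end>"]
        for prev, cur, nxt in zip(padded, padded[1:], padded[2:]):
            vectors[cur][("prev", prev)] += 1
            vectors[cur][("next", nxt)] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def token_jaccard(x, y):
    """Word-overlap similarity; a crude stand-in for a semantic view."""
    tx, ty = set(x.lower().split()), set(y.lower().split())
    return len(tx & ty) / len(tx | ty) if tx | ty else 0.0


def redundant_candidates(traces, threshold=0.6):
    """Return label pairs whose averaged two-view score meets the threshold."""
    vectors = context_vectors(traces)
    candidates = []
    for a, b in combinations(sorted(vectors), 2):
        # Equal weights are an assumption, not taken from the paper.
        score = 0.5 * cosine(vectors[a], vectors[b]) + 0.5 * token_jaccard(a, b)
        if score >= threshold:
            candidates.append((a, b, round(score, 3)))
    return candidates


if __name__ == "__main__":
    log = [
        ["Register", "Check blood", "Discharge"],
        ["Register", "Blood check", "Discharge"],
        ["Register", "Check blood", "Discharge"],
    ]
    for pair in redundant_candidates(log):
        print(pair)
```

In this toy log, "Check blood" and "Blood check" occur between the same neighbours and use the same words, so they are the only pair flagged. Any pair flagged this way is a candidate for review rather than a confirmed duplicate, since two genuinely different activities can also share neighbours.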
